feat(ultraplan-local): v1.7.0 — self-verifying plan chain
Wave 1 of a 6-session parallel build revealed three failure modes: (1) hallucinated completion (status=completed after 2/5 steps, last tool call was an arbitrary file review), (2) fail-late bash (3/6 sessions had push blocked inside sub-agent sandbox after all work was done), (3) no objective verification (plans were prose). v1.7 closes all three by making the plan an executable contract. Per-step YAML manifest (expected_paths, commit_message_pattern, bash_syntax_check, forbidden_paths, must_contain) is the objective completion predicate. Plan-critic dimension 10 (Manifest quality) is a hard gate. Session decomposer propagates manifests verbatim and emits an obligatory Step 0 pre-flight (git push --dry-run, exit 77 sentinel) in every session spec. ultraexecute-local gets Phase 7.5 (independent manifest audit from git log + filesystem, ignoring agent bookkeeping) and Phase 7.6 (bounded recovery dispatch, recovery_depth ≤ 2). Hard Rule 17 forbids marking a step passed without manifest verification. Hard Rule 18 forbids ending on an arbitrary tool call before reporting. Division of labor is made explicit: - /ultraresearch-local gathers context (no build decisions) - /ultraplan-local produces an executable contract (manifests, plan-critic gate) - /ultraexecute-local executes disciplined (does NOT compensate for weak plans — escalates) Code complete. Docs partial (Arbeidsdeling table + manifest section added to plugin + marketplace READMEs). Verification tests (10-sequence) pending — see REMEMBER.md. Backward compat: v1.6 plans without plan_version marker get legacy mode with synthesized manifests and legacy_plan: true in progress file. Plan-critic emits advisory, not block. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
parent
72f2e8f6c9
commit
d1befac35a
11 changed files with 651 additions and 27 deletions
|
|
@ -4,6 +4,90 @@ All notable changes to this project will be documented in this file.
|
|||
|
||||
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
|
||||
|
||||
## [1.7.0] - 2026-04-12
|
||||
|
||||
### The self-verifying plan chain
|
||||
|
||||
Wave 1 of a parallel 6-session build revealed three failure modes: (1) a
|
||||
session reported `status=completed` after only 2/5 steps — last tool call
|
||||
was an arbitrary file review, not a completion check; (2) 3/6 sessions
|
||||
had push blocked inside the sub-agent bash sandbox *after* all work was
|
||||
done; (3) plans and blueprints were prose, so the orchestrator had no
|
||||
machine-readable way to verify completion. v1.7.0 closes all three by
|
||||
making the plan itself an executable contract.
|
||||
|
||||
### Added
|
||||
|
||||
- **Per-step verification manifest** in plan format (`plan_version: 1.7`).
|
||||
Every step now ends with a YAML `manifest:` block declaring
|
||||
`expected_paths`, `min_file_count`, `commit_message_pattern`,
|
||||
`bash_syntax_check`, `forbidden_paths`, `must_contain`. The manifest is
|
||||
the objective completion predicate — the Verify command is necessary but
|
||||
not sufficient.
|
||||
- **Plan-critic dimension 10 — Manifest quality (hard gate).** Missing
|
||||
or invalid manifest (unparseable regex, path contradiction, missing
|
||||
block) is a `major` finding. v1.6 plans get a legacy-mode warning
|
||||
instead of a block.
|
||||
- **Session Manifest aggregate** in session specs — synthesized by
|
||||
`session-decomposer` as the union of per-step manifests. Gives
|
||||
`ultraexecute-local` a single YAML block per session to audit against.
|
||||
- **Step 0: Sandbox pre-flight** — obligatory first step in every
|
||||
generated session spec. Runs `git push --dry-run origin HEAD`; exit 77
|
||||
= sandbox cannot push, session status becomes `blocked` (not `failed`),
|
||||
no real work attempted. Escape hatch: `ULTRAEXECUTE_SKIP_PREFLIGHT=1`.
|
||||
- **Launch script pre-flight** — `headless-launch-template.md` adds a
|
||||
`git push --dry-run` check outside the sandbox, before any session
|
||||
spawns, catching credential issues at the earliest possible point.
|
||||
- **Phase 7.5 — Manifest audit (independent).** After all steps complete,
|
||||
`ultraexecute-local` re-verifies expected paths, commit count, commit
|
||||
message patterns, bash syntax, and forbidden-path untouched-ness from
|
||||
git log and filesystem. Agent's own bookkeeping is ignored. Disagreement
|
||||
with progress file → status overridden to `partial`.
|
||||
- **Phase 7.6 — Recovery dispatch (bounded).** When Phase 7.5 detects
|
||||
drift in multi-session parent context, synthesize a temp session spec
|
||||
containing only missing steps and dispatch via existing
|
||||
`claude -p "/ultraexecute-local --session N"`. `recovery_depth ≤ 2`
|
||||
hard cap — third drift escalates to user.
|
||||
- **Hard Rule 17: Manifest is the completion predicate.** A step may
|
||||
not be marked passed if its manifest does not verify, regardless of
|
||||
Verify's exit code.
|
||||
- **Hard Rule 18: Last-activity rule.** Executor's final tool call
|
||||
before Phase 8 must be a manifest check, never an arbitrary file
|
||||
review. Prevents hallucinated completion.
|
||||
|
||||
### Changed
|
||||
|
||||
- **Plan template (`templates/plan-template.md`)** — adds
|
||||
`plan_version: 1.7` metadata line, `Manifest:` field on every step,
|
||||
"Manifest — objective completion predicate" section.
|
||||
- **Plan-critic scoring** rebalanced: Headless readiness 0.15 → 0.10,
|
||||
Manifest quality 0.05 added. Legacy v1.6 plans skip the Manifest
|
||||
dimension and keep Headless readiness at 0.15.
|
||||
- **Planning-orchestrator Phase 5** adds "Manifest generation rules
|
||||
(REQUIRED for every step)" with mechanical derivation from `Files:`
|
||||
and Checkpoint. Validates regex compilation and path existence before
|
||||
handoff to plan-critic.
|
||||
- **Session-decomposer** parses plan manifests and propagates them
|
||||
verbatim into session specs. For v1.7+ plans with missing manifests:
|
||||
abort with pointer to failing step. For legacy v1.6 plans: synthesize
|
||||
minimal manifests and flag `legacy_synthesis: true`.
|
||||
- **ultraexecute-local Phase 2** parses manifest YAML. Ugyldig YAML =
|
||||
abort with pointer to step. v<1.7 plans: synthesize + log
|
||||
`legacy_plan: true`.
|
||||
- **ultraexecute-local Phase 6** — sub-step D renamed to D1 "Command
|
||||
verification"; new D2 "Manifest verification" runs after D1 with 5
|
||||
checks. F "Checkpoint" adds `checkpoint_drift` logging when HEAD
|
||||
message doesn't match `commit_message_pattern` (non-fatal).
|
||||
- **Phase 8 report** — table gets Manifest column; JSON summary adds
|
||||
`plan_version`, `manifest_audit`, `drift_details`, `recovery_dispatched`,
|
||||
`recovery_depth`, `legacy_plan`. Result vocabulary strict:
|
||||
`completed | partial | blocked | failed | stopped`.
|
||||
- **Division of labor clarified** in README — `/ultraresearch-local`
|
||||
gathers context (no decisions), `/ultraplan-local` transforms intent
|
||||
into an executable contract (manifests, plan-critic gate),
|
||||
`/ultraexecute-local` executes the contract disciplined (does NOT
|
||||
compensate for weak plans — escalates).
|
||||
|
||||
## [1.6.0] - 2026-04-08
|
||||
|
||||
### Added
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue