feat(ultraplan-local): v1.7.0 — self-verifying plan chain
Wave 1 of a 6-session parallel build revealed three failure modes: (1) hallucinated completion (status=completed after 2/5 steps, last tool call was an arbitrary file review), (2) fail-late bash (3/6 sessions had push blocked inside sub-agent sandbox after all work was done), (3) no objective verification (plans were prose). v1.7 closes all three by making the plan an executable contract. Per-step YAML manifest (expected_paths, commit_message_pattern, bash_syntax_check, forbidden_paths, must_contain) is the objective completion predicate. Plan-critic dimension 10 (Manifest quality) is a hard gate. Session decomposer propagates manifests verbatim and emits an obligatory Step 0 pre-flight (git push --dry-run, exit 77 sentinel) in every session spec. ultraexecute-local gets Phase 7.5 (independent manifest audit from git log + filesystem, ignoring agent bookkeeping) and Phase 7.6 (bounded recovery dispatch, recovery_depth ≤ 2). Hard Rule 17 forbids marking a step passed without manifest verification. Hard Rule 18 forbids ending on an arbitrary tool call before reporting. Division of labor is made explicit: - /ultraresearch-local gathers context (no build decisions) - /ultraplan-local produces an executable contract (manifests, plan-critic gate) - /ultraexecute-local executes disciplined (does NOT compensate for weak plans — escalates) Code complete. Docs partial (Arbeidsdeling table + manifest section added to plugin + marketplace READMEs). Verification tests (10-sequence) pending — see REMEMBER.md. Backward compat: v1.6 plans without plan_version marker get legacy mode with synthesized manifests and legacy_plan: true in progress file. Plan-critic emits advisory, not block. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
parent
72f2e8f6c9
commit
d1befac35a
11 changed files with 651 additions and 27 deletions
14
README.md
14
README.md
|
|
@ -59,15 +59,17 @@ Key commands: `/config-audit posture`, `/config-audit discover`, `/config-audit
|
|||
|
||||
---
|
||||
|
||||
### [Ultra {research | plan | execute} - local](plugins/ultraplan-local/) `v1.6.0`
|
||||
### [Ultra {research | plan | execute} - local](plugins/ultraplan-local/) `v1.7.0`
|
||||
|
||||
Deep research, implementation planning, and autonomous execution with specialized agent swarms, adversarial review, and failure recovery.
|
||||
Deep research, implementation planning, and self-verifying execution with specialized agent swarms, adversarial review, and failure recovery.
|
||||
|
||||
Three commands, one pipeline: research first, then plan, then execute.
|
||||
Three commands, one pipeline with clear division of labor:
|
||||
|
||||
- **`/ultraresearch-local`** — Deep multi-source research with triangulation: 5 local agents + 4 external agents + Gemini bridge, producing structured briefs with confidence ratings
|
||||
- **`/ultraplan-local`** — Interview, 6-8 specialized agents explore the codebase in parallel, adversarial review by plan-critic and scope-guardian. Accepts research briefs via `--research`
|
||||
- **`/ultraexecute-local`** — Step-by-step implementation with git checkpoints, automatic failure recovery, and parallel session decomposition
|
||||
- **`/ultraresearch-local`** — Gather context. Deep multi-source research with triangulation: 5 local agents + 4 external agents + Gemini bridge, producing structured briefs with confidence ratings. Makes no build decisions.
|
||||
- **`/ultraplan-local`** — Transform intent into an executable contract. Per-step YAML manifests (`expected_paths`, `commit_message_pattern`, `bash_syntax_check`). Plan-critic is a hard gate on manifest quality. Accepts research briefs via `--research`.
|
||||
- **`/ultraexecute-local`** — Execute the contract disciplined. Manifest-based verification, independent Phase 7.5 audit from git log + filesystem (ignores agent bookkeeping), Phase 7.6 bounded recovery dispatch for missing steps. Step 0 pre-flight catches sandbox push-denial before any work.
|
||||
|
||||
v1.7 self-verifying chain: a step may not be marked `completed` unless its manifest verifies. The executor's last tool call before reporting must be a manifest check, not an arbitrary file review — preventing hallucinated completion.
|
||||
|
||||
Defense-in-depth security: plugin hooks block destructive commands and sensitive path writes, prompt-level denylist works in headless sessions, pre-execution plan scan catches dangerous commands before they run, scoped `--allowedTools` replaces `--dangerously-skip-permissions` in parallel sessions.
|
||||
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue