feat(ultraplan-local): v1.8.0 — close Opus 4.7 schema-drift gap

Opus 4.7 reads agent instructions more literally than 4.6. The v1.7 planning-orchestrator described the Step+Manifest schema via prose + procedural rules, which 4.6 inferred correctly but 4.7 sometimes rendered as narrative "Fase N" prose — producing plans ultraexecute Phase 2 rejected. First observed 2026-04-17 during llm-security v6.2.0 planning. v1.8.0 closes the gap: - planning-orchestrator Phase 5 embeds a literal copyable Step+Manifest example (JWT middleware) replacing "read the template" prose - Explicit forbidden-format clause: ## Fase N, ### Phase N, ### Stage N, and any non-"### Step N:" heading are denied - Phase 5.5 schema self-check: grep-verify canonical Step count matches Manifest count and narrative heading count is zero, before handing to plan-critic - ultraexecute-local --validate mode: schema-only check that parses steps + manifests, reports READY/FAIL with actionable error hints, no security scan, no execution. Fast sanity check between /ultraplan-local and full execution. Static verification: 17/17 PASS. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-17 18:01:14 +02:00 · 2026-04-17 18:01:14 +02:00 · 9ecd66929c
commit 9ecd66929c
parent 9f893c3858
7 changed files with 203 additions and 9 deletions
--- a/README.md
+++ b/README.md
@ -61,7 +61,7 @@ Key commands: `/config-audit posture`, `/config-audit feature-gap`, `/config-aud
 ---
-### [Ultra {research | plan | execute} - local](plugins/ultraplan-local/) `v1.7.0`
+### [Ultra {research | plan | execute} - local](plugins/ultraplan-local/) `v1.8.0`
 Deep research, implementation planning, and self-verifying execution with specialized agent swarms, adversarial review, and failure recovery.
@ -69,10 +69,12 @@ Three commands, one pipeline with clear division of labor:
 - **`/ultraresearch-local`** — Gather context. Deep multi-source research with triangulation: 5 local agents + 4 external agents + Gemini bridge, producing structured briefs with confidence ratings. Makes no build decisions.
 - **`/ultraplan-local`** — Transform intent into an executable contract. Per-step YAML manifests (`expected_paths`, `commit_message_pattern`, `bash_syntax_check`). Plan-critic is a hard gate on manifest quality. Accepts research briefs via `--research`.
- **`/ultraexecute-local`** — Execute the contract disciplined. Manifest-based verification, independent Phase 7.5 audit from git log + filesystem (ignores agent bookkeeping), Phase 7.6 bounded recovery dispatch for missing steps. Step 0 pre-flight catches sandbox push-denial before any work.
+- **`/ultraexecute-local`** — Execute the contract disciplined. Manifest-based verification, independent Phase 7.5 audit from git log + filesystem (ignores agent bookkeeping), Phase 7.6 bounded recovery dispatch for missing steps. Step 0 pre-flight catches sandbox push-denial before any work. `--validate` mode offers a fast schema-only sanity-check between planning and execution.
 v1.7 self-verifying chain: a step may not be marked `completed` unless its manifest verifies. The executor's last tool call before reporting must be a manifest check, not an arbitrary file review — preventing hallucinated completion.
 v1.8 closes the Opus 4.7 prompt-literalism gap: planning-orchestrator now emits a literal copyable Step+Manifest template (not prose rules), explicitly forbids narrative `Fase/Phase/Stage` headers, and runs a schema self-check before handing off to plan-critic. Prevents the "plan generated, executor rejects" loop observed when 4.7 inferred the v1.7 schema from prose descriptions alone.
 Defense-in-depth security: plugin hooks block destructive commands and sensitive path writes, prompt-level denylist works in headless sessions, pre-execution plan scan catches dangerous commands before they run, scoped `--allowedTools` replaces `--dangerously-skip-permissions` in parallel sessions.
 Modes: default, spec-driven, research-enriched, foreground, quick, decompose, export
--- a/plugins/ultraplan-local/.claude-plugin/plugin.json
+++ b/plugins/ultraplan-local/.claude-plugin/plugin.json
@ -1,7 +1,7 @@
 {
  "name": "ultraplan-local",
  "description": "Deep implementation planning and research with interview, specialized agent swarms, external research, triangulation, adversarial review, session decomposition, and headless execution support.",
-  "version": "1.7.0",
+  "version": "1.8.0",
  "author": {
    "name": "Kjell Tore Guttormsen"
  },
--- a/plugins/ultraplan-local/CHANGELOG.md
+++ b/plugins/ultraplan-local/CHANGELOG.md
@ -4,6 +4,62 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
 ## [1.8.0] - 2026-04-17
 ### Opus 4.7 prompt literalism — closing the schema-drift gap
 Opus 4.7 reads agent instructions more literally than 4.6 (per 4.7 system
 card §6.3.1.1). The v1.7 planning-orchestrator described the Step+Manifest
 schema via prose + procedural rules ("read the template"), which 4.6
 inferred correctly but 4.7 sometimes rendered as narrative "Fase N" prose.
 The result: plans that executed cleanly on 4.6 were rejected by
 ultraexecute Phase 2 parsing on 4.7 — first observed during v6.2.0 planning
 for `llm-security`. v1.8.0 closes the gap by replacing prose rules with a
 literal copyable template, explicit forbidden-format clauses, and a
 pre-handoff schema self-check.
 ### Added
 - **Inline literal Step+Manifest template** in `planning-orchestrator`
  Phase 5 — a complete, copyable example (JWT middleware step) replaces
  "read the template" prose. Removes ambiguity about heading format,
  field order, and manifest YAML structure.
 - **Forbidden heading-format clause** in Phase 5 — explicit denylist for
  `## Fase N`, `### Phase N`, `### Stage N`, and other narrative formats
  the executor cannot parse. Negative constraints land harder on 4.7.
 - **Phase 5.5 schema self-check** in `planning-orchestrator` — after
  writing the plan, grep-verify canonical `### Step N:` count matches
  `manifest:` count, and narrative heading count is zero. Rewrite plan
  if self-check fails, before handing to plan-critic.
 - **`--validate` mode** in `ultraexecute-local` — schema-only check that
  parses steps and manifests, reports `READY | FAIL` with specific
  error hints, and exits without security scan or execution. Intended
  as a fast sanity-check between `/ultraplan-local` and full execution:
  ```bash
  /ultraplan-local "task"
  /ultraexecute-local --validate <plan>.md   # READY or actionable FAIL
  /ultraexecute-local <plan>.md              # full execution
  ```
 ### Changed
 - `planning-orchestrator` Phase 5 now embeds the canonical Step template
  inline (~60 lines of literal example) rather than referring to
  `templates/plan-template.md`. Template file remains authoritative for
  cross-referencing but is no longer load-bearing for plan generation.
 - `ultraexecute-local` Phase 2.3 added as a hard exit point for
  `--validate` mode; Phase 2.4 security scan explicitly skips this mode.
 ### Rationale
 v1.7.0's self-verifying chain assumed the orchestrator reliably produces
 the v1.7 schema. That held on 4.6. v1.8.0 makes the assumption robust to
 4.7-style literal interpretation by moving from "describe the format" to
 "show the exact format and forbid alternatives", plus a self-check loop
 before human-visible output. Pairs with `--validate` as a user-facing
 verification step that catches any residual drift before execution side
 effects begin.
 ## [1.7.0] - 2026-04-12
 ### The self-verifying plan chain
--- a/plugins/ultraplan-local/CLAUDE.md
+++ b/plugins/ultraplan-local/CLAUDE.md
@ -45,6 +45,7 @@ Flags can be combined: `--local --fg`, `--external --quick`.
 | _(default)_ | Execute plan — auto-detects Execution Strategy for multi-session |
 | `--resume` | Resume from last progress checkpoint |
 | `--dry-run` | Validate plan structure without executing |
 | `--validate` | Schema-only check — parse steps + manifests, report `READY \| FAIL`, no execution (v1.8) |
 | `--step N` | Execute only step N |
 | `--fg` | Force foreground — run all steps sequentially, ignore Execution Strategy |
 | `--session N` | Execute only session N from plan's Execution Strategy |
--- a/plugins/ultraplan-local/README.md
+++ b/plugins/ultraplan-local/README.md
@ -1,6 +1,6 @@
 # ultraplan-local — Plan, Research, Execute
-![Version](https://img.shields.io/badge/version-1.7.0-blue)
+![Version](https://img.shields.io/badge/version-1.8.0-blue)
 ![License](https://img.shields.io/badge/license-MIT-green)
 ![Platform](https://img.shields.io/badge/platform-Claude%20Code-purple)
@ -233,6 +233,7 @@ Reads a plan from `/ultraplan-local` and implements it with strict discipline. N
 | **Default** | `/ultraexecute-local plan.md` | Auto-detects Execution Strategy, parallel if available |
 | **Resume** | `/ultraexecute-local plan.md --resume` | Resume from last progress checkpoint |
 | **Dry run** | `/ultraexecute-local plan.md --dry-run` | Validate plan structure + preview sessions and billing |
 | **Validate** | `/ultraexecute-local plan.md --validate` | Schema-only check — parse steps + manifests, report `READY \| FAIL`, no execution |
 | **Single step** | `/ultraexecute-local plan.md --step 3` | Execute only step 3 |
 | **Foreground** | `/ultraexecute-local plan.md --fg` | Force sequential, ignore Execution Strategy |
 | **Single session** | `/ultraexecute-local plan.md --session 2` | Execute only session 2 from Execution Strategy |
--- a/plugins/ultraplan-local/agents/planning-orchestrator.md
+++ b/plugins/ultraplan-local/agents/planning-orchestrator.md
@ -191,6 +191,62 @@ the title. This signals to ultraexecute-local that the plan includes per-step
 verification manifests and enables strict audit mode. Plans without this
 marker are treated as legacy v1.6 with synthesized minimal manifests.
 ### Mandatory step format — copy this exactly
 The Implementation Plan section MUST contain numbered steps using the EXACT
 format shown below. The executor (`ultraexecute-local`) parses plans with
 strict regex matching. Any deviation breaks parsing and forces the user to
 re-run planning.
 **FORBIDDEN heading formats** (the executor's parser rejects these):
 - `## Fase 1`, `### Fase 1` — Norwegian narrative format
 - `## Phase 1`, `### Phase 1` — narrative phase format
 - `## Stage 1`, `### Stage 1` — narrative stage format
 - `### 1.` or `### 1)` — numbered without "Step"
 - `### Step 1 —` (em-dash instead of colon)
 - Any heading that doesn't match the regex `^### Step \d+: `
 **REQUIRED heading format:** `### Step N: <description>` (where N is 1, 2, 3, ...
 and the colon is followed by a single space then the description).
 **REQUIRED step body** — every step MUST include all of these fields, in this
 order, formatted as bullet points:
 ```markdown
 ### Step 1: Add JWT verification middleware
 - **Files:** `src/middleware/jwt.ts`
 - **Changes:** Create new middleware function `verifyJWT(req, res, next)` that reads `Authorization: Bearer <token>` header, verifies signature with `process.env.JWT_SECRET`, attaches decoded payload to `req.user`, and returns 401 on invalid/missing token. (new file)
 - **Reuses:** `jsonwebtoken.verify()` (already in package.json), pattern from `src/middleware/cors.ts`
 - **Test first:**
  - File: `src/middleware/jwt.test.ts` (new)
  - Verifies: valid token attaches user; invalid token returns 401; missing header returns 401
  - Pattern: `src/middleware/cors.test.ts` (follow this style)
 - **Verify:** `npm test -- jwt.test.ts` → expected: `3 passing`
 - **On failure:** revert — `git checkout -- src/middleware/jwt.ts src/middleware/jwt.test.ts`
 - **Checkpoint:** `git commit -m "feat(auth): add JWT verification middleware"`
 - **Manifest:**
  ```yaml
  manifest:
    expected_paths:
      - src/middleware/jwt.ts
      - src/middleware/jwt.test.ts
    min_file_count: 2
    commit_message_pattern: "^feat\\(auth\\): add JWT verification middleware$"
    bash_syntax_check: []
    forbidden_paths:
      - src/middleware/cors.ts
    must_contain:
      - path: src/middleware/jwt.ts
        pattern: "verifyJWT"
  ```
 ```
 The example above is the canonical shape. Substitute your own file paths,
 descriptions, and patterns — but preserve the exact heading format, bullet
 field names, and Manifest YAML structure. Do not invent new field names. Do
 not skip fields. Do not nest steps under sub-headings.
 ### Manifest generation rules (REQUIRED for every step)
 Every implementation step MUST include a `Manifest:` block as its last field,
@ -230,6 +286,32 @@ Derive the manifest fields mechanically from the step's other fields:
 If any validation fails, fix the plan before handing to Phase 6 review.
 ### Phase 5.5 — Schema self-check (REQUIRED before Phase 6)
 After writing the plan file, verify the output conforms to the executor's
 parser BEFORE handing to plan-critic. Use Bash to grep the plan file:
 ```bash
 # Count canonical step headings
 grep -c '^### Step [0-9]\+: ' "$plan_path"
 # Count manifest blocks
 grep -c '^  manifest:' "$plan_path"
 # Detect forbidden narrative formats
 grep -cE '^(##|###) (Fase|Phase|Stage) [0-9]' "$plan_path"
 ```
 **Pass criteria:**
 - Step count ≥ 1
 - Manifest count == Step count
 - Forbidden narrative count == 0
 **If the plan fails schema self-check:** rewrite the Implementation Plan
 section using the exact literal template shown earlier in Phase 5. Do NOT
 proceed to Phase 6 with a schema-failing plan — plan-critic cannot repair
 format drift, only content issues.
 ### Failure recovery (REQUIRED for every step)
 Each implementation step MUST include:
--- a/plugins/ultraplan-local/commands/ultraexecute-local.md
+++ b/plugins/ultraplan-local/commands/ultraexecute-local.md
@ -1,7 +1,7 @@
 ---
 name: ultraexecute-local
 description: Disciplined plan executor — single-session or multi-session with parallel orchestration, failure recovery, and headless support
-argument-hint: "[--fg | --resume | --dry-run | --step N | --session N] <plan.md>"
+argument-hint: "[--fg | --resume | --dry-run | --validate | --step N | --session N] <plan.md>"
 model: opus
 allowed-tools: Read, Write, Edit, Bash, Glob, Grep, AskUserQuestion
 ---
@ -21,11 +21,12 @@ Parse `$ARGUMENTS` for mode flags:
 1. If arguments contain `--fg`: extract the file path. Set **mode = foreground**.
 2. If arguments contain `--resume`: extract the file path. Set **mode = resume**.
 3. If arguments contain `--dry-run`: extract the file path. Set **mode = dry-run**.
-4. If arguments contain `--step N` (N is a positive integer): extract N and the file path.
+4. If arguments contain `--validate`: extract the file path. Set **mode = validate**.
 5. If arguments contain `--step N` (N is a positive integer): extract N and the file path.
   Set **mode = step**, **target-step = N**.
-5. If arguments contain `--session N` (N is a positive integer): extract N and the file path.
+6. If arguments contain `--session N` (N is a positive integer): extract N and the file path.
   Set **mode = session**, **target-session = N**.
-6. Otherwise: the entire argument string is the file path. Set **mode = execute**.
+7. Otherwise: the entire argument string is the file path. Set **mode = execute**.
 If no path is provided, output usage and stop:
@ -34,6 +35,7 @@ Usage: /ultraexecute-local <plan.md>
       /ultraexecute-local --fg <plan.md>
       /ultraexecute-local --resume <plan.md>
       /ultraexecute-local --dry-run <plan.md>
       /ultraexecute-local --validate <plan.md>
       /ultraexecute-local --step N <plan.md>
       /ultraexecute-local --session N <plan.md>
@ -42,6 +44,7 @@ Modes:
  --fg           Force foreground — all steps sequentially, ignore Execution Strategy
  --resume       Resume from last progress checkpoint
  --dry-run      Validate plan and show execution strategy without running
  --validate     Schema-only check — parse steps + manifests, no security scan, no execution
  --step N       Execute only step N (foreground)
  --session N    Execute only session N from the plan's Execution Strategy
@ -50,6 +53,7 @@ Examples:
  /ultraexecute-local --fg .claude/plans/ultraplan-2026-04-06-auth-refactor.md
  /ultraexecute-local --session 2 .claude/plans/ultraplan-2026-04-06-auth-refactor.md
  /ultraexecute-local --dry-run .claude/plans/ultraplan-2026-04-06-auth-refactor.md
  /ultraexecute-local --validate .claude/plans/ultraplan-2026-04-06-auth-refactor.md
 ```
 If the file does not exist, report and stop:
@ -153,9 +157,57 @@ Steps: {N}
 {if warnings}: Warnings: {list}
 ```
 ## Phase 2.3 — Validate-only mode exit (if mode = validate)
 **If mode = validate, stop after Phase 2 parsing** and emit a schema-only
 report. Do NOT run security scan, do NOT touch progress files, do NOT
 execute any steps. This gives the user a fast sanity-check of plan
 schema compliance without side effects.
 If Phase 2 parsing succeeded (no fatal errors, every step has a valid
 Manifest block in strict mode, or synthesized manifests in legacy mode):
 ```
 === Schema Validation: READY ===
 File: {path}
 Type: {plan | session-spec}
 plan_version: {1.7 | legacy}
 Steps: {N}
 Manifests: {N valid | N synthesized (legacy)}
 Warnings: {count}
 {if warnings}: - {each warning on own line}
 Plan is schema-compliant. Safe to run:
  /ultraexecute-local {path}
 ```
 If Phase 2 parsing failed (unrecognized format, missing Manifest in strict
 mode, malformed YAML, invalid regex):
 ```
 === Schema Validation: FAIL ===
 File: {path}
 Reason: {specific error from Phase 2}
 {if format not recognized}:
  Detected heading format: {e.g. "### Fase 1:", "## Phase 1"}
  Expected: "### Step N: <description>"
  Fix: re-run /ultraplan-local — planning-orchestrator must emit v1.7 format
 {if missing manifest}:
  Step {N} has no Manifest block (plan_version=1.7 requires one per step)
  Fix: re-run /ultraplan-local — planning-orchestrator must include manifest YAML
 {if malformed YAML or invalid regex}:
  Step {N}: {specific YAML/regex error}
  Fix: edit the plan manually or re-run /ultraplan-local
 ```
 Exit after emitting the report. Do not continue to Phase 2.4 or later.
 ## Phase 2.4 — Pre-execution security scan
-**Runs for all modes except dry-run** (dry-run has its own report format).
+**Runs for all modes except dry-run and validate** (those modes exit earlier or have their own report format).
 Scan every `Verify:` and `Checkpoint:` command in the parsed plan against the
 executor security denylist. This catches dangerous commands before execution begins.