feat(ultraplan-local): v1.8.0 — close Opus 4.7 schema-drift gap

Opus 4.7 reads agent instructions more literally than 4.6. The v1.7 planning-orchestrator described the Step+Manifest schema via prose + procedural rules, which 4.6 inferred correctly but 4.7 sometimes rendered as narrative "Fase N" prose — producing plans ultraexecute Phase 2 rejected. First observed 2026-04-17 during llm-security v6.2.0 planning. v1.8.0 closes the gap: - planning-orchestrator Phase 5 embeds a literal copyable Step+Manifest example (JWT middleware) replacing "read the template" prose - Explicit forbidden-format clause: ## Fase N, ### Phase N, ### Stage N, and any non-"### Step N:" heading are denied - Phase 5.5 schema self-check: grep-verify canonical Step count matches Manifest count and narrative heading count is zero, before handing to plan-critic - ultraexecute-local --validate mode: schema-only check that parses steps + manifests, reports READY/FAIL with actionable error hints, no security scan, no execution. Fast sanity check between /ultraplan-local and full execution. Static verification: 17/17 PASS. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-17 18:01:14 +02:00 · 2026-04-17 18:01:14 +02:00 · 9ecd66929c
commit 9ecd66929c
parent 9f893c3858
7 changed files with 203 additions and 9 deletions
--- a/README.md
+++ b/README.md
@ -61,7 +61,7 @@ Key commands: `/config-audit posture`, `/config-audit feature-gap`, `/config-aud

 ---

-### [Ultra {research | plan | execute} - local](plugins/ultraplan-local/) `v1.7.0`
+### [Ultra {research | plan | execute} - local](plugins/ultraplan-local/) `v1.8.0`

 Deep research, implementation planning, and self-verifying execution with specialized agent swarms, adversarial review, and failure recovery.

@ -69,10 +69,12 @@ Three commands, one pipeline with clear division of labor:

 - **`/ultraresearch-local`** — Gather context. Deep multi-source research with triangulation: 5 local agents + 4 external agents + Gemini bridge, producing structured briefs with confidence ratings. Makes no build decisions.
 - **`/ultraplan-local`** — Transform intent into an executable contract. Per-step YAML manifests (`expected_paths`, `commit_message_pattern`, `bash_syntax_check`). Plan-critic is a hard gate on manifest quality. Accepts research briefs via `--research`.
- **`/ultraexecute-local`** — Execute the contract disciplined. Manifest-based verification, independent Phase 7.5 audit from git log + filesystem (ignores agent bookkeeping), Phase 7.6 bounded recovery dispatch for missing steps. Step 0 pre-flight catches sandbox push-denial before any work.
+- **`/ultraexecute-local`** — Execute the contract disciplined. Manifest-based verification, independent Phase 7.5 audit from git log + filesystem (ignores agent bookkeeping), Phase 7.6 bounded recovery dispatch for missing steps. Step 0 pre-flight catches sandbox push-denial before any work. `--validate` mode offers a fast schema-only sanity-check between planning and execution.

 v1.7 self-verifying chain: a step may not be marked `completed` unless its manifest verifies. The executor's last tool call before reporting must be a manifest check, not an arbitrary file review — preventing hallucinated completion.

+v1.8 closes the Opus 4.7 prompt-literalism gap: planning-orchestrator now emits a literal copyable Step+Manifest template (not prose rules), explicitly forbids narrative `Fase/Phase/Stage` headers, and runs a schema self-check before handing off to plan-critic. Prevents the "plan generated, executor rejects" loop observed when 4.7 inferred the v1.7 schema from prose descriptions alone.
+
 Defense-in-depth security: plugin hooks block destructive commands and sensitive path writes, prompt-level denylist works in headless sessions, pre-execution plan scan catches dangerous commands before they run, scoped `--allowedTools` replaces `--dangerously-skip-permissions` in parallel sessions.

 Modes: default, spec-driven, research-enriched, foreground, quick, decompose, export
--- a/plugins/ultraplan-local/.claude-plugin/plugin.json
+++ b/plugins/ultraplan-local/.claude-plugin/plugin.json
@ -1,7 +1,7 @@
 {
  "name": "ultraplan-local",
  "description": "Deep implementation planning and research with interview, specialized agent swarms, external research, triangulation, adversarial review, session decomposition, and headless execution support.",
-  "version": "1.7.0",
+  "version": "1.8.0",
  "author": {
    "name": "Kjell Tore Guttormsen"
  },
--- a/plugins/ultraplan-local/CHANGELOG.md
+++ b/plugins/ultraplan-local/CHANGELOG.md
@ -4,6 +4,62 @@ All notable changes to this project will be documented in this file.

 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).

+## [1.8.0] - 2026-04-17
+
+### Opus 4.7 prompt literalism — closing the schema-drift gap
+
+Opus 4.7 reads agent instructions more literally than 4.6 (per 4.7 system
+card §6.3.1.1). The v1.7 planning-orchestrator described the Step+Manifest
+schema via prose + procedural rules ("read the template"), which 4.6
+inferred correctly but 4.7 sometimes rendered as narrative "Fase N" prose.
+The result: plans that executed cleanly on 4.6 were rejected by
+ultraexecute Phase 2 parsing on 4.7 — first observed during v6.2.0 planning
+for `llm-security`. v1.8.0 closes the gap by replacing prose rules with a
+literal copyable template, explicit forbidden-format clauses, and a
+pre-handoff schema self-check.
+
+### Added
+
+- **Inline literal Step+Manifest template** in `planning-orchestrator`
+  Phase 5 — a complete, copyable example (JWT middleware step) replaces
+  "read the template" prose. Removes ambiguity about heading format,
+  field order, and manifest YAML structure.
+- **Forbidden heading-format clause** in Phase 5 — explicit denylist for
+  `## Fase N`, `### Phase N`, `### Stage N`, and other narrative formats
+  the executor cannot parse. Negative constraints land harder on 4.7.
+- **Phase 5.5 schema self-check** in `planning-orchestrator` — after
+  writing the plan, grep-verify canonical `### Step N:` count matches
+  `manifest:` count, and narrative heading count is zero. Rewrite plan
+  if self-check fails, before handing to plan-critic.
+- **`--validate` mode** in `ultraexecute-local` — schema-only check that
+  parses steps and manifests, reports `READY | FAIL` with specific
+  error hints, and exits without security scan or execution. Intended
+  as a fast sanity-check between `/ultraplan-local` and full execution:
+  ```bash
+  /ultraplan-local "task"
+  /ultraexecute-local --validate <plan>.md   # READY or actionable FAIL
+  /ultraexecute-local <plan>.md              # full execution
+  ```
+
+### Changed
+
+- `planning-orchestrator` Phase 5 now embeds the canonical Step template
+  inline (~60 lines of literal example) rather than referring to
+  `templates/plan-template.md`. Template file remains authoritative for
+  cross-referencing but is no longer load-bearing for plan generation.
+- `ultraexecute-local` Phase 2.3 added as a hard exit point for
+  `--validate` mode; Phase 2.4 security scan explicitly skips this mode.
+
+### Rationale
+
+v1.7.0's self-verifying chain assumed the orchestrator reliably produces
+the v1.7 schema. That held on 4.6. v1.8.0 makes the assumption robust to
+4.7-style literal interpretation by moving from "describe the format" to
+"show the exact format and forbid alternatives", plus a self-check loop
+before human-visible output. Pairs with `--validate` as a user-facing
+verification step that catches any residual drift before execution side
+effects begin.
+
 ## [1.7.0] - 2026-04-12

 ### The self-verifying plan chain
--- a/plugins/ultraplan-local/CLAUDE.md
+++ b/plugins/ultraplan-local/CLAUDE.md
@ -45,6 +45,7 @@ Flags can be combined: `--local --fg`, `--external --quick`.
 | _(default)_ | Execute plan — auto-detects Execution Strategy for multi-session |
 | `--resume` | Resume from last progress checkpoint |
 | `--dry-run` | Validate plan structure without executing |
+| `--validate` | Schema-only check — parse steps + manifests, report `READY \| FAIL`, no execution (v1.8) |
 | `--step N` | Execute only step N |
 | `--fg` | Force foreground — run all steps sequentially, ignore Execution Strategy |
 | `--session N` | Execute only session N from plan's Execution Strategy |
--- a/plugins/ultraplan-local/README.md
+++ b/plugins/ultraplan-local/README.md
@ -1,6 +1,6 @@
 # ultraplan-local — Plan, Research, Execute

-![Version](https://img.shields.io/badge/version-1.7.0-blue)
+![Version](https://img.shields.io/badge/version-1.8.0-blue)
 ![License](https://img.shields.io/badge/license-MIT-green)
 ![Platform](https://img.shields.io/badge/platform-Claude%20Code-purple)

@ -233,6 +233,7 @@ Reads a plan from `/ultraplan-local` and implements it with strict discipline. N
 | **Default** | `/ultraexecute-local plan.md` | Auto-detects Execution Strategy, parallel if available |
 | **Resume** | `/ultraexecute-local plan.md --resume` | Resume from last progress checkpoint |
 | **Dry run** | `/ultraexecute-local plan.md --dry-run` | Validate plan structure + preview sessions and billing |
+| **Validate** | `/ultraexecute-local plan.md --validate` | Schema-only check — parse steps + manifests, report `READY \| FAIL`, no execution |
 | **Single step** | `/ultraexecute-local plan.md --step 3` | Execute only step 3 |
 | **Foreground** | `/ultraexecute-local plan.md --fg` | Force sequential, ignore Execution Strategy |
 | **Single session** | `/ultraexecute-local plan.md --session 2` | Execute only session 2 from Execution Strategy |
--- a/plugins/ultraplan-local/agents/planning-orchestrator.md
+++ b/plugins/ultraplan-local/agents/planning-orchestrator.md
@ -191,6 +191,62 @@ the title. This signals to ultraexecute-local that the plan includes per-step
 verification manifests and enables strict audit mode. Plans without this
 marker are treated as legacy v1.6 with synthesized minimal manifests.

+### Mandatory step format — copy this exactly
+
+The Implementation Plan section MUST contain numbered steps using the EXACT
+format shown below. The executor (`ultraexecute-local`) parses plans with
+strict regex matching. Any deviation breaks parsing and forces the user to
+re-run planning.
+
+**FORBIDDEN heading formats** (the executor's parser rejects these):
+- `## Fase 1`, `### Fase 1` — Norwegian narrative format
+- `## Phase 1`, `### Phase 1` — narrative phase format
+- `## Stage 1`, `### Stage 1` — narrative stage format
+- `### 1.` or `### 1)` — numbered without "Step"
+- `### Step 1 —` (em-dash instead of colon)
+- Any heading that doesn't match the regex `^### Step \d+: `
+
+**REQUIRED heading format:** `### Step N: <description>` (where N is 1, 2, 3, ...
+and the colon is followed by a single space then the description).
+
+**REQUIRED step body** — every step MUST include all of these fields, in this
+order, formatted as bullet points:
+
+```markdown
+### Step 1: Add JWT verification middleware
+
+- **Files:** `src/middleware/jwt.ts`
+- **Changes:** Create new middleware function `verifyJWT(req, res, next)` that reads `Authorization: Bearer <token>` header, verifies signature with `process.env.JWT_SECRET`, attaches decoded payload to `req.user`, and returns 401 on invalid/missing token. (new file)
+- **Reuses:** `jsonwebtoken.verify()` (already in package.json), pattern from `src/middleware/cors.ts`
+- **Test first:**
+  - File: `src/middleware/jwt.test.ts` (new)
+  - Verifies: valid token attaches user; invalid token returns 401; missing header returns 401
+  - Pattern: `src/middleware/cors.test.ts` (follow this style)
+- **Verify:** `npm test -- jwt.test.ts` → expected: `3 passing`
+- **On failure:** revert — `git checkout -- src/middleware/jwt.ts src/middleware/jwt.test.ts`
+- **Checkpoint:** `git commit -m "feat(auth): add JWT verification middleware"`
+- **Manifest:**
+  ```yaml
+  manifest:
+    expected_paths:
+      - src/middleware/jwt.ts
+      - src/middleware/jwt.test.ts
+    min_file_count: 2
+    commit_message_pattern: "^feat\\(auth\\): add JWT verification middleware$"
+    bash_syntax_check: []
+    forbidden_paths:
+      - src/middleware/cors.ts
+    must_contain:
+      - path: src/middleware/jwt.ts
+        pattern: "verifyJWT"
+  ```
+```
+
+The example above is the canonical shape. Substitute your own file paths,
+descriptions, and patterns — but preserve the exact heading format, bullet
+field names, and Manifest YAML structure. Do not invent new field names. Do
+not skip fields. Do not nest steps under sub-headings.
+
 ### Manifest generation rules (REQUIRED for every step)

 Every implementation step MUST include a `Manifest:` block as its last field,
@ -230,6 +286,32 @@ Derive the manifest fields mechanically from the step's other fields:

 If any validation fails, fix the plan before handing to Phase 6 review.

+### Phase 5.5 — Schema self-check (REQUIRED before Phase 6)
+
+After writing the plan file, verify the output conforms to the executor's
+parser BEFORE handing to plan-critic. Use Bash to grep the plan file:
+
+```bash
+# Count canonical step headings
+grep -c '^### Step [0-9]\+: ' "$plan_path"
+
+# Count manifest blocks
+grep -c '^  manifest:' "$plan_path"
+
+# Detect forbidden narrative formats
+grep -cE '^(##|###) (Fase|Phase|Stage) [0-9]' "$plan_path"
+```
+
+**Pass criteria:**
+- Step count ≥ 1
+- Manifest count == Step count
+- Forbidden narrative count == 0
+
+**If the plan fails schema self-check:** rewrite the Implementation Plan
+section using the exact literal template shown earlier in Phase 5. Do NOT
+proceed to Phase 6 with a schema-failing plan — plan-critic cannot repair
+format drift, only content issues.
+
 ### Failure recovery (REQUIRED for every step)

 Each implementation step MUST include:
--- a/plugins/ultraplan-local/commands/ultraexecute-local.md
+++ b/plugins/ultraplan-local/commands/ultraexecute-local.md
@ -1,7 +1,7 @@
 ---
 name: ultraexecute-local
 description: Disciplined plan executor — single-session or multi-session with parallel orchestration, failure recovery, and headless support
-argument-hint: "[--fg | --resume | --dry-run | --step N | --session N] <plan.md>"
+argument-hint: "[--fg | --resume | --dry-run | --validate | --step N | --session N] <plan.md>"
 model: opus
 allowed-tools: Read, Write, Edit, Bash, Glob, Grep, AskUserQuestion
 ---
@ -21,11 +21,12 @@ Parse `$ARGUMENTS` for mode flags:
 1. If arguments contain `--fg`: extract the file path. Set **mode = foreground**.
 2. If arguments contain `--resume`: extract the file path. Set **mode = resume**.
 3. If arguments contain `--dry-run`: extract the file path. Set **mode = dry-run**.
-4. If arguments contain `--step N` (N is a positive integer): extract N and the file path.
+4. If arguments contain `--validate`: extract the file path. Set **mode = validate**.
+5. If arguments contain `--step N` (N is a positive integer): extract N and the file path.
   Set **mode = step**, **target-step = N**.
-5. If arguments contain `--session N` (N is a positive integer): extract N and the file path.
+6. If arguments contain `--session N` (N is a positive integer): extract N and the file path.
   Set **mode = session**, **target-session = N**.
-6. Otherwise: the entire argument string is the file path. Set **mode = execute**.
+7. Otherwise: the entire argument string is the file path. Set **mode = execute**.

 If no path is provided, output usage and stop:

@ -34,6 +35,7 @@ Usage: /ultraexecute-local <plan.md>
       /ultraexecute-local --fg <plan.md>
       /ultraexecute-local --resume <plan.md>
       /ultraexecute-local --dry-run <plan.md>
+       /ultraexecute-local --validate <plan.md>
       /ultraexecute-local --step N <plan.md>
       /ultraexecute-local --session N <plan.md>

@ -42,6 +44,7 @@ Modes:
  --fg           Force foreground — all steps sequentially, ignore Execution Strategy
  --resume       Resume from last progress checkpoint
  --dry-run      Validate plan and show execution strategy without running
+  --validate     Schema-only check — parse steps + manifests, no security scan, no execution
  --step N       Execute only step N (foreground)
  --session N    Execute only session N from the plan's Execution Strategy

@ -50,6 +53,7 @@ Examples:
  /ultraexecute-local --fg .claude/plans/ultraplan-2026-04-06-auth-refactor.md
  /ultraexecute-local --session 2 .claude/plans/ultraplan-2026-04-06-auth-refactor.md
  /ultraexecute-local --dry-run .claude/plans/ultraplan-2026-04-06-auth-refactor.md
+  /ultraexecute-local --validate .claude/plans/ultraplan-2026-04-06-auth-refactor.md
 ```

 If the file does not exist, report and stop:
@ -153,9 +157,57 @@ Steps: {N}
 {if warnings}: Warnings: {list}
 ```

+## Phase 2.3 — Validate-only mode exit (if mode = validate)
+
+**If mode = validate, stop after Phase 2 parsing** and emit a schema-only
+report. Do NOT run security scan, do NOT touch progress files, do NOT
+execute any steps. This gives the user a fast sanity-check of plan
+schema compliance without side effects.
+
+If Phase 2 parsing succeeded (no fatal errors, every step has a valid
+Manifest block in strict mode, or synthesized manifests in legacy mode):
+
+```
+=== Schema Validation: READY ===
+File: {path}
+Type: {plan | session-spec}
+plan_version: {1.7 | legacy}
+Steps: {N}
+Manifests: {N valid | N synthesized (legacy)}
+Warnings: {count}
+{if warnings}: - {each warning on own line}
+
+Plan is schema-compliant. Safe to run:
+  /ultraexecute-local {path}
+```
+
+If Phase 2 parsing failed (unrecognized format, missing Manifest in strict
+mode, malformed YAML, invalid regex):
+
+```
+=== Schema Validation: FAIL ===
+File: {path}
+Reason: {specific error from Phase 2}
+
+{if format not recognized}:
+  Detected heading format: {e.g. "### Fase 1:", "## Phase 1"}
+  Expected: "### Step N: <description>"
+  Fix: re-run /ultraplan-local — planning-orchestrator must emit v1.7 format
+
+{if missing manifest}:
+  Step {N} has no Manifest block (plan_version=1.7 requires one per step)
+  Fix: re-run /ultraplan-local — planning-orchestrator must include manifest YAML
+
+{if malformed YAML or invalid regex}:
+  Step {N}: {specific YAML/regex error}
+  Fix: edit the plan manually or re-run /ultraplan-local
+```
+
+Exit after emitting the report. Do not continue to Phase 2.4 or later.
+
 ## Phase 2.4 — Pre-execution security scan

-**Runs for all modes except dry-run** (dry-run has its own report format).
+**Runs for all modes except dry-run and validate** (those modes exit earlier or have their own report format).

 Scan every `Verify:` and `Checkpoint:` command in the parsed plan against the
 executor security denylist. This catches dangerous commands before execution begins.