feat(ultraplan-local): v1.8.0 — close Opus 4.7 schema-drift gap

Opus 4.7 reads agent instructions more literally than 4.6. The v1.7
planning-orchestrator described the Step+Manifest schema via prose +
procedural rules, which 4.6 inferred correctly but 4.7 sometimes
rendered as narrative "Fase N" prose — producing plans ultraexecute
Phase 2 rejected. First observed 2026-04-17 during llm-security v6.2.0
planning.

v1.8.0 closes the gap:

- planning-orchestrator Phase 5 embeds a literal copyable Step+Manifest
  example (JWT middleware) replacing "read the template" prose
- Explicit forbidden-format clause: ## Fase N, ### Phase N, ### Stage N,
  and any non-"### Step N:" heading are denied
- Phase 5.5 schema self-check: grep-verify canonical Step count matches
  Manifest count and narrative heading count is zero, before handing to
  plan-critic
- ultraexecute-local --validate mode: schema-only check that parses
  steps + manifests, reports READY/FAIL with actionable error hints,
  no security scan, no execution. Fast sanity check between
  /ultraplan-local and full execution.

Static verification: 17/17 PASS.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This commit is contained in:
Kjell Tore Guttormsen 2026-04-17 18:01:14 +02:00
commit 9ecd66929c
7 changed files with 203 additions and 9 deletions

View file

@ -4,6 +4,62 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
## [1.8.0] - 2026-04-17
### Opus 4.7 prompt literalism — closing the schema-drift gap
Opus 4.7 reads agent instructions more literally than 4.6 (per 4.7 system
card §6.3.1.1). The v1.7 planning-orchestrator described the Step+Manifest
schema via prose + procedural rules ("read the template"), which 4.6
inferred correctly but 4.7 sometimes rendered as narrative "Fase N" prose.
The result: plans that executed cleanly on 4.6 were rejected by
ultraexecute Phase 2 parsing on 4.7 — first observed during v6.2.0 planning
for `llm-security`. v1.8.0 closes the gap by replacing prose rules with a
literal copyable template, explicit forbidden-format clauses, and a
pre-handoff schema self-check.
### Added
- **Inline literal Step+Manifest template** in `planning-orchestrator`
Phase 5 — a complete, copyable example (JWT middleware step) replaces
"read the template" prose. Removes ambiguity about heading format,
field order, and manifest YAML structure.
- **Forbidden heading-format clause** in Phase 5 — explicit denylist for
`## Fase N`, `### Phase N`, `### Stage N`, and other narrative formats
the executor cannot parse. Negative constraints land harder on 4.7.
- **Phase 5.5 schema self-check** in `planning-orchestrator` — after
writing the plan, grep-verify canonical `### Step N:` count matches
`manifest:` count, and narrative heading count is zero. Rewrite plan
if self-check fails, before handing to plan-critic.
- **`--validate` mode** in `ultraexecute-local` — schema-only check that
parses steps and manifests, reports `READY | FAIL` with specific
error hints, and exits without security scan or execution. Intended
as a fast sanity-check between `/ultraplan-local` and full execution:
```bash
/ultraplan-local "task"
/ultraexecute-local --validate <plan>.md # READY or actionable FAIL
/ultraexecute-local <plan>.md # full execution
```
### Changed
- `planning-orchestrator` Phase 5 now embeds the canonical Step template
inline (~60 lines of literal example) rather than referring to
`templates/plan-template.md`. Template file remains authoritative for
cross-referencing but is no longer load-bearing for plan generation.
- `ultraexecute-local` Phase 2.3 added as a hard exit point for
`--validate` mode; Phase 2.4 security scan explicitly skips this mode.
### Rationale
v1.7.0's self-verifying chain assumed the orchestrator reliably produces
the v1.7 schema. That held on 4.6. v1.8.0 makes the assumption robust to
4.7-style literal interpretation by moving from "describe the format" to
"show the exact format and forbid alternatives", plus a self-check loop
before human-visible output. Pairs with `--validate` as a user-facing
verification step that catches any residual drift before execution side
effects begin.
## [1.7.0] - 2026-04-12
### The self-verifying plan chain