feat(ultraplan-local): v1.8.0 — close Opus 4.7 schema-drift gap
Opus 4.7 reads agent instructions more literally than 4.6. The v1.7 planning-orchestrator described the Step+Manifest schema via prose + procedural rules, which 4.6 inferred correctly but 4.7 sometimes rendered as narrative "Fase N" prose — producing plans ultraexecute Phase 2 rejected. First observed 2026-04-17 during llm-security v6.2.0 planning. v1.8.0 closes the gap: - planning-orchestrator Phase 5 embeds a literal copyable Step+Manifest example (JWT middleware) replacing "read the template" prose - Explicit forbidden-format clause: ## Fase N, ### Phase N, ### Stage N, and any non-"### Step N:" heading are denied - Phase 5.5 schema self-check: grep-verify canonical Step count matches Manifest count and narrative heading count is zero, before handing to plan-critic - ultraexecute-local --validate mode: schema-only check that parses steps + manifests, reports READY/FAIL with actionable error hints, no security scan, no execution. Fast sanity check between /ultraplan-local and full execution. Static verification: 17/17 PASS. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This commit is contained in:
parent
9f893c3858
commit
9ecd66929c
7 changed files with 203 additions and 9 deletions
|
|
@ -4,6 +4,62 @@ All notable changes to this project will be documented in this file.
|
|||
|
||||
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
|
||||
|
||||
## [1.8.0] - 2026-04-17
|
||||
|
||||
### Opus 4.7 prompt literalism — closing the schema-drift gap
|
||||
|
||||
Opus 4.7 reads agent instructions more literally than 4.6 (per 4.7 system
|
||||
card §6.3.1.1). The v1.7 planning-orchestrator described the Step+Manifest
|
||||
schema via prose + procedural rules ("read the template"), which 4.6
|
||||
inferred correctly but 4.7 sometimes rendered as narrative "Fase N" prose.
|
||||
The result: plans that executed cleanly on 4.6 were rejected by
|
||||
ultraexecute Phase 2 parsing on 4.7 — first observed during v6.2.0 planning
|
||||
for `llm-security`. v1.8.0 closes the gap by replacing prose rules with a
|
||||
literal copyable template, explicit forbidden-format clauses, and a
|
||||
pre-handoff schema self-check.
|
||||
|
||||
### Added
|
||||
|
||||
- **Inline literal Step+Manifest template** in `planning-orchestrator`
|
||||
Phase 5 — a complete, copyable example (JWT middleware step) replaces
|
||||
"read the template" prose. Removes ambiguity about heading format,
|
||||
field order, and manifest YAML structure.
|
||||
- **Forbidden heading-format clause** in Phase 5 — explicit denylist for
|
||||
`## Fase N`, `### Phase N`, `### Stage N`, and other narrative formats
|
||||
the executor cannot parse. Negative constraints land harder on 4.7.
|
||||
- **Phase 5.5 schema self-check** in `planning-orchestrator` — after
|
||||
writing the plan, grep-verify canonical `### Step N:` count matches
|
||||
`manifest:` count, and narrative heading count is zero. Rewrite plan
|
||||
if self-check fails, before handing to plan-critic.
|
||||
- **`--validate` mode** in `ultraexecute-local` — schema-only check that
|
||||
parses steps and manifests, reports `READY | FAIL` with specific
|
||||
error hints, and exits without security scan or execution. Intended
|
||||
as a fast sanity-check between `/ultraplan-local` and full execution:
|
||||
```bash
|
||||
/ultraplan-local "task"
|
||||
/ultraexecute-local --validate <plan>.md # READY or actionable FAIL
|
||||
/ultraexecute-local <plan>.md # full execution
|
||||
```
|
||||
|
||||
### Changed
|
||||
|
||||
- `planning-orchestrator` Phase 5 now embeds the canonical Step template
|
||||
inline (~60 lines of literal example) rather than referring to
|
||||
`templates/plan-template.md`. Template file remains authoritative for
|
||||
cross-referencing but is no longer load-bearing for plan generation.
|
||||
- `ultraexecute-local` Phase 2.3 added as a hard exit point for
|
||||
`--validate` mode; Phase 2.4 security scan explicitly skips this mode.
|
||||
|
||||
### Rationale
|
||||
|
||||
v1.7.0's self-verifying chain assumed the orchestrator reliably produces
|
||||
the v1.7 schema. That held on 4.6. v1.8.0 makes the assumption robust to
|
||||
4.7-style literal interpretation by moving from "describe the format" to
|
||||
"show the exact format and forbid alternatives", plus a self-check loop
|
||||
before human-visible output. Pairs with `--validate` as a user-facing
|
||||
verification step that catches any residual drift before execution side
|
||||
effects begin.
|
||||
|
||||
## [1.7.0] - 2026-04-12
|
||||
|
||||
### The self-verifying plan chain
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue