Plugin manifest + package.json + README badge bumped 3.1.0 → 3.2.0. Description updated from "Four-command" → "Five-command (brief, research, plan, execute, review)" to reflect /ultrareview-local addition. CHANGELOG entry summarises the ultrareview-local v1.0 work: new command, 4 new agents, Handover 6 contract, ~43 new tests, 5 lib modules, and the 3 v1.1 open questions (5-tier severity migration, real-LLM determinism measurement, SC2 end-to-end test). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
1041 lines
57 KiB
Markdown
1041 lines
57 KiB
Markdown
# Changelog
|
||
|
||
All notable changes to this project will be documented in this file.
|
||
|
||
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
|
||
|
||
## [3.2.0] - 2026-05-01
|
||
|
||
Adds `/ultrareview-local` as the fifth and final command in the ultra-suite —
|
||
independent post-hoc review of delivered code against the brief — and the
|
||
contracted **Handover 6 (review → plan)** feedback loop. Non-breaking: existing
|
||
brief/research/plan/execute pipelines are untouched.
|
||
|
||
### Added
|
||
|
||
- `/ultrareview-local` command — fifth and final command in ultra-suite (independent post-hoc review of delivered code against brief)
|
||
- `lib/validators/review-validator.mjs` — schema validator for new `type: ultrareview` artifact
|
||
- `lib/review/rule-catalogue.mjs` — 12 canonical rule keys (version-pinned)
|
||
- `lib/parsers/finding-id.mjs` — stable SHA1 finding-ID computation
|
||
- `lib/parsers/jaccard.mjs` — Jaccard similarity for determinism testing
|
||
- 4 new agents: review-orchestrator (opus inline ref), brief-conformance-reviewer, code-correctness-reviewer, review-coordinator (Judge Agent pattern)
|
||
- Handover 6 (review → plan) — contracted feedback loop documented in HANDOVER-CONTRACTS.md
|
||
- ~43 new tests (109 → ~152 baseline; rule-catalogue + finding-id + jaccard + review-validator + brief-validator extension + arg-parser extension + project-discovery extension + 4 doc-consistency pins + 3 review-determinism integration + 3 source-findings SC3(b) integration)
|
||
|
||
### Changed
|
||
|
||
- `brief-validator.mjs` accepts `type: ultrareview` in addition to `ultrabrief` (Handover 6 enabler)
|
||
- `/ultraplan-local` consumes `type: ultrareview` brief — extracts findings as plan goals + populates `source_findings` array in plan frontmatter
|
||
- `lib/parsers/project-discovery.mjs` discovers `review.md` and supports `'review'` phase
|
||
- `lib/parsers/arg-parser.mjs` adds `ultrareview` command to FLAG_SCHEMA
|
||
- `hooks/scripts/session-title.mjs` adds `/ultrareview-local` to COMMANDS map
|
||
- Severity vocabulary aligned with llm-security: `Critical | High | Medium | Low | Info` (rule-catalogue retains the 4-tier brief vocabulary; 5-tier migration is a v1.1 candidate)
|
||
|
||
### Tests
|
||
|
||
- 109 baseline → ~152 green (+43 new)
|
||
|
||
### Open Questions for v1.1
|
||
|
||
- Migrate 4-tier severity (BLOCKER/MAJOR/MINOR/SUGGESTION) → llm-security 5-tier (Critical/High/Medium/Low/Info) — research/02 advised but deferred to keep brief contract intact.
|
||
- Real-LLM determinism measurement (current SC4 test uses synthetic fixtures).
|
||
- SC2 end-to-end false-positive integration test (currently agent-prompt-level only).
|
||
|
||
## [3.1.0] - 2026-05-01
|
||
|
||
Quality program: pure quality lift, no scope creep. Built around the
|
||
fork-er onramp — anyone cloning this plugin should get value out of
|
||
the box. 109 zero-dep tests gate readiness.
|
||
|
||
### Added — testing & validation infrastructure (Spor 0+1)
|
||
|
||
- `package.json` with `node:test` runner and zero npm dependencies (`engines.node>=18`)
|
||
- `lib/util/{result,frontmatter}.mjs` — Result-shape helpers and a hand-rolled YAML subset parser supporting list-of-dicts (the form used by `must_contain` in real plans)
|
||
- `lib/parsers/{plan-schema,manifest-yaml,project-discovery,arg-parser,bash-normalize}.mjs` — primitive parsers
|
||
- `lib/validators/{brief,research,plan,progress}-validator.mjs` and `lib/validators/architecture-discovery.mjs` — wrappers over parsers, each with a CLI shim (`if (import.meta.url === ...)`) so they can be invoked from Bash via `node ${CLAUDE_PLUGIN_ROOT}/lib/validators/X.mjs`
|
||
- `tests/lib/*` + `tests/validators/*` — 109 tests
|
||
- `tests/lib/doc-consistency.test.mjs` — pins agent-table count, command-table coverage, plan_version invariant, and settings.json scope cleanliness; first-red surfaces drift between docs and code
|
||
|
||
### Changed — wiring (Spor 1)
|
||
|
||
| Site | Was | Is |
|
||
|------|-----|-----|
|
||
| `/ultrabrief-local` Phase 4g | implicit trust | `node lib/validators/brief-validator.mjs --json` |
|
||
| `/ultraplan-local` Phase 1 | inline frontmatter parse + `test -f` | brief-validator `--soft` + research-validator `--dir` + architecture-discovery |
|
||
| `agents/planning-orchestrator.md` Phase 5.5 | three `grep -cE` calls | single `node lib/validators/plan-validator.mjs --strict --json` |
|
||
| `/ultraexecute-local` Phase 2.3 (`--validate`) | inline regex | `plan-validator --strict` + `progress-validator` |
|
||
|
||
Validators default to `strict: false`. Only `--validate` mode and Phase 5.5 use `--strict`. Existing in-flight projects under `.claude/projects/` continue to work.
|
||
|
||
### Added — handover contracts (Spor 2)
|
||
|
||
- `docs/HANDOVER-CONTRACTS.md` (~310 lines) — single source of truth for the 5 pipeline handovers (brief→research, research→plan, architecture→plan EXTERNAL, plan→execute, progress.json resume). Per-handover sections: Producer / Consumer / Path conventions / Frontmatter schema / Body invariants / Validation strategy / Versioning / Failure modes. Architecture handover is explicitly an external contract — drift-WARN never drift-FAIL.
|
||
|
||
### Added — PreCompact resume integrity (Spor 2, P0 fix)
|
||
|
||
- `hooks/scripts/pre-compact-flush.mjs` — PreCompact-event hook (CC v2.1.105+) that reconciles `progress.json` with git history before context compaction. Atomic write (tmp + rename), monotonic only (current_step never decreases), fail-open (always exit 0). Closes the documented progress.json drift bug from `docs/ultraexecute-v2-observations-from-config-audit-v4.md` — `--resume` now works after long conversations.
|
||
- `hooks/hooks.json` registers the PreCompact entry.
|
||
|
||
### Added — examples (Spor 3)
|
||
|
||
- `examples/01-add-verbose-flag/` — calibrated end-to-end pipeline demo for a small realistic task (add `--verbose` to a CLI parser). Includes `brief.md`, `research/01-cli-parser-conventions.md`, `plan.md` (7 steps with full manifest YAML), and `progress.json`. All four artifacts pass their respective validators. Hand-calibrated, not LLM-generated, so the example stays reviewable.
|
||
- `examples/REGENERATED.md` per example documents calibration date and regeneration triggers.
|
||
- `examples/README.md` walks fork-ers through what to study first.
|
||
|
||
### Changed — plan-critic semantic rubric (Spor 3, P0 fix)
|
||
|
||
`agents/plan-critic.md` rule #7 split into two parts:
|
||
|
||
1. **Literal blockers** (exact-string): `TBD`, `TODO`, `FIXME`, `XXX` always fire.
|
||
2. **Semantic rubric** (instruction-shaped): 8 deferred-decision tests covering vague modifiers, imperatives without targets, forward references without expansion, volume/quality without spec, edge-cases delegated, production-readiness delegated, path mismatch, and over-stuffed steps.
|
||
|
||
Calibrated against 5-phrase corpus the v3.0 exact-string blacklist missed: "implement as needed", "wire it up", "make it production-ready", "add tests where appropriate", "handle edge cases" — all five now flagged with rule citations.
|
||
|
||
### Added — CC v2.1.x feature adoption (Spor 3)
|
||
|
||
- **F8** — `MCP_CONNECTION_NONBLOCKING=true` documented under "Headless multi-session tuning" in README (CC v2.1.89+) for parallel `claude -p` sessions
|
||
- **F9** — `hooks/scripts/session-title.mjs` (UserPromptSubmit, CC v2.1.94+) sets `ultra:<command>:<slug>` titles for ultra invocations
|
||
- **F3** — `hooks/scripts/post-bash-stats.mjs` (PostToolUse, CC v2.1.97+) appends `duration_ms` per Bash call to `${CLAUDE_PLUGIN_DATA}/ultraexecute-stats.jsonl`
|
||
- **F12** — `disableSkillShellExecution: true` recommended in README "Security hardening" + `SECURITY.md` (CC v2.1.91+) for fork-ers handling untrusted plans
|
||
|
||
**Deferred:** F2 (hook `if`-field scoping). The plan called for scoping pre-bash-executor / pre-write-executor to ultraexecute sessions to "reduce false-positives in brief/plan" — but brief/plan don't issue Bash commands that match the destructive denylist, so the rationale doesn't hold. Universal protection wins.
|
||
|
||
### Added — security & extension docs (Spor 3)
|
||
|
||
- `SECURITY.md` — Forgejo private-issue reporting, supported = current minor only, scope (4 hooks + denylist), hardening recommendations
|
||
- `docs/architect-bridge-test.md` — manual smoke checklist for the ultraplan ↔ ultra-cc-architect bridge (1 paragraph, intentionally not CI)
|
||
- `README.md` — new "Extending the plugin" section: how to add an agent, switch the planning model, disable external research, find the data contract, disable the architect bridge
|
||
|
||
### Removed — vestigial config (Spor 0)
|
||
|
||
- `settings.json` `exploration` and `agentTeam` blocks (read by zero code; verified via grep before deletion)
|
||
- `docs/ultra-suite-brief_2.md` archived as `_archive-` (was paste from another plugin's work)
|
||
|
||
### Tests
|
||
|
||
109 tests, all green. `npm test` is the fork-readiness gate.
|
||
|
||
### Hook table (after v3.1.0)
|
||
|
||
| Hook | Event | CC version | Purpose |
|
||
|------|-------|-----------|---------|
|
||
| pre-bash-executor.mjs | PreToolUse(Bash) | 2.0+ | Block destructive shell commands |
|
||
| pre-write-executor.mjs | PreToolUse(Write) | 2.0+ | Block writes to sensitive paths |
|
||
| session-title.mjs | UserPromptSubmit | 2.1.94+ | Set `ultra:<cmd>:<slug>` titles |
|
||
| post-bash-stats.mjs | PostToolUse(Bash) | 2.1.97+ | Log Bash `duration_ms` to JSONL |
|
||
| pre-compact-flush.mjs | PreCompact | 2.1.105+ | Reconcile progress.json with git history |
|
||
|
||
### Files changed
|
||
|
||
This release is plugin-internal — no breaking changes to artifact formats or CLI surface. Forkers should `npm test` after pulling to confirm readiness.
|
||
|
||
## [3.0.0] - 2026-04-30
|
||
|
||
### Architect extracted to its own plugin
|
||
|
||
The `/ultra-cc-architect-local` and `/ultra-skill-author-local` commands, all
|
||
seven of their agents, the `cc-architect-catalog` skill, the `ngram-overlap.mjs`
|
||
script, and the skill-factory test fixtures moved out of `ultraplan-local` and
|
||
into the new `ultra-cc-architect` plugin (v0.1.0).
|
||
|
||
### Why
|
||
|
||
`ultraplan-local` had drifted into containing two distinct domains:
|
||
|
||
1. A **universal planning pipeline** (brief → research → plan → execute) that
|
||
is technology-agnostic and works for any implementation task.
|
||
2. A **Claude-Code-specific architecture phase** that only makes sense when
|
||
building features for Claude Code itself.
|
||
|
||
Keeping them in one plugin caused three problems:
|
||
|
||
- Users who wanted only the planning pipeline had to clone an unfinished
|
||
CC-feature catalog and seven architect/skill-author agents they would
|
||
never invoke.
|
||
- The architect catalog (~11 seed skills) and the planning pipeline lived on
|
||
different release cadences. Architect work blocked pipeline development
|
||
and vice-versa.
|
||
- New users saw six commands when only four belonged to the core flow.
|
||
|
||
The architect was already marked `optional, v2.2` and was fully decoupled at
|
||
the code level — only one filesystem touchpoint remained: `/ultraplan-local`
|
||
auto-discovers `architecture/overview.md` if present, and gracefully handles
|
||
its absence. The split is therefore non-breaking for the planning flow.
|
||
|
||
### What moved to `ultra-cc-architect`
|
||
|
||
- **Commands:** `/ultra-cc-architect-local`, `/ultra-skill-author-local`
|
||
- **Agents:** `architect-orchestrator`, `feature-matcher`, `gap-identifier`,
|
||
`architecture-critic`, `skill-author-orchestrator`, `concept-extractor`,
|
||
`skill-drafter`, `ip-hygiene-checker`
|
||
- **Skills:** `skills/cc-architect-catalog/` (13 files)
|
||
- **Scripts:** `scripts/ngram-overlap.mjs`, `scripts/ngram-overlap.test.mjs`
|
||
- **Test fixtures:** `tests/fixtures/skill-factory/`,
|
||
`tests/fixtures/skill-drafter/`
|
||
|
||
All moves used `git mv`, so history follows the files into the new plugin.
|
||
|
||
### What stayed unchanged in ultraplan-local
|
||
|
||
- `/ultraplan-local` Phase 1 still auto-discovers `architecture/overview.md`.
|
||
The discovery is filesystem-based, not plugin-based — installing both
|
||
plugins gives you the full pipeline (brief → research → architect → plan
|
||
→ execute).
|
||
- `agents/planning-orchestrator.md` retains its architecture-note
|
||
cross-reference.
|
||
- All other commands (`/ultrabrief-local`, `/ultraresearch-local`,
|
||
`/ultraexecute-local`) are untouched.
|
||
|
||
### Migration
|
||
|
||
If you only used `/ultrabrief-local`, `/ultraresearch-local`,
|
||
`/ultraplan-local`, and `/ultraexecute-local`: no action needed. Update the
|
||
plugin and continue.
|
||
|
||
If you used `/ultra-cc-architect-local` or `/ultra-skill-author-local`:
|
||
install the new plugin alongside this one. In `~/.claude/settings.json`:
|
||
|
||
```json
|
||
{
|
||
"enabledPlugins": {
|
||
"ultraplan-local@ktg-plugin-marketplace": true,
|
||
"ultra-cc-architect@ktg-plugin-marketplace": true
|
||
}
|
||
}
|
||
```
|
||
|
||
Custom seed skills you added to `cc-architect-catalog/` follow with the
|
||
catalog. Use `git log --follow` if you need to track them in the new
|
||
location.
|
||
|
||
### plugin.json changes
|
||
|
||
- `version`: `2.4.0` → `3.0.0`
|
||
- `description`: now describes a four-command pipeline; CC-feature matching
|
||
is described as living in the separate `ultra-cc-architect` plugin
|
||
- `keywords`: removed `cc-architecture`
|
||
|
||
## [2.4.0] - 2026-04-19
|
||
|
||
### Breaking change — background mode removed
|
||
|
||
Default mode for `/ultraplan-local`, `/ultraresearch-local`,
|
||
`/ultra-cc-architect-local`, and auto-mode in `/ultrabrief-local` is now
|
||
foreground. The command blocks the session until the brief/plan is ready.
|
||
|
||
### Why
|
||
|
||
Background mode promised a swarm of specialized agents (docs-researcher,
|
||
community-researcher, architecture-mapper, plan-critic, feature-matcher,
|
||
gap-identifier, …) spawned by a background orchestrator. It did not work:
|
||
the Claude Code harness does not expose the Agent tool to sub-agents
|
||
(including ones run with `run_in_background: true`). The orchestrator
|
||
silently degraded to inline reasoning in a single Opus context, without
|
||
WebSearch, Tavily, WebFetch, or Gemini. Briefs were tagged "high confidence"
|
||
but were built on guesses from training data.
|
||
|
||
Source: github.com/anthropics/claude-code/issues/19077
|
||
|
||
Empirically confirmed in a plugin investigation on 2026-04-19: a probe
|
||
sub-agent reported its exposed tools as `Bash, Edit, Glob, Grep, Read,
|
||
Skill, ToolSearch, Write`. The Agent tool was not present, active or
|
||
deferred. Foreground execution from the main context spawns sub-agents
|
||
correctly (docs-researcher with Tavily intact).
|
||
|
||
### What changes
|
||
|
||
- Default execution: foreground for all four commands.
|
||
- `--fg` flag is preserved as a no-op alias (backward compatibility).
|
||
- Orchestrator agent files (`agents/research-orchestrator.md`,
|
||
`agents/planning-orchestrator.md`, `agents/architect-orchestrator.md`)
|
||
are redefined from "background executor" to "inline reference" — the
|
||
command markdown itself runs the phases in the main context where the
|
||
Agent tool is available.
|
||
- `ultrabrief-local` auto-mode now runs research sequentially inline
|
||
instead of parallel background. The ANTHROPIC_API_KEY billing check
|
||
is removed (irrelevant for foreground runs on subscription).
|
||
- `skills/cc-architect-catalog/background-agents-reference.md` has been
|
||
corrected: Shape A (orchestrator handoff) is now documented as an
|
||
anti-pattern, nested spawn from a sub-agent is NOT SUPPORTED.
|
||
|
||
### For headless/long-running runs
|
||
|
||
Use `claude -p` in a separate terminal window. Each headless session has
|
||
its own main context with the Agent tool, so the swarm works correctly.
|
||
|
||
### Migration
|
||
|
||
No code changes required for users. Scripts or documentation that assume
|
||
background execution must be updated — the commands now block until done.
|
||
|
||
## [2.3.2] - 2026-04-18
|
||
|
||
### Fixed — skill-drafter slug-collision hint
|
||
|
||
`skill-drafter` now checks for an existing file at
|
||
`{catalog_root}/<slug>.md` before writing its draft to `.drafts/`.
|
||
When a collision is detected, the agent prepends a warning block to
|
||
its confirmation output showing the overwrite risk and a suggested
|
||
qualified slug derived from the concept handle. The draft is still
|
||
written to `.drafts/<slug>.md` — the check is a hint, not a block.
|
||
|
||
**Why.** v2.3.0 dogfood surfaced the risk (logged as
|
||
`post_dogfood_findings[0]` in that run's `progress.json`): when the
|
||
drafter produced `.drafts/hooks-pattern.md` with an existing approved
|
||
`hooks-pattern.md` seed present at the catalog root, the pipeline
|
||
gave no signal that manual `mv` during promotion would silently
|
||
overwrite the seed. The v2.3.1 qualified-slug convention gave us the
|
||
mechanism to resolve collisions, but `skill-drafter` still didn't
|
||
surface them at the right moment — before promotion, not after.
|
||
|
||
**Changes.**
|
||
|
||
- `agents/skill-drafter.md` — new Step 2 "Check for slug collision at
|
||
the catalog root" between slug computation (Step 1) and reading the
|
||
source (Step 3). Subsequent workflow steps renumbered 3→7. New
|
||
"Suggesting a qualifier" guidance derives a kebab-case qualifier
|
||
from the `concept` field (or source basename as fallback). Output
|
||
format gains a `Collision:` field (`none | approved | pending |
|
||
auto-merged | soft`) and an optional warning block when the
|
||
collision is non-none. New Hard Rule "Slug-collision pre-write
|
||
check".
|
||
- `tests/fixtures/skill-drafter/slug-collision-expected.md` — new
|
||
reference fixture documenting the expected confirmation-output
|
||
shape across four scenarios (no collision, approved collision,
|
||
soft pending collision, collision with no good qualifier).
|
||
Skill-drafter is prompt-driven and not auto-tested; the fixture
|
||
anchors the shape for human verification and downstream parsers.
|
||
|
||
**Non-breaking.** No changes to `.drafts/` layout, frontmatter
|
||
contract, tool scope, or filename regex. Existing pipelines see an
|
||
extra field (`Collision:`) and an optional warning block — both
|
||
purely additive. No version-gated changes in
|
||
`skill-author-orchestrator` or `ip-hygiene-checker`.
|
||
|
||
## [2.3.1] - 2026-04-18
|
||
|
||
### Added — Qualified slug convention for cc-architect-catalog
|
||
|
||
Catalog files now follow `<cc_feature>[-<qualifier>]-<layer>.md`. The
|
||
unqualified slug (e.g., `hooks-pattern.md`) remains the canonical
|
||
baseline for a `(feature, layer)` pair. Qualified slugs (e.g.,
|
||
`hooks-observability-pattern.md`) cover specific sub-patterns without
|
||
displacing the baseline.
|
||
|
||
**Why.** v2.3.0 dogfood surfaced a design gap: the skill-factory
|
||
produced a draft `hooks-pattern.md` from a specialized source (progressive-
|
||
alert observability) that collided with the existing generic `hooks-pattern.md`
|
||
seed. Promoting would have replaced the general pattern with a narrow
|
||
one; discarding would have lost real catalog growth. Qualified slugs
|
||
resolve this by letting one feature host multiple named patterns at
|
||
different abstraction levels.
|
||
|
||
**Changes.**
|
||
|
||
- `skills/cc-architect-catalog/SKILL.md` — slug convention section added;
|
||
coverage table gains "qualified patterns" column; matcher logic
|
||
documented for N patterns per feature; modification rules cover
|
||
qualified-vs-canonical choice and slug-collision handling.
|
||
- `agents/feature-matcher.md` — catalog map is now
|
||
`cc_feature → {layer → [skills]}`; new "Selecting among multiple
|
||
patterns per feature" section (baseline by default, qualified when
|
||
justified, multiple when non-overlapping, never purely cosmetic);
|
||
`supporting_skill` accepts one-or-more skill names.
|
||
- `agents/gap-identifier.md` — adds `pattern_count[cc_feature]` signal
|
||
to the catalog coverage audit.
|
||
- `agents/architecture-critic.md` — adds supporting-skill verification:
|
||
every cited skill name must exist in the catalog; blocker severity.
|
||
- First qualified skill: `hooks-observability-pattern.md` (promoted from
|
||
`.drafts/`, sourced from `ai-psychosis/README.md`, ngram-overlap 0.01,
|
||
review_status approved).
|
||
|
||
**Non-breaking.** Existing unqualified slugs keep working. No changes to
|
||
`cc_feature` taxonomy. Hallucination gate unchanged (still validates
|
||
against `cc_feature` values, not slugs).
|
||
|
||
## [2.3.0] - 2026-04-18
|
||
|
||
### Added — Skill-factory Fase 1 MVP (`/ultra-skill-author-local`)
|
||
|
||
Manual one-skill-at-a-time generator for the `cc-architect-catalog`.
|
||
Channel 2 of the skill-factory strategy: a curated local source enters,
|
||
one draft skill exits in `skills/cc-architect-catalog/.drafts/`, with
|
||
n-gram containment scored against the source and stamped into the
|
||
draft frontmatter (or the draft is deleted when overlap is too high).
|
||
|
||
**Why now.** `/ultra-cc-architect-local` (v2.2.0) enforces a
|
||
hallucination gate that only permits feature proposals backed by the
|
||
catalog. With 10 seed skills covering 8 features × 2 layers, the
|
||
`feature-matcher` rarely finds a match and silently produces empty
|
||
proposals. Fase 1 unblocks catalog growth without spinning up
|
||
automation: one source, one draft, manual review, manual `mv` for
|
||
promotion.
|
||
|
||
**Pipeline.** Sequential, no retry, no parallelism:
|
||
|
||
```
|
||
/ultra-skill-author-local <source>
|
||
→ concept-extractor (sonnet, JSON output, gap-class C/D + cc_feature gate)
|
||
→ skill-drafter (sonnet, .drafts/<slug>.md with 9-field frontmatter)
|
||
→ ip-hygiene-checker (sonnet, runs scripts/ngram-overlap.mjs)
|
||
verdict accepted/needs-review → stamp ngram_overlap_score
|
||
verdict rejected → rm draft (no preservation)
|
||
```
|
||
|
||
**IP-hygiene utility.** Pure Node stdlib. Word-5-gram containment
|
||
similarity (asymmetric draft⊆source) plus longest-consecutive-shingle-
|
||
run secondary signal. Verdict bands: accepted (<0.15 AND <8),
|
||
needs-review (mid), rejected (≥0.35 OR ≥15). Short-text fallback to
|
||
n=4 when min(words) <500. CLI emits JSON.
|
||
|
||
**Calibration fixtures.** Three source/draft pairs in
|
||
`tests/fixtures/skill-factory/` pin the verdict bands against
|
||
representative prose: accepted (containment 0.014), needs-review
|
||
(0.211), rejected (0.676). Re-verify any threshold change against
|
||
these fixtures.
|
||
|
||
**New files:**
|
||
|
||
- `commands/ultra-skill-author-local.md`
|
||
- `agents/skill-author-orchestrator.md` (opus)
|
||
- `agents/concept-extractor.md` (sonnet)
|
||
- `agents/skill-drafter.md` (sonnet)
|
||
- `agents/ip-hygiene-checker.md` (sonnet)
|
||
- `scripts/ngram-overlap.mjs` + `scripts/ngram-overlap.test.mjs`
|
||
- `skills/cc-architect-catalog/.drafts/.gitkeep`
|
||
- `tests/fixtures/skill-factory/{source,draft}-{accepted,needs-review,rejected}.md`
|
||
- `tests/fixtures/skill-factory/README.md`
|
||
|
||
**Non-goals (explicit, Fase 1):**
|
||
|
||
- No automation, cron, or watcher
|
||
- No CC changelog diffing or auto-research
|
||
- No batch processing or review command
|
||
- No decision-layer skills (cross-feature comparison is Fase 2+)
|
||
- No URL or remote sources — local files only
|
||
- Manual `mv` from `.drafts/` to catalog root is the promotion mechanism
|
||
|
||
**New stats file:**
|
||
`${CLAUDE_PLUGIN_DATA}/ultra-skill-author-local-stats.jsonl`.
|
||
|
||
## [2.2.0] - 2026-04-18
|
||
|
||
### Added — `/ultra-cc-architect-local` optional pipeline step
|
||
|
||
New optional command that sits between `/ultraresearch-local` and
|
||
`/ultraplan-local`. Reads the task brief plus any research briefs, matches the
|
||
task against available Claude Code features (Hooks, Subagents, Skills, Output
|
||
Styles, MCP, Plan Mode, Worktrees, Background Agents), and produces an
|
||
**architecture note** with brief-anchored rationale and an explicit coverage-
|
||
gap section.
|
||
|
||
Pipeline position (5-steg):
|
||
|
||
```
|
||
/ultrabrief-local → /ultraresearch-local → /ultra-cc-architect-local (optional)
|
||
→ /ultraplan-local → /ultraexecute-local
|
||
```
|
||
|
||
The architecture note is designed as *priors* for planning, not mandates —
|
||
`/ultraplan-local` still runs its own exploration and may override proposals
|
||
with evidence.
|
||
|
||
**New files:**
|
||
|
||
- `commands/ultra-cc-architect-local.md` — command entry point (7 faser:
|
||
parse → background → read inputs → feature matching → synthesize → review
|
||
→ present).
|
||
- `agents/architect-orchestrator.md` (opus) — background orchestrator.
|
||
- `agents/feature-matcher.md` (sonnet) — matches CC features against brief +
|
||
research, with brief-anchored rationale per feature and a documented fallback
|
||
minimum list when the catalog is empty.
|
||
- `agents/gap-identifier.md` (sonnet) — emits issue-ready gap drafts (no
|
||
auto-posting; no auto-generation).
|
||
- `agents/architecture-critic.md` (sonnet) — adversarial review with a
|
||
hallucination gate (features outside catalog + fallback list → blocker).
|
||
- `skills/cc-architect-catalog/SKILL.md` — manifest + frontmatter contract.
|
||
- 10 seed skills in `skills/cc-architect-catalog/`:
|
||
`hooks-reference`, `hooks-pattern`, `subagents-reference`, `subagents-pattern`,
|
||
`skills-reference`, `output-styles-reference`, `mcp-reference`,
|
||
`plan-mode-reference`, `worktrees-reference`, `background-agents-reference`.
|
||
All handwritten (no third-party content copied, per brief §4.4).
|
||
|
||
**New flags:**
|
||
|
||
- `--project <dir>` (required) — reads `{dir}/brief.md` + `{dir}/research/*.md`,
|
||
writes to `{dir}/architecture/`.
|
||
- `--fg` — foreground execution.
|
||
- `--quick` — skip adversarial review.
|
||
- `--no-gaps` — do not write `gaps.md` (gap-section remains inside
|
||
`overview.md`).
|
||
|
||
**New stats file:** `${CLAUDE_PLUGIN_DATA}/ultra-cc-architect-local-stats.jsonl`.
|
||
|
||
### Changed — `/ultraplan-local` auto-discovers architecture notes
|
||
|
||
Brief §5 of the `ultra-cc-architect` design said "no changes to existing
|
||
commands". This release makes one permitted, documented exception:
|
||
`/ultraplan-local` now auto-discovers `{project_dir}/architecture/overview.md`
|
||
when running in project mode. The file is additive context — missing file
|
||
produces no error, and behavior when the file is absent is identical to
|
||
v2.1.0. Non-breaking.
|
||
|
||
**Minimal edits:**
|
||
|
||
- `commands/ultraplan-local.md` — Phase 1 `--project` branch sets
|
||
`has_architecture_note` / `architecture_note_path` when the file exists.
|
||
Phase 3 launch prompt passes `Architecture note: {path or "none"}` to the
|
||
orchestrator.
|
||
- `agents/planning-orchestrator.md` — Input section documents the new
|
||
optional field. Phase 4 synthesis cross-references proposed features with
|
||
exploration findings and carries the note's Coverage gaps into Risks and
|
||
Mitigations when relevant.
|
||
|
||
### Non-goals (explicit)
|
||
|
||
- **Skill-factory is not part of this release.** Cataloging, n-gram
|
||
computation, auto-generation from CC changelog, concept extraction from
|
||
reference repos — all deferred to a separate development process. Together
|
||
with v2.2.0's architect command, that process will eventually land as v3.0.
|
||
- No `/ultra-auto` chaining.
|
||
- No auto-creation of Forgejo/GitHub issues from gap drafts.
|
||
- No changes to `/ultrabrief-local`, `/ultraresearch-local`, or
|
||
`/ultraexecute-local`.
|
||
|
||
## [2.1.0] - 2026-04-18
|
||
|
||
### Changed — Dynamic, quality-gated interview in `/ultrabrief-local`
|
||
|
||
The Phase 3 interview is no longer a hardcoded Q1–Q8 list with a numeric
|
||
cap (3–4 questions in `--quick`, 5–8 in default). It is now a
|
||
**section-driven completeness loop**: the command maintains per-section
|
||
state (Intent, Goal, Success Criteria, Research Plan, and five optional
|
||
sections), picks the next question from the section with the weakest
|
||
signal, and keeps probing until all four required sections meet an
|
||
initial-signal gate. Quality drives the loop, not a counter.
|
||
|
||
Phase 4 adds a **draft → brief-reviewer → revise** loop. The brief is
|
||
drafted in memory, written to `brief.md.draft`, reviewed by the
|
||
`brief-reviewer` agent as a stop-gate, and only renamed to `brief.md`
|
||
after all five dimensions pass (`completeness/consistency/testability/
|
||
scope_clarity ≥ 4` and `research_plan == 5`). If the gate fails, a
|
||
targeted follow-up is generated from the weakest dimension's detail
|
||
field and the draft is re-reviewed. The loop is capped at 3 review
|
||
iterations to bound cost; exhaustion writes the brief with
|
||
`brief_quality: partial` and an explicit `## Brief Quality` section.
|
||
|
||
Force-stop path: when the user says "stop" during Phase 4, the current
|
||
review findings are surfaced with per-dimension scores before asking
|
||
whether to continue or accept a partial brief. No silent exits.
|
||
|
||
Not breaking. The `/ultrabrief-local [--quick] <task>` interface is
|
||
unchanged from the outside; only internals change. `--quick` now means
|
||
"start compact, escalate if gates fail" rather than "max 4 questions".
|
||
|
||
### Added
|
||
|
||
- **JSON output from `brief-reviewer`** — the agent now emits a final
|
||
fenced `json` block with per-dimension `score` (1–5) and `detail`
|
||
arrays (`gaps`, `issues`, `weak_criteria`, `unclear_sections`,
|
||
`invalid_topics`) alongside the existing prose report. The JSON block
|
||
is mandatory; empty arrays and `score: 5` are required when a
|
||
dimension passes cleanly. `planning-orchestrator` continues to use
|
||
the prose verdict unchanged.
|
||
- **`brief_quality` frontmatter field** on task briefs — `complete`
|
||
(default) when the Phase 4 gate passed, or `partial` when the
|
||
iteration cap was hit or the user force-stopped with known issues.
|
||
`planning-orchestrator` can inspect this to decide how heavily to
|
||
weight brief sections as assumptions.
|
||
- **`review_iterations` and `brief_quality` in ultrabrief-stats** —
|
||
recorded per run for telemetry.
|
||
|
||
### Changed
|
||
|
||
- Hard rule added: `/ultrabrief-local` never writes `brief.md` while the
|
||
review gate is pending. The draft lives in `brief.md.draft` until the
|
||
loop terminates.
|
||
- Hard rule added: no hard cap on Phase 3 questions; the brief-review
|
||
gate is the only loop bound (3-iteration cap) and is in Phase 4.
|
||
|
||
## [2.0.0] - 2026-04-18
|
||
|
||
### Breaking — Four-command pipeline with dedicated brief step
|
||
|
||
v2.0.0 introduces `/ultrabrief-local` as a first-class step in the pipeline.
|
||
The interview previously embedded inside `/ultraplan-local` has been extracted
|
||
into a dedicated command that produces a structured **task brief** — the
|
||
contract between user intent and planning. `/ultraplan-local` now requires
|
||
a brief as input and no longer conducts interviews.
|
||
|
||
All artifacts converge into one **project directory**:
|
||
`.claude/projects/{YYYY-MM-DD}-{slug}/` contains `brief.md`, `research/NN-*.md`,
|
||
`plan.md`, `sessions/`, and `progress.json`. `--project <dir>` works across
|
||
`/ultraresearch-local`, `/ultraplan-local`, and `/ultraexecute-local`.
|
||
|
||
See [MIGRATION.md](MIGRATION.md) for v1 → v2 upgrade guide.
|
||
|
||
### Breaking changes
|
||
|
||
- **`/ultraplan-local` requires `--brief <path>` or `--project <dir>`.** Running
|
||
without either exits with an error and a pointer to `/ultrabrief-local`.
|
||
- **`/ultraplan-local --spec <path>` is removed.** Convert specs to briefs by
|
||
adding `## Intent` and `## Research Plan` sections (see MIGRATION.md).
|
||
- **Interview inside `/ultraplan-local` is removed.** Planning no longer asks
|
||
questions — all intent must be captured in the brief upstream.
|
||
- **`spec-reviewer` agent renamed to `brief-reviewer`** with a new 5th dimension
|
||
(Research Plan validity). Old spec-reviewer file is deleted.
|
||
|
||
### Added
|
||
|
||
- **`/ultrabrief-local` command** — interactive interview (3-8 questions with
|
||
adaptive depth) that produces a task brief with explicit research plan.
|
||
Features: project directory creation, research topic identification with
|
||
copy-paste-ready `/ultraresearch-local` commands, optional auto-orchestration
|
||
(Claude runs research + plan in foreground), and stats tracking.
|
||
- **`templates/ultrabrief-template.md`** — canonical task brief format with
|
||
`## Intent`, `## Goal`, `## Non-Goals`, `## Constraints / Preferences / NFRs`,
|
||
`## Success Criteria`, `## Research Plan` (N topics with research_question,
|
||
scope, confidence, cost), and `## Open Questions / Assumptions`.
|
||
- **`brief-reviewer` agent** — renamed from spec-reviewer with a new
|
||
5th dimension: Research Plan validity (each topic has a valid
|
||
research question ending in `?`, has `Required for plan steps:` and
|
||
`Confidence needed:`, and research files exist when `auto_research: true`).
|
||
- **`--project <dir>` flag** on `/ultraresearch-local`, `/ultraplan-local`,
|
||
and `/ultraexecute-local`. Single directory holds the full pipeline's
|
||
artifacts. `/ultraresearch-local --project` auto-increments
|
||
`{dir}/research/NN-slug.md`.
|
||
- **Two-kinds-of-briefs terminology** documented across README and CLAUDE.md:
|
||
"task brief" (from `/ultrabrief-local`) vs "research brief" (from
|
||
`/ultraresearch-local`). Prefix used consistently in agent prompts and docs.
|
||
- **MIGRATION.md** — step-by-step guide for upgrading v1 projects to v2.
|
||
|
||
### Changed
|
||
|
||
- **`planning-orchestrator`** now accepts `Brief file:` input instead of
|
||
`Spec file:`. Intent→Context mapping: brief's `## Intent` + `## Goal` feed
|
||
the plan's Context section directly (structured, no inference needed).
|
||
Phase 1b now uses `brief-reviewer` instead of `spec-reviewer`. With
|
||
`Project dir:` in input, writes plan to `{dir}/plan.md`.
|
||
- **`/ultraresearch-local`** supports `--project <dir>` with auto-indexed
|
||
output path (`{dir}/research/NN-slug.md`, where NN is the next available
|
||
two-digit index).
|
||
- **`/ultraexecute-local`** supports `--project <dir>`. Reads `{dir}/plan.md`,
|
||
writes progress to `{dir}/progress.json`.
|
||
- **plugin.json** description rewritten to reflect four-command pipeline.
|
||
|
||
### Removed
|
||
|
||
- `/ultraplan-local --spec <path>` flag. Spec files are not a valid input for
|
||
v2.0+. Convert to brief via `/ultrabrief-local` or manual conversion (see
|
||
MIGRATION.md).
|
||
- Interview Phase in `/ultraplan-local` (was Phase 2). Use `/ultrabrief-local`
|
||
to conduct the interview upstream.
|
||
- `agents/spec-reviewer.md` file. Replaced by `agents/brief-reviewer.md`.
|
||
|
||
### Rationale
|
||
|
||
The v1.x interview inside `/ultraplan-local` conflated two concerns: capturing
|
||
intent and producing an executable plan. Briefs and plans have different
|
||
lifecycles — a brief should be reviewable and editable before any research or
|
||
planning starts, because every downstream decision traces back to it. Extracting
|
||
the brief into its own command makes the pipeline more honest: the brief is the
|
||
source of truth for *what we want*, research briefs are sources of truth for
|
||
*what we learned*, and the plan is the contract for *how we'll do it*. Separating
|
||
these makes each artifact reviewable on its own terms and enables deterministic
|
||
re-planning from the same brief when research reveals new constraints.
|
||
|
||
The explicit `## Research Plan` section in briefs closes a common gap: plans
|
||
were implicitly assuming knowledge that neither the user nor Claude had verified.
|
||
Research topics are now declared upfront, scoped, and traceable back to plan
|
||
decisions.
|
||
|
||
## [1.8.0] - 2026-04-17
|
||
|
||
### Opus 4.7 prompt literalism — closing the schema-drift gap
|
||
|
||
Opus 4.7 reads agent instructions more literally than 4.6 (per 4.7 system
|
||
card §6.3.1.1). The v1.7 planning-orchestrator described the Step+Manifest
|
||
schema via prose + procedural rules ("read the template"), which 4.6
|
||
inferred correctly but 4.7 sometimes rendered as narrative "Fase N" prose.
|
||
The result: plans that executed cleanly on 4.6 were rejected by
|
||
ultraexecute Phase 2 parsing on 4.7 — first observed during v6.2.0 planning
|
||
for `llm-security`. v1.8.0 closes the gap by replacing prose rules with a
|
||
literal copyable template, explicit forbidden-format clauses, and a
|
||
pre-handoff schema self-check.
|
||
|
||
### Added
|
||
|
||
- **Inline literal Step+Manifest template** in `planning-orchestrator`
|
||
Phase 5 — a complete, copyable example (JWT middleware step) replaces
|
||
"read the template" prose. Removes ambiguity about heading format,
|
||
field order, and manifest YAML structure.
|
||
- **Forbidden heading-format clause** in Phase 5 — explicit denylist for
|
||
`## Fase N`, `### Phase N`, `### Stage N`, and other narrative formats
|
||
the executor cannot parse. Negative constraints land harder on 4.7.
|
||
- **Phase 5.5 schema self-check** in `planning-orchestrator` — after
|
||
writing the plan, grep-verify canonical `### Step N:` count matches
|
||
`manifest:` count, and narrative heading count is zero. Rewrite plan
|
||
if self-check fails, before handing to plan-critic.
|
||
- **`--validate` mode** in `ultraexecute-local` — schema-only check that
|
||
parses steps and manifests, reports `READY | FAIL` with specific
|
||
error hints, and exits without security scan or execution. Intended
|
||
as a fast sanity-check between `/ultraplan-local` and full execution:
|
||
```bash
|
||
/ultraplan-local "task"
|
||
/ultraexecute-local --validate <plan>.md # READY or actionable FAIL
|
||
/ultraexecute-local <plan>.md # full execution
|
||
```
|
||
|
||
### Changed
|
||
|
||
- `planning-orchestrator` Phase 5 now embeds the canonical Step template
|
||
inline (~60 lines of literal example) rather than referring to
|
||
`templates/plan-template.md`. Template file remains authoritative for
|
||
cross-referencing but is no longer load-bearing for plan generation.
|
||
- `ultraexecute-local` Phase 2.3 added as a hard exit point for
|
||
`--validate` mode; Phase 2.4 security scan explicitly skips this mode.
|
||
|
||
### Rationale
|
||
|
||
v1.7.0's self-verifying chain assumed the orchestrator reliably produces
|
||
the v1.7 schema. That held on 4.6. v1.8.0 makes the assumption robust to
|
||
4.7-style literal interpretation by moving from "describe the format" to
|
||
"show the exact format and forbid alternatives", plus a self-check loop
|
||
before human-visible output. Pairs with `--validate` as a user-facing
|
||
verification step that catches any residual drift before execution side
|
||
effects begin.
|
||
|
||
## [1.7.0] - 2026-04-12
|
||
|
||
### The self-verifying plan chain
|
||
|
||
Wave 1 of a parallel 6-session build revealed three failure modes: (1) a
|
||
session reported `status=completed` after only 2/5 steps — last tool call
|
||
was an arbitrary file review, not a completion check; (2) 3/6 sessions
|
||
had push blocked inside the sub-agent bash sandbox *after* all work was
|
||
done; (3) plans and blueprints were prose, so the orchestrator had no
|
||
machine-readable way to verify completion. v1.7.0 closes all three by
|
||
making the plan itself an executable contract.
|
||
|
||
### Added
|
||
|
||
- **Per-step verification manifest** in plan format (`plan_version: 1.7`).
|
||
Every step now ends with a YAML `manifest:` block declaring
|
||
`expected_paths`, `min_file_count`, `commit_message_pattern`,
|
||
`bash_syntax_check`, `forbidden_paths`, `must_contain`. The manifest is
|
||
the objective completion predicate — the Verify command is necessary but
|
||
not sufficient.
|
||
- **Plan-critic dimension 10 — Manifest quality (hard gate).** Missing
|
||
or invalid manifest (unparseable regex, path contradiction, missing
|
||
block) is a `major` finding. v1.6 plans get a legacy-mode warning
|
||
instead of a block.
|
||
- **Session Manifest aggregate** in session specs — synthesized by
|
||
`session-decomposer` as the union of per-step manifests. Gives
|
||
`ultraexecute-local` a single YAML block per session to audit against.
|
||
- **Step 0: Sandbox pre-flight** — obligatory first step in every
|
||
generated session spec. Runs `git push --dry-run origin HEAD`; exit 77
|
||
= sandbox cannot push, session status becomes `blocked` (not `failed`),
|
||
no real work attempted. Escape hatch: `ULTRAEXECUTE_SKIP_PREFLIGHT=1`.
|
||
- **Launch script pre-flight** — `headless-launch-template.md` adds a
|
||
`git push --dry-run` check outside the sandbox, before any session
|
||
spawns, catching credential issues at the earliest possible point.
|
||
- **Phase 7.5 — Manifest audit (independent).** After all steps complete,
|
||
`ultraexecute-local` re-verifies expected paths, commit count, commit
|
||
message patterns, bash syntax, and forbidden-path untouched-ness from
|
||
git log and filesystem. Agent's own bookkeeping is ignored. Disagreement
|
||
with progress file → status overridden to `partial`.
|
||
- **Phase 7.6 — Recovery dispatch (bounded).** When Phase 7.5 detects
|
||
drift in multi-session parent context, synthesize a temp session spec
|
||
containing only missing steps and dispatch via existing
|
||
`claude -p "/ultraexecute-local --session N"`. `recovery_depth ≤ 2`
|
||
hard cap — third drift escalates to user.
|
||
- **Hard Rule 17: Manifest is the completion predicate.** A step may
|
||
not be marked passed if its manifest does not verify, regardless of
|
||
Verify's exit code.
|
||
- **Hard Rule 18: Last-activity rule.** Executor's final tool call
|
||
before Phase 8 must be a manifest check, never an arbitrary file
|
||
review. Prevents hallucinated completion.
|
||
|
||
### Changed
|
||
|
||
- **Plan template (`templates/plan-template.md`)** — adds
|
||
`plan_version: 1.7` metadata line, `Manifest:` field on every step,
|
||
"Manifest — objective completion predicate" section.
|
||
- **Plan-critic scoring** rebalanced: Headless readiness 0.15 → 0.10,
|
||
Manifest quality 0.05 added. Legacy v1.6 plans skip the Manifest
|
||
dimension and keep Headless readiness at 0.15.
|
||
- **Planning-orchestrator Phase 5** adds "Manifest generation rules
|
||
(REQUIRED for every step)" with mechanical derivation from `Files:`
|
||
and Checkpoint. Validates regex compilation and path existence before
|
||
handoff to plan-critic.
|
||
- **Session-decomposer** parses plan manifests and propagates them
|
||
verbatim into session specs. For v1.7+ plans with missing manifests:
|
||
abort with pointer to failing step. For legacy v1.6 plans: synthesize
|
||
minimal manifests and flag `legacy_synthesis: true`.
|
||
- **ultraexecute-local Phase 2** parses manifest YAML. Ugyldig YAML =
|
||
abort with pointer to step. v<1.7 plans: synthesize + log
|
||
`legacy_plan: true`.
|
||
- **ultraexecute-local Phase 6** — sub-step D renamed to D1 "Command
|
||
verification"; new D2 "Manifest verification" runs after D1 with 5
|
||
checks. F "Checkpoint" adds `checkpoint_drift` logging when HEAD
|
||
message doesn't match `commit_message_pattern` (non-fatal).
|
||
- **Phase 8 report** — table gets Manifest column; JSON summary adds
|
||
`plan_version`, `manifest_audit`, `drift_details`, `recovery_dispatched`,
|
||
`recovery_depth`, `legacy_plan`. Result vocabulary strict:
|
||
`completed | partial | blocked | failed | stopped`.
|
||
- **Division of labor clarified** in README — `/ultraresearch-local`
|
||
gathers context (no decisions), `/ultraplan-local` transforms intent
|
||
into an executable contract (manifests, plan-critic gate),
|
||
`/ultraexecute-local` executes the contract disciplined (does NOT
|
||
compensate for weak plans — escalates).
|
||
|
||
## [1.6.0] - 2026-04-08
|
||
|
||
### Added
|
||
|
||
- **`/ultraresearch-local` command** — deep research combining local codebase analysis
|
||
with external knowledge. Produces structured research briefs with triangulation,
|
||
confidence ratings, and source quality assessment. Supports modes: default (background),
|
||
`--quick` (inline), `--local` (codebase only), `--external` (web only), `--fg` (foreground).
|
||
- **6 new agents** for the research pipeline:
|
||
- `research-orchestrator` (opus) — runs full research pipeline as background task
|
||
- `docs-researcher` (sonnet) — official documentation via Tavily, WebSearch, Microsoft Learn
|
||
- `community-researcher` (sonnet) — real-world experience from issues, blogs, discussions
|
||
- `security-researcher` (sonnet) — CVEs, audit history, supply chain risks
|
||
- `contrarian-researcher` (sonnet) — counter-evidence and overlooked alternatives
|
||
- `gemini-bridge` (sonnet) — independent second opinion via Gemini Deep Research MCP
|
||
- **Research brief template** (`templates/research-brief-template.md`) — structured format
|
||
with dimensions, confidence ratings, triangulation, and source quality assessment.
|
||
- **`--research` flag for `/ultraplan-local`** — accepts up to 3 research brief paths.
|
||
Enriches the interview (focuses on decisions, not facts) and injects brief context into
|
||
exploration agents. Research-scout skips already-covered technologies.
|
||
- **Research-aware planning orchestrator** — `planning-orchestrator.md` now accepts research
|
||
briefs, injects summaries into sub-agent prompts, and cross-references brief findings
|
||
during synthesis.
|
||
- **Research settings** in `settings.json` — configurable Gemini bridge (enabled/timeout),
|
||
interview depth, dimension limits, and stats tracking.
|
||
|
||
### Changed
|
||
|
||
- Plugin description and keywords updated to reflect research capabilities.
|
||
- CLAUDE.md expanded with ultraresearch command, modes, agents, architecture, and state.
|
||
|
||
## [1.5.0] - 2026-04-07
|
||
|
||
### Fixed
|
||
|
||
- **CRITICAL: Parallel session data loss** — Phase 2.6 ran parallel `claude -p` sessions
|
||
in the same working directory, causing git race conditions and repository corruption.
|
||
Each parallel session now runs in its own git worktree with isolated branch, index,
|
||
and working files. Branches are merged back sequentially after each wave completes.
|
||
|
||
### Added
|
||
|
||
- **Phase 2.55 (Pre-flight safety checks)** — validates clean working tree, committed
|
||
plan file, no scope fence overlaps between parallel sessions, and no stale worktrees
|
||
before launching parallel execution.
|
||
- **Git worktree isolation** for all parallel sessions — one branch per session
|
||
(`ultraplan/{slug}/session-{N}`), merged with `--no-ff` after wave completion.
|
||
- **Merge conflict detection** — if merging a session branch produces conflicts, the merge
|
||
is aborted and conflicting files are reported. No silent data loss.
|
||
- **Unconditional worktree cleanup** — worktrees and session branches are always removed,
|
||
even on failure. Manual cleanup commands are reported if automated cleanup fails.
|
||
- **Hard rules 11-13** — worktree isolation mandatory, cleanup unconditional, merge
|
||
sequentially with conflict abort.
|
||
- **Session-scoped progress file naming** — `--session N` uses
|
||
`.ultraexecute-progress-{slug}-session-{N}.json` to prevent merge conflicts.
|
||
|
||
### Changed
|
||
|
||
- Headless launch template uses git worktrees with `cleanup_worktrees` trap on EXIT,
|
||
clean-tree pre-flight check, and sequential merge after each wave.
|
||
- Phase 2.6 rewritten with 5-step worktree lifecycle: create → launch → wait → merge → cleanup.
|
||
|
||
## [1.4.0] - 2026-04-06
|
||
|
||
### Renamed
|
||
|
||
- **`/ultraexecute` → `/ultraexecute-local`** — renamed for namespace consistency with `/ultraplan-local` and future-proofing against potential Anthropic naming. File: `commands/ultraexecute.md` → `commands/ultraexecute-local.md`. Note: `ultraexecute_summary` JSON key and `ultraexecute-stats.jsonl` filename are unchanged for backward compatibility.
|
||
|
||
### Added
|
||
|
||
- **`convention-scanner` agent** (sonnet) — dedicated agent for discovering coding conventions: naming, directory layout, import style, error handling, test patterns, git commit style, documentation patterns. Replaces inline Explore agent prompt for medium+ codebases.
|
||
- **Success Criteria section** in spec template — falsifiable "definition of done" conditions that the spec-reviewer validates and ultraexecute-local uses for verification.
|
||
- **Dry-run multi-session preview** — `--dry-run` now shows session groupings, wave structure, billing status, and `claude -p` commands when plan has an Execution Strategy.
|
||
- **External verification rule** in headless launch template — wave verification must run commands independently, never parse session logs as proof.
|
||
- **Billing preamble** in headless launch template — `unset ANTHROPIC_API_KEY` prevents accidental API billing.
|
||
- **Phase mapping comment** in planning-orchestrator — documents how orchestrator phases 1-7 map to command phases 4-10.
|
||
|
||
### Fixed
|
||
|
||
- **`git add -A` in escalation** — replaced with targeted staging of only files from completed steps. Prevents staging secrets, binaries, or unrelated work.
|
||
- **False `background: true` claim** — command documentation incorrectly stated the orchestrator has `background: true` in its frontmatter. Corrected to explain `run_in_background` on the Agent tool.
|
||
|
||
### Changed
|
||
|
||
- Execution Strategy reconciliation in session-decomposer — respects existing `## Execution Strategy` as input instead of re-analyzing from scratch. Warns on file-overlap conflicts.
|
||
- Headless launch template uses `--dangerously-skip-permissions` instead of `--allowedTools` for more robust headless execution.
|
||
- Session-decomposer updated with `--dangerously-skip-permissions` and `unset ANTHROPIC_API_KEY` for generated scripts.
|
||
- Convention Scanner references in command and orchestrator updated to use dedicated plugin agent.
|
||
- ROADMAP.md translated from Norwegian to English.
|
||
- plugin.json: added homepage, repository, license, keywords. Version bumped to 1.4.0.
|
||
- README badge updated to v1.4.0.
|
||
|
||
## [1.3.0] - 2026-04-06
|
||
|
||
### Added
|
||
|
||
- **Session-aware parallel execution** — `/ultraexecute` auto-detects `## Execution Strategy` in plans and orchestrates multi-session parallel execution via `claude -p`. No manual `bash launch.sh` required.
|
||
- **`--fg` flag** — force foreground sequential execution, ignoring Execution Strategy
|
||
- **`--session N` flag** — execute only session N from the plan's Execution Strategy (used by child processes)
|
||
- **Phase 2.5 (Execution strategy decision)** — determines single-session vs multi-session mode
|
||
- **Phase 2.6 (Multi-session orchestration)** — launches parallel `claude -p` sessions per wave, waits for completion, aggregates results
|
||
- **Execution Strategy in plan template** — new `## Execution Strategy` section with sessions, waves, scope fences, and execution order. Generated by planning-orchestrator for plans with > 5 steps.
|
||
- **Execution Strategy generation in planning-orchestrator** — Phase 5 analyzes step file-overlap to build dependency graph, groups connected components into sessions of 3–5 steps, and organizes sessions into parallel waves.
|
||
|
||
### Changed
|
||
|
||
- planning-orchestrator Phase 5 extended with Execution Strategy generation logic
|
||
- ultraplan-local Phase 8 now lists Execution Strategy as 10th required plan section
|
||
- Plan template includes `## Execution Strategy` section template with grouping rules
|
||
- CLAUDE.md updated with new ultraexecute modes and architecture
|
||
- plugin.json version bumped to 1.3.0
|
||
|
||
## [1.2.0] - 2026-04-06
|
||
|
||
### Added
|
||
|
||
- **`/ultraexecute` command** — disciplined plan executor with 9-phase workflow. Reads an ultraplan or session spec, executes steps sequentially with strict failure recovery, tracks progress for resume, and reports results in machine-parseable JSON.
|
||
- 4 modes: default (execute), `--resume` (continue from checkpoint), `--dry-run` (validate without executing), `--step N` (single step)
|
||
- Per-step protocol: implement → verify → on-failure handling → checkpoint
|
||
- Failure recovery from plan's On failure clauses (revert/retry/skip/escalate)
|
||
- 3-attempt retry cap per step (initial + 2 retries)
|
||
- Progress file (`.ultraexecute-progress-{slug}.json`) for crash recovery and resume
|
||
- Entry/exit condition checking for session specs
|
||
- Scope fence enforcement for session specs (never-touch file protection)
|
||
- JSON summary block in output for headless log parsing
|
||
- Stats tracking to `ultraexecute-stats.jsonl`
|
||
|
||
### Changed
|
||
|
||
- CLAUDE.md restructured with two commands table (plan + execute)
|
||
- plugin.json version bumped to 1.2.0
|
||
|
||
## [1.1.0] - 2026-04-06
|
||
|
||
### Added
|
||
|
||
- **`--decompose` mode** — splits an existing plan into self-contained headless sessions. Analyzes step dependencies, groups steps into sessions of 3–5 steps each, identifies parallel execution waves, and generates session specs + dependency graph + launch script.
|
||
- **`--export headless` format** — shortcut for `--decompose`. Produces the same session decomposition output.
|
||
- **session-decomposer agent** (sonnet) — dedicated agent for plan decomposition. Parses step dependencies, builds dependency graph, groups steps into sessions, generates session specs with scope fences and failure handling.
|
||
- **Session spec template** (`templates/session-spec-template.md`) — defines the format for individual session specs: context, scope fence, steps, entry/exit conditions, failure handling, handoff state.
|
||
- **Headless launch template** (`templates/headless-launch-template.md`) — template for generating bash launch scripts that execute sessions in parallel waves using `claude -p`.
|
||
- **Failure recovery per step** — plan template now includes `On failure:` (revert/retry/skip/escalate) and `Checkpoint:` (git commit) fields for every implementation step.
|
||
- **Headless readiness dimension** in plan-critic — new 9th review dimension checking for On failure clauses, Checkpoint fields, and circuit breakers. Weighted at 0.15 in the quality score.
|
||
|
||
### Changed
|
||
|
||
- Plan-critic scoring rebalanced: 6 dimensions (was 5), weights adjusted to accommodate headless readiness
|
||
- Plan template step format extended with On failure and Checkpoint fields
|
||
- Planning-orchestrator Phase 5 updated with failure recovery generation requirements
|
||
- CLAUDE.md updated with new agent, modes, and state paths
|
||
|
||
## [1.0.0] - 2026-04-06
|
||
|
||
### Added
|
||
|
||
- **`--quick` mode** — skips exploration agent swarm. Runs interview → lightweight Glob/Grep scan → planning → adversarial review. For when the developer knows the codebase and needs structure, not cartography.
|
||
- **`--export` mode** — generates shareable output from an existing plan file. Three formats: `pr` (PR description), `issue` (issue comment), `markdown` (clean plan without internal metadata).
|
||
- **task-finder three-tier categorization** — findings categorized as Must-change (must be modified), Must-respect (contract that must not break), or Reference (context/reuse). Replaces flat file list.
|
||
- **Adaptive interview depth** — interview adapts to answer quality. Detailed answers trigger fewer, more targeted questions. Short/uncertain answers trigger simpler questions with offered alternatives.
|
||
- **Complete `plugin.json` metadata** — author, homepage, repository, license, keywords added.
|
||
- **README badges** — version, license, and platform badges.
|
||
- **Known limitations section in README** — IaC projects (Terraform, Helm, Pulumi, CDK) get reduced value from exploration agents.
|
||
- **Forgejo issue templates** — bug report and feature request YAML templates.
|
||
- **CONTRIBUTING.md** — rewritten for honest solo-project model.
|
||
|
||
### Changed
|
||
|
||
- plugin.json version bumped to 1.0.0
|
||
- Command header updated to Ultraplan Local v1.0
|
||
- Orchestrator accepts `mode: quick` in prompt for lightweight scanning path
|
||
|
||
## [0.4.0] - 2026-04-06
|
||
|
||
### Added
|
||
|
||
- **3 new agents** for information-complete planning:
|
||
- `task-finder` — dedicated agent for finding task-relevant files, functions, types, and reuse candidates. Replaces inline Explore agent.
|
||
- `git-historian` — analyzes git log, blame, active branches, code ownership, and hot files for planning context.
|
||
- `spec-reviewer` — reviews spec quality (completeness, consistency, testability, scope clarity) before exploration begins. New Phase 1b/4b.
|
||
- **Plan scoring** — plan-critic produces a quantitative quality score (0–100) across 5 weighted dimensions with letter grades (A–D) and verdicts (APPROVE/REVISE/REPLAN).
|
||
- **No-placeholder rule** — plan-critic flags TBD, TODO, vague instructions, and underspecified steps as unconditional blockers. 3+ blockers = REPLAN regardless of score.
|
||
- **`[ASSUMPTION]` marking** — planning-orchestrator marks all unverifiable claims and warns when >3 assumptions exist.
|
||
|
||
### Changed
|
||
|
||
- **All agents run for all codebase sizes.** Small codebases get the same 6 core agents as large ones. Agent turns scale down for small codebases instead of dropping agents entirely.
|
||
- Phase 4b (spec review) added before exploration in both command and orchestrator.
|
||
- Orchestrator Phase 2 agent table expanded: 6 always + 1 conditional + 1 medium-only.
|
||
- Plan-critic review checklist expanded with no-placeholder checks (section 7) and scoring output.
|
||
- Orchestrator rules updated with assumption-marking and no-placeholder requirements.
|
||
|
||
## [0.3.0] - 2026-04-05
|
||
|
||
### Added
|
||
|
||
- **planning-orchestrator agent** — dedicated background agent (`background: true`) that handles Phases 4–10 autonomously. Replaces generic background agent spawning with a purpose-built orchestrator running on Opus with `maxTurns: 50`.
|
||
- **`effort` and `maxTurns` on all agents** — fine-grained cost and depth control:
|
||
- Exploration agents: `effort: medium`, `maxTurns: 15–20`
|
||
- Review agents (plan-critic, scope-guardian): `effort: high`, `maxTurns: 10`
|
||
- Research-scout: `effort: medium`, `maxTurns: 10`
|
||
- **Plugin `settings.json`** — default configuration for mode, research, agent counts, interview limits, and team settings. Users can override in their own settings.
|
||
- **Worktree isolation for Agent Teams** — team members use `isolation: "worktree"` to prevent file conflicts during parallel implementation
|
||
- **Session tracking** (Phase 12) — writes JSONL records to `${CLAUDE_PLUGIN_DATA}/ultraplan-stats.jsonl` with task metadata, agent counts, review verdicts, and outcomes
|
||
|
||
### Changed
|
||
|
||
- Phase 3 now launches the `planning-orchestrator` agent instead of a generic background agent
|
||
- Agent Team implementation uses worktree isolation by default
|
||
|
||
## [0.2.0] - 2026-04-05
|
||
|
||
### Added
|
||
|
||
- **Interview phase** — iterative requirements gathering with AskUserQuestion before exploration. Produces a spec file that feeds into planning.
|
||
- **7 specialized agents** in `agents/` directory:
|
||
- `architecture-mapper` — deep architecture analysis, anti-patterns, smell detection
|
||
- `dependency-tracer` — import-chain following, data-flow analysis, side-effect catching
|
||
- `test-strategist` — test strategy design based on existing patterns
|
||
- `risk-assessor` — threat modeling, edge cases, failure modes
|
||
- `plan-critic` — dedicated adversarial reviewer with hardcoded critical perspective
|
||
- `scope-guardian` — scope creep and scope gap detection
|
||
- `research-scout` — external research via WebSearch/Tavily for unfamiliar technologies
|
||
- **External research capability** — research-scout agent searches documentation, known issues, and best practices when the task involves external/unfamiliar technology
|
||
- **Background mode** — default mode runs interview in foreground, then plans in background. User is notified when done.
|
||
- **Spec-driven mode** (`--spec`) — skip interview, provide a pre-written spec file, plan entirely in background
|
||
- **Foreground mode** (`--fg`) — all phases in foreground, blocks session (v0.1.0 behavior)
|
||
- **Agent Team support** — when plan has 3+ independent steps, offers parallel implementation via Agent Teams
|
||
- **Spec template** in `templates/spec-template.md`
|
||
- **Research Sources section** in plan template for citing external research
|
||
- **Dual adversarial review** — plan-critic and scope-guardian run in parallel
|
||
|
||
### Changed
|
||
|
||
- Exploration agents replaced with named specialized agents from `agents/` directory
|
||
- Agent count scales with codebase: 3 (small), 5 (medium), 7 (large)
|
||
- Plan template extended with Research Sources and external tech fields
|
||
- Handoff phase supports "execute with team" option
|
||
- Command workflow expanded from 9 to 11 phases
|
||
|
||
## [0.1.0] - 2026-04-05
|
||
|
||
### Added
|
||
|
||
- Initial release
|
||
- `/ultraplan` slash command with 6-phase workflow
|
||
- Parallel Sonnet exploration (3 agents: architecture, task-relevant, conventions)
|
||
- Opus-driven plan generation from structured template
|
||
- Plan refinement loop with execute/save handoff
|
||
- Plan template with context, analysis, steps, alternatives, risks, verification
|
||
- Cross-platform support (Mac, Linux, Windows) — pure markdown, no scripts
|