ktg-plugin-marketplace/plugins/ultraplan-local/CHANGELOG.md
Kjell Tore Guttormsen 1634197853 feat(ultraplan-local): v2.1.0 — dynamic quality-gated interview
Replace hardcoded Q1-Q8 in /ultrabrief-local with a section-driven
completeness loop (Phase 3) and a draft/review/revise loop with
brief-reviewer as stop-gate (Phase 4). Quality drives the interview,
not a question counter.

brief-reviewer now emits a machine-readable JSON block with per-dimension
scores (1-5) and detail arrays alongside the existing prose report;
planning-orchestrator continues to consume the prose verdict unchanged.

Phase 4 gate: all dimensions >= 4 AND research_plan = 5. On fail, a
targeted follow-up is generated from the weakest dimension's detail
field and the draft is re-reviewed. Max 3 review iterations bound cost;
exhaustion writes brief.md with brief_quality: partial and an explicit
Brief Quality section. Force-stop surfaces per-dimension findings before
the user chooses continue or partial.

Not breaking. /ultrabrief-local [--quick] <task> interface unchanged.
--quick now means compact start with escalation, not a max-N cap.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-18 09:43:43 +02:00


Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog.

[2.1.0] - 2026-04-18

Changed — Dynamic, quality-gated interview in /ultrabrief-local

The Phase 3 interview is no longer a hardcoded Q1-Q8 list with a numeric cap (3-4 questions in --quick, 5-8 in default). It is now a section-driven completeness loop: the command maintains per-section state (Intent, Goal, Success Criteria, Research Plan, and five optional sections), picks the next question from the section with the weakest signal, and keeps probing until all four required sections meet an initial-signal gate. Quality drives the loop, not a counter.
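
A minimal Python sketch of the section-driven loop described above. The 0.0-1.0 signal scale, the `INITIAL_SIGNAL_GATE` threshold, and the `ask` callback are hypothetical; this changelog does not say how signal is measured.

```python
from typing import Callable, Dict

REQUIRED = ["Intent", "Goal", "Success Criteria", "Research Plan"]
INITIAL_SIGNAL_GATE = 0.6  # hypothetical threshold, not taken from the plugin


def weakest_section(signal: Dict[str, float]) -> str:
    """Return the required section with the lowest signal (ties: list order)."""
    return min(REQUIRED, key=lambda name: signal.get(name, 0.0))


def run_interview(signal: Dict[str, float], ask: Callable[[str], float]) -> int:
    """Probe the weakest required section until all four meet the gate.

    `ask(section)` poses one question and returns that section's new signal.
    Returns the number of questions asked; there is deliberately no counter cap.
    """
    asked = 0
    while any(signal.get(name, 0.0) < INITIAL_SIGNAL_GATE for name in REQUIRED):
        section = weakest_section(signal)
        signal[section] = ask(section)
        asked += 1
    return asked
```

Because the loop has no counter, it only terminates when answers raise the signal, so a real command would also need a user-initiated stop path.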

Phase 4 adds a draft → brief-reviewer → revise loop. The brief is drafted in memory, written to brief.md.draft, reviewed by the brief-reviewer agent as a stop-gate, and only renamed to brief.md after all five dimensions pass (completeness/consistency/testability/scope_clarity ≥ 4 and research_plan == 5). If the gate fails, a targeted follow-up is generated from the weakest dimension's detail field and the draft is re-reviewed. The loop is capped at 3 review iterations to bound cost; exhaustion writes the brief with brief_quality: partial and an explicit ## Brief Quality section.
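
The gate and its iteration cap can be sketched in Python as follows. The `review` and `revise` callbacks are hypothetical stand-ins for the brief-reviewer call and the targeted follow-up. Treating "weakest" as the largest shortfall against each dimension's own threshold is an assumption, not something the changelog specifies.

```python
from typing import Callable, Dict, Tuple

MAX_REVIEW_ITERATIONS = 3


def required_score(dimension: str) -> int:
    """research_plan must score 5; every other dimension must score at least 4."""
    return 5 if dimension == "research_plan" else 4


def gate_passes(scores: Dict[str, int]) -> bool:
    return "research_plan" in scores and all(
        score >= required_score(dim) for dim, score in scores.items()
    )


def weakest_dimension(scores: Dict[str, int]) -> str:
    """Dimension furthest below its own threshold (assumed meaning of 'weakest')."""
    return min(scores, key=lambda dim: scores[dim] - required_score(dim))


def review_loop(
    review: Callable[[], Dict[str, int]],
    revise: Callable[[str], None],
) -> Tuple[str, int]:
    """Return (brief_quality, review_iterations)."""
    for iteration in range(1, MAX_REVIEW_ITERATIONS + 1):
        scores = review()
        if gate_passes(scores):
            return "complete", iteration
        if iteration < MAX_REVIEW_ITERATIONS:
            revise(weakest_dimension(scores))  # targeted follow-up, then re-review
    return "partial", MAX_REVIEW_ITERATIONS
```

The returned pair mirrors the brief_quality and review_iterations values recorded per run.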

Force-stop path: when the user says "stop" during Phase 4, the current review findings are surfaced with per-dimension scores before asking whether to continue or accept a partial brief. No silent exits.

Not breaking. The /ultrabrief-local [--quick] <task> interface is unchanged from the outside; only internals change. --quick now means "start compact, escalate if gates fail" rather than "max 4 questions".

Added

  • JSON output from brief-reviewer — the agent now emits a final fenced json block with per-dimension score (1-5) and detail arrays (gaps, issues, weak_criteria, unclear_sections, invalid_topics) alongside the existing prose report. The JSON block is mandatory; empty arrays and score: 5 are required when a dimension passes cleanly. planning-orchestrator continues to use the prose verdict unchanged.
  • brief_quality frontmatter field on task briefs — complete (default) when the Phase 4 gate passed, or partial when the iteration cap was hit or the user force-stopped with known issues. planning-orchestrator can inspect this to decide how heavily to weight brief sections as assumptions.
  • review_iterations and brief_quality in ultrabrief-stats — recorded per run for telemetry.
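
A consumer of the reviewer output might extract and validate the final JSON block roughly as below. The JSON shape (one object per dimension holding `score` plus one detail array) and the pairing of dimensions to array names are assumptions inferred from the order in which this changelog lists them.

```python
import json
import re
from typing import Any, Dict

FENCE = "`" * 3  # built at runtime so this example can itself live inside a fence
JSON_BLOCK = re.compile(FENCE + r"json[ \t]*\n(.*?)\n" + FENCE, re.DOTALL)

# Assumed pairing of dimension -> detail array name.
DETAIL_FIELD = {
    "completeness": "gaps",
    "consistency": "issues",
    "testability": "weak_criteria",
    "scope_clarity": "unclear_sections",
    "research_plan": "invalid_topics",
}


def parse_review(report: str) -> Dict[str, Dict[str, Any]]:
    """Parse the last fenced json block in a reviewer report and sanity-check it."""
    blocks = JSON_BLOCK.findall(report)
    if not blocks:
        raise ValueError("no fenced json block found; the block is mandatory")
    data = json.loads(blocks[-1])
    for dimension, field in DETAIL_FIELD.items():
        entry = data.get(dimension)
        if not isinstance(entry, dict):
            raise ValueError(f"missing dimension: {dimension}")
        score = entry.get("score")
        if not isinstance(score, int) or not 1 <= score <= 5:
            raise ValueError(f"{dimension}: score must be an integer from 1 to 5")
        if not isinstance(entry.get(field), list):
            raise ValueError(f"{dimension}: {field} must be an array")
    return data
```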

Changed

  • Hard rule added: /ultrabrief-local never writes brief.md while the review gate is pending. The draft lives in brief.md.draft until the loop terminates.
  • Hard rule added: Phase 3 has no hard cap on questions. The only loop bound is the 3-iteration cap on the brief-review gate in Phase 4.

[2.0.0] - 2026-04-18

Breaking — Four-command pipeline with dedicated brief step

v2.0.0 introduces /ultrabrief-local as a first-class step in the pipeline. The interview previously embedded inside /ultraplan-local has been extracted into a dedicated command that produces a structured task brief — the contract between user intent and planning. /ultraplan-local now requires a brief as input and no longer conducts interviews.

All artifacts converge into one project directory: .claude/projects/{YYYY-MM-DD}-{slug}/ contains brief.md, research/NN-*.md, plan.md, sessions/, and progress.json. --project <dir> works across /ultraresearch-local, /ultraplan-local, and /ultraexecute-local.

See MIGRATION.md for v1 → v2 upgrade guide.

Breaking changes

  • /ultraplan-local requires --brief <path> or --project <dir>. Running without either exits with an error and a pointer to /ultrabrief-local.
  • /ultraplan-local --spec <path> is removed. Convert specs to briefs by adding ## Intent and ## Research Plan sections (see MIGRATION.md).
  • Interview inside /ultraplan-local is removed. Planning no longer asks questions — all intent must be captured in the brief upstream.
  • spec-reviewer agent renamed to brief-reviewer with a new 5th dimension (Research Plan validity). Old spec-reviewer file is deleted.

Added

  • /ultrabrief-local command — interactive interview (3-8 questions with adaptive depth) that produces a task brief with explicit research plan. Features: project directory creation, research topic identification with copy-paste-ready /ultraresearch-local commands, optional auto-orchestration (Claude runs research + plan in foreground), and stats tracking.
  • templates/ultrabrief-template.md — canonical task brief format with ## Intent, ## Goal, ## Non-Goals, ## Constraints / Preferences / NFRs, ## Success Criteria, ## Research Plan (N topics with research_question, scope, confidence, cost), and ## Open Questions / Assumptions.
  • brief-reviewer agent — renamed from spec-reviewer with a new 5th dimension: Research Plan validity (each topic has a valid research question ending in ?, has Required for plan steps: and Confidence needed:, and research files exist when auto_research: true).
  • --project <dir> flag on /ultraresearch-local, /ultraplan-local, and /ultraexecute-local. Single directory holds the full pipeline's artifacts. /ultraresearch-local --project auto-increments {dir}/research/NN-slug.md.
  • Two-kinds-of-briefs terminology documented across README and CLAUDE.md: "task brief" (from /ultrabrief-local) vs "research brief" (from /ultraresearch-local). Prefix used consistently in agent prompts and docs.
  • MIGRATION.md — step-by-step guide for upgrading v1 projects to v2.
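
As an illustration of the Research Plan validity dimension, the per-topic checks listed above could look like this in Python. The function name and its two string inputs are hypothetical. The check that research files exist when auto_research: true is left out.

```python
from typing import List

REQUIRED_LABELS = ("Required for plan steps:", "Confidence needed:")


def topic_problems(research_question: str, topic_text: str) -> List[str]:
    """Return a list of validity problems for one research topic (empty = valid)."""
    problems = []
    if not research_question.strip().endswith("?"):
        problems.append("research question does not end in '?'")
    for label in REQUIRED_LABELS:
        if label not in topic_text:
            problems.append(f"missing '{label}'")
    return problems
```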

Changed

  • planning-orchestrator now accepts Brief file: input instead of Spec file:. Intent→Context mapping: brief's ## Intent + ## Goal feed the plan's Context section directly (structured, no inference needed). Phase 1b now uses brief-reviewer instead of spec-reviewer. With Project dir: in input, writes plan to {dir}/plan.md.
  • /ultraresearch-local supports --project <dir> with auto-indexed output path ({dir}/research/NN-slug.md, where NN is the next available two-digit index).
  • /ultraexecute-local supports --project <dir>. Reads {dir}/plan.md, writes progress to {dir}/progress.json.
  • plugin.json description rewritten to reflect four-command pipeline.
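
The auto-indexed research path could be computed along these lines. This sketch assumes "next available" means one past the highest existing index and that numbering starts at 01; the changelog states neither.

```python
import re
from pathlib import Path

INDEXED = re.compile(r"^(\d{2})-.*\.md$")


def next_research_path(project_dir: Path, slug: str) -> Path:
    """Return {project_dir}/research/NN-slug.md with NN one past the highest used."""
    research = project_dir / "research"
    highest = 0
    if research.is_dir():
        for entry in research.iterdir():
            match = INDEXED.match(entry.name)
            if match:
                highest = max(highest, int(match.group(1)))
    index = highest + 1
    if index > 99:
        raise ValueError("two-digit research index exhausted")
    return research / f"{index:02d}-{slug}.md"
```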

Removed

  • /ultraplan-local --spec <path> flag. Spec files are not a valid input for v2.0+. Convert to brief via /ultrabrief-local or manual conversion (see MIGRATION.md).
  • Interview Phase in /ultraplan-local (was Phase 2). Use /ultrabrief-local to conduct the interview upstream.
  • agents/spec-reviewer.md file. Replaced by agents/brief-reviewer.md.

Rationale

The v1.x interview inside /ultraplan-local conflated two concerns: capturing intent and producing an executable plan. Briefs and plans have different lifecycles — a brief should be reviewable and editable before any research or planning starts, because every downstream decision traces back to it. Extracting the brief into its own command makes the pipeline more honest: the brief is the source of truth for what we want, research briefs are sources of truth for what we learned, and the plan is the contract for how we'll do it. Separating these makes each artifact reviewable on its own terms and enables deterministic re-planning from the same brief when research reveals new constraints.

The explicit ## Research Plan section in briefs closes a common gap: plans were implicitly assuming knowledge that neither the user nor Claude had verified. Research topics are now declared upfront, scoped, and traceable back to plan decisions.

[1.8.0] - 2026-04-17

Opus 4.7 prompt literalism — closing the schema-drift gap

Opus 4.7 reads agent instructions more literally than 4.6 (per 4.7 system card §6.3.1.1). The v1.7 planning-orchestrator described the Step+Manifest schema via prose + procedural rules ("read the template"), which 4.6 inferred correctly but 4.7 sometimes rendered as narrative "Fase N" prose. The result: plans that executed cleanly on 4.6 were rejected by ultraexecute Phase 2 parsing on 4.7 — first observed during v6.2.0 planning for llm-security. v1.8.0 closes the gap by replacing prose rules with a literal copyable template, explicit forbidden-format clauses, and a pre-handoff schema self-check.

Added

  • Inline literal Step+Manifest template in planning-orchestrator Phase 5 — a complete, copyable example (JWT middleware step) replaces "read the template" prose. Removes ambiguity about heading format, field order, and manifest YAML structure.
  • Forbidden heading-format clause in Phase 5 — explicit denylist for ## Fase N, ### Phase N, ### Stage N, and other narrative formats the executor cannot parse. Negative constraints land harder on 4.7.
  • Phase 5.5 schema self-check in planning-orchestrator — after writing the plan, grep-verify canonical ### Step N: count matches manifest: count, and narrative heading count is zero. Rewrite plan if self-check fails, before handing to plan-critic.
  • --validate mode in ultraexecute-local — schema-only check that parses steps and manifests, reports READY | FAIL with specific error hints, and exits without security scan or execution. Intended as a fast sanity-check between /ultraplan-local and full execution:
    /ultraplan-local "task"
    /ultraexecute-local --validate <plan>.md   # READY or actionable FAIL
    /ultraexecute-local <plan>.md              # full execution
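
The Phase 5.5 self-check amounts to three counts over the plan text. A rough Python equivalent of the grep-based check follows. The exact patterns are assumptions, and the narrative pattern covers only the three denylisted formats named above.

```python
import re
from typing import Dict

STEP_HEADING = re.compile(r"^### Step \d+:", re.MULTILINE)
MANIFEST_KEY = re.compile(r"^[ \t]*manifest:[ \t]*$", re.MULTILINE)
NARRATIVE_HEADING = re.compile(r"^#{2,3} (?:Fase|Phase|Stage) \d+", re.MULTILINE)


def schema_self_check(plan_text: str) -> Dict[str, object]:
    """Count canonical step headings, manifest blocks, and forbidden headings."""
    steps = len(STEP_HEADING.findall(plan_text))
    manifests = len(MANIFEST_KEY.findall(plan_text))
    narrative = len(NARRATIVE_HEADING.findall(plan_text))
    return {
        "steps": steps,
        "manifests": manifests,
        "narrative_headings": narrative,
        # Pass only if every step has a manifest and no narrative heading exists.
        "ok": steps > 0 and steps == manifests and narrative == 0,
    }
```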
    

Changed

  • planning-orchestrator Phase 5 now embeds the canonical Step template inline (~60 lines of literal example) rather than referring to templates/plan-template.md. Template file remains authoritative for cross-referencing but is no longer load-bearing for plan generation.
  • ultraexecute-local Phase 2.3 added as a hard exit point for --validate mode; Phase 2.4 security scan explicitly skips this mode.

Rationale

v1.7.0's self-verifying chain assumed the orchestrator reliably produces the v1.7 schema. That held on 4.6. v1.8.0 makes the assumption robust to 4.7-style literal interpretation by moving from "describe the format" to "show the exact format and forbid alternatives", plus a self-check loop before human-visible output. Pairs with --validate as a user-facing verification step that catches any residual drift before execution side effects begin.

[1.7.0] - 2026-04-12

The self-verifying plan chain

Wave 1 of a parallel 6-session build revealed three failure modes: (1) a session reported status=completed after only 2/5 steps — last tool call was an arbitrary file review, not a completion check; (2) 3/6 sessions had push blocked inside the sub-agent bash sandbox after all work was done; (3) plans and blueprints were prose, so the orchestrator had no machine-readable way to verify completion. v1.7.0 closes all three by making the plan itself an executable contract.

Added

  • Per-step verification manifest in plan format (plan_version: 1.7). Every step now ends with a YAML manifest: block declaring expected_paths, min_file_count, commit_message_pattern, bash_syntax_check, forbidden_paths, must_contain. The manifest is the objective completion predicate — the Verify command is necessary but not sufficient.
  • Plan-critic dimension 10 — Manifest quality (hard gate). Missing or invalid manifest (unparseable regex, path contradiction, missing block) is a major finding. v1.6 plans get a legacy-mode warning instead of a block.
  • Session Manifest aggregate in session specs — synthesized by session-decomposer as the union of per-step manifests. Gives ultraexecute-local a single YAML block per session to audit against.
  • Step 0: Sandbox pre-flight — obligatory first step in every generated session spec. Runs git push --dry-run origin HEAD; exit 77 = sandbox cannot push, session status becomes blocked (not failed), no real work attempted. Escape hatch: ULTRAEXECUTE_SKIP_PREFLIGHT=1.
  • Launch script pre-flight: headless-launch-template.md adds a git push --dry-run check outside the sandbox, before any session spawns, catching credential issues at the earliest possible point.
  • Phase 7.5 — Manifest audit (independent). After all steps complete, ultraexecute-local re-verifies expected paths, commit count, commit message patterns, bash syntax, and forbidden-path untouched-ness from git log and filesystem. Agent's own bookkeeping is ignored. Disagreement with progress file → status overridden to partial.
  • Phase 7.6 — Recovery dispatch (bounded). When Phase 7.5 detects drift in multi-session parent context, synthesize a temp session spec containing only missing steps and dispatch via existing claude -p "/ultraexecute-local --session N". recovery_depth ≤ 2 hard cap — third drift escalates to user.
  • Hard Rule 17: Manifest is the completion predicate. A step may not be marked passed if its manifest does not verify, regardless of Verify's exit code.
  • Hard Rule 18: Last-activity rule. Executor's final tool call before Phase 8 must be a manifest check, never an arbitrary file review. Prevents hallucinated completion.
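
To make the "manifest as completion predicate" idea concrete, here is a simplified Python sketch. The value shapes are assumptions: lists of exact paths, and `must_contain` as a path-to-substring mapping. `min_file_count` and `bash_syntax_check` are omitted because their semantics are not described here. Commit message and changed paths are passed in rather than read from git.

```python
import re
from pathlib import Path
from typing import Dict, Iterable, List


def verify_manifest(
    manifest: Dict,
    root: Path,
    head_message: str,
    changed_paths: Iterable[str],
) -> List[str]:
    """Return manifest failures; an empty list means the manifest verifies."""
    failures = []
    for path in manifest.get("expected_paths", []):
        if not (root / path).exists():
            failures.append(f"missing expected path: {path}")
    pattern = manifest.get("commit_message_pattern")
    if pattern and not re.search(pattern, head_message):
        failures.append("commit message does not match commit_message_pattern")
    changed = set(changed_paths)
    for path in manifest.get("forbidden_paths", []):
        if path in changed:
            failures.append(f"forbidden path touched: {path}")
    for path, needle in manifest.get("must_contain", {}).items():
        target = root / path
        if not target.is_file() or needle not in target.read_text(encoding="utf-8"):
            failures.append(f"{path} does not contain required text")
    return failures


def step_passed(verify_exit_code: int, failures: List[str]) -> bool:
    """Hard Rule 17: Verify is necessary, the manifest decides."""
    return verify_exit_code == 0 and not failures
```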

Changed

  • Plan template (templates/plan-template.md) — adds plan_version: 1.7 metadata line, Manifest: field on every step, "Manifest — objective completion predicate" section.
  • Plan-critic scoring rebalanced: Headless readiness 0.15 → 0.10, Manifest quality 0.05 added. Legacy v1.6 plans skip the Manifest dimension and keep Headless readiness at 0.15.
  • Planning-orchestrator Phase 5 adds "Manifest generation rules (REQUIRED for every step)" with mechanical derivation from Files: and Checkpoint. Validates regex compilation and path existence before handoff to plan-critic.
  • Session-decomposer parses plan manifests and propagates them verbatim into session specs. For v1.7+ plans with missing manifests: abort with pointer to failing step. For legacy v1.6 plans: synthesize minimal manifests and flag legacy_synthesis: true.
  • ultraexecute-local Phase 2 parses manifest YAML. Invalid YAML = abort with pointer to step. v<1.7 plans: synthesize + log legacy_plan: true.
  • ultraexecute-local Phase 6 — sub-step D renamed to D1 "Command verification"; new D2 "Manifest verification" runs after D1 with 5 checks. F "Checkpoint" adds checkpoint_drift logging when HEAD message doesn't match commit_message_pattern (non-fatal).
  • Phase 8 report — table gets Manifest column; JSON summary adds plan_version, manifest_audit, drift_details, recovery_dispatched, recovery_depth, legacy_plan. Result vocabulary strict: completed | partial | blocked | failed | stopped.
  • Division of labor clarified in README — /ultraresearch-local gathers context (no decisions), /ultraplan-local transforms intent into an executable contract (manifests, plan-critic gate), /ultraexecute-local executes the contract with discipline (does NOT compensate for weak plans; it escalates).

[1.6.0] - 2026-04-08

Added

  • /ultraresearch-local command — deep research combining local codebase analysis with external knowledge. Produces structured research briefs with triangulation, confidence ratings, and source quality assessment. Supports modes: default (background), --quick (inline), --local (codebase only), --external (web only), --fg (foreground).
  • 6 new agents for the research pipeline:
    • research-orchestrator (opus) — runs full research pipeline as background task
    • docs-researcher (sonnet) — official documentation via Tavily, WebSearch, Microsoft Learn
    • community-researcher (sonnet) — real-world experience from issues, blogs, discussions
    • security-researcher (sonnet) — CVEs, audit history, supply chain risks
    • contrarian-researcher (sonnet) — counter-evidence and overlooked alternatives
    • gemini-bridge (sonnet) — independent second opinion via Gemini Deep Research MCP
  • Research brief template (templates/research-brief-template.md) — structured format with dimensions, confidence ratings, triangulation, and source quality assessment.
  • --research flag for /ultraplan-local — accepts up to 3 research brief paths. Enriches the interview (focuses on decisions, not facts) and injects brief context into exploration agents. Research-scout skips already-covered technologies.
  • Research-aware planning orchestrator: planning-orchestrator.md now accepts research briefs, injects summaries into sub-agent prompts, and cross-references brief findings during synthesis.
  • Research settings in settings.json — configurable Gemini bridge (enabled/timeout), interview depth, dimension limits, and stats tracking.

Changed

  • Plugin description and keywords updated to reflect research capabilities.
  • CLAUDE.md expanded with ultraresearch command, modes, agents, architecture, and state.

[1.5.0] - 2026-04-07

Fixed

  • CRITICAL: Parallel session data loss — Phase 2.6 ran parallel claude -p sessions in the same working directory, causing git race conditions and repository corruption. Each parallel session now runs in its own git worktree with isolated branch, index, and working files. Branches are merged back sequentially after each wave completes.

Added

  • Phase 2.55 (Pre-flight safety checks) — validates clean working tree, committed plan file, no scope fence overlaps between parallel sessions, and no stale worktrees before launching parallel execution.
  • Git worktree isolation for all parallel sessions — one branch per session (ultraplan/{slug}/session-{N}), merged with --no-ff after wave completion.
  • Merge conflict detection — if merging a session branch produces conflicts, the merge is aborted and conflicting files are reported. No silent data loss.
  • Unconditional worktree cleanup — worktrees and session branches are always removed, even on failure. Manual cleanup commands are reported if automated cleanup fails.
  • Hard rules 11-13 — worktree isolation mandatory, cleanup unconditional, merge sequentially with conflict abort.
  • Session-scoped progress file naming: --session N uses .ultraexecute-progress-{slug}-session-{N}.json to prevent merge conflicts.

Changed

  • Headless launch template uses git worktrees with cleanup_worktrees trap on EXIT, clean-tree pre-flight check, and sequential merge after each wave.
  • Phase 2.6 rewritten with 5-step worktree lifecycle: create → launch → wait → merge → cleanup.
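
The worktree lifecycle can be pictured as a fixed sequence of commands per session. The Python sketch below only builds the argument lists and runs nothing. The worktree directory layout is hypothetical; the branch name and the merge flag follow the entries above.

```python
from typing import Dict


def session_branch(slug: str, session: int) -> str:
    """Branch naming from this release: ultraplan/{slug}/session-{N}."""
    return f"ultraplan/{slug}/session-{session}"


def session_commands(slug: str, session: int, worktree_root: str) -> Dict[str, list]:
    """Commands for one session: create, launch, merge, cleanup."""
    branch = session_branch(slug, session)
    path = f"{worktree_root}/{slug}-session-{session}"  # hypothetical layout
    return {
        # Isolated branch, index, and working files for this session.
        "create": ["git", "worktree", "add", "-b", branch, path],
        # Run inside the worktree directory; the caller waits for it to exit.
        "launch": ["claude", "-p", f"/ultraexecute-local --session {session}"],
        # Run in the main checkout, one session at a time, after the wave ends.
        "merge": ["git", "merge", "--no-ff", branch],
        # Used only when the merge reports conflicts.
        "abort_merge": ["git", "merge", "--abort"],
        # Always run, even on failure.
        "cleanup": [
            ["git", "worktree", "remove", "--force", path],
            ["git", "branch", "-D", branch],
        ],
    }
```

Keeping the commands as data makes the sequence easy to print as manual cleanup instructions when automated cleanup fails.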

[1.4.0] - 2026-04-06

Renamed

  • /ultraexecute → /ultraexecute-local — renamed for namespace consistency with /ultraplan-local and future-proofing against potential Anthropic naming. File: commands/ultraexecute.md → commands/ultraexecute-local.md. Note: ultraexecute_summary JSON key and ultraexecute-stats.jsonl filename are unchanged for backward compatibility.

Added

  • convention-scanner agent (sonnet) — dedicated agent for discovering coding conventions: naming, directory layout, import style, error handling, test patterns, git commit style, documentation patterns. Replaces inline Explore agent prompt for medium+ codebases.
  • Success Criteria section in spec template — falsifiable "definition of done" conditions that the spec-reviewer validates and ultraexecute-local uses for verification.
  • Dry-run multi-session preview: --dry-run now shows session groupings, wave structure, billing status, and claude -p commands when plan has an Execution Strategy.
  • External verification rule in headless launch template — wave verification must run commands independently, never parse session logs as proof.
  • Billing preamble in headless launch template — unset ANTHROPIC_API_KEY prevents accidental API billing.
  • Phase mapping comment in planning-orchestrator — documents how orchestrator phases 1-7 map to command phases 4-10.

Fixed

  • git add -A in escalation — replaced with targeted staging of only files from completed steps. Prevents staging secrets, binaries, or unrelated work.
  • False background: true claim — command documentation incorrectly stated the orchestrator has background: true in its frontmatter. Corrected to explain run_in_background on the Agent tool.

Changed

  • Execution Strategy reconciliation in session-decomposer — respects existing ## Execution Strategy as input instead of re-analyzing from scratch. Warns on file-overlap conflicts.
  • Headless launch template uses --dangerously-skip-permissions instead of --allowedTools for more robust headless execution.
  • Session-decomposer updated with --dangerously-skip-permissions and unset ANTHROPIC_API_KEY for generated scripts.
  • Convention Scanner references in command and orchestrator updated to use dedicated plugin agent.
  • ROADMAP.md translated from Norwegian to English.
  • plugin.json: added homepage, repository, license, keywords. Version bumped to 1.4.0.
  • README badge updated to v1.4.0.

[1.3.0] - 2026-04-06

Added

  • Session-aware parallel execution: /ultraexecute auto-detects ## Execution Strategy in plans and orchestrates multi-session parallel execution via claude -p. No manual bash launch.sh required.
    • --fg flag — force foreground sequential execution, ignoring Execution Strategy
    • --session N flag — execute only session N from the plan's Execution Strategy (used by child processes)
    • Phase 2.5 (Execution strategy decision) — determines single-session vs multi-session mode
    • Phase 2.6 (Multi-session orchestration) — launches parallel claude -p sessions per wave, waits for completion, aggregates results
  • Execution Strategy in plan template — new ## Execution Strategy section with sessions, waves, scope fences, and execution order. Generated by planning-orchestrator for plans with > 5 steps.
  • Execution Strategy generation in planning-orchestrator — Phase 5 analyzes step file-overlap to build dependency graph, groups connected components into sessions of 3-5 steps, and organizes sessions into parallel waves.
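
Grouping steps by file overlap is a connected-components problem. The following Python sketch uses a small union-find; the input shape (step number mapped to the set of files it touches) is an assumption. It enforces only the upper bound of 5 steps per session, since the changelog does not say how undersized components are handled.

```python
from typing import Dict, List, Set


def group_sessions(step_files: Dict[int, Set[str]], max_steps: int = 5) -> List[List[int]]:
    """Group steps that share files, then chunk each group into sessions."""
    parent = {step: step for step in step_files}

    def find(step: int) -> int:
        while parent[step] != step:
            parent[step] = parent[parent[step]]  # path halving
            step = parent[step]
        return step

    first_owner: Dict[str, int] = {}
    for step in sorted(step_files):
        for name in step_files[step]:
            if name in first_owner:
                parent[find(step)] = find(first_owner[name])  # file overlap: same component
            else:
                first_owner[name] = step

    components: Dict[int, List[int]] = {}
    for step in sorted(step_files):
        components.setdefault(find(step), []).append(step)

    sessions = []
    for steps in sorted(components.values()):
        for start in range(0, len(steps), max_steps):
            sessions.append(steps[start:start + max_steps])
    return sessions
```

Sessions cut from the same oversized component still share files, so they would have to land in different waves; components with no overlap are the ones that can run in parallel.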

Changed

  • planning-orchestrator Phase 5 extended with Execution Strategy generation logic
  • ultraplan-local Phase 8 now lists Execution Strategy as 10th required plan section
  • Plan template includes ## Execution Strategy section template with grouping rules
  • CLAUDE.md updated with new ultraexecute modes and architecture
  • plugin.json version bumped to 1.3.0

[1.2.0] - 2026-04-06

Added

  • /ultraexecute command — disciplined plan executor with 9-phase workflow. Reads an ultraplan or session spec, executes steps sequentially with strict failure recovery, tracks progress for resume, and reports results in machine-parseable JSON.
    • 4 modes: default (execute), --resume (continue from checkpoint), --dry-run (validate without executing), --step N (single step)
    • Per-step protocol: implement → verify → on-failure handling → checkpoint
    • Failure recovery from plan's On failure clauses (revert/retry/skip/escalate)
    • 3-attempt retry cap per step (initial + 2 retries)
    • Progress file (.ultraexecute-progress-{slug}.json) for crash recovery and resume
    • Entry/exit condition checking for session specs
    • Scope fence enforcement for session specs (never-touch file protection)
    • JSON summary block in output for headless log parsing
    • Stats tracking to ultraexecute-stats.jsonl
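
The retry cap and the resume behaviour can be sketched together in Python. The progress file schema (`completed`, `failed_step`) and the step representation as (name, callable) pairs are hypothetical; only the 3-attempt cap and the idea of a JSON progress file come from the list above.

```python
import json
from pathlib import Path
from typing import Callable, Dict, List, Tuple

MAX_ATTEMPTS = 3  # initial attempt + 2 retries


def execute(steps: List[Tuple[str, Callable[[], bool]]], progress_path: Path) -> Dict:
    """Run steps in order, checkpointing after each; resume skips completed steps."""
    progress = {"completed": [], "failed_step": None}
    if progress_path.exists():
        progress = json.loads(progress_path.read_text(encoding="utf-8"))
        progress["failed_step"] = None  # a resumed run retries the failed step

    for name, attempt in steps:
        if name in progress["completed"]:
            continue
        for _ in range(MAX_ATTEMPTS):
            if attempt():
                break
        else:
            # All attempts failed: record the step and stop.
            progress["failed_step"] = name
            progress_path.write_text(json.dumps(progress), encoding="utf-8")
            return progress
        progress["completed"].append(name)
        progress_path.write_text(json.dumps(progress), encoding="utf-8")
    return progress
```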

Changed

  • CLAUDE.md restructured with two commands table (plan + execute)
  • plugin.json version bumped to 1.2.0

[1.1.0] - 2026-04-06

Added

  • --decompose mode — splits an existing plan into self-contained headless sessions. Analyzes step dependencies, groups steps into sessions of 3-5 steps each, identifies parallel execution waves, and generates session specs + dependency graph + launch script.
  • --export headless format — shortcut for --decompose. Produces the same session decomposition output.
  • session-decomposer agent (sonnet) — dedicated agent for plan decomposition. Parses step dependencies, builds dependency graph, groups steps into sessions, generates session specs with scope fences and failure handling.
  • Session spec template (templates/session-spec-template.md) — defines the format for individual session specs: context, scope fence, steps, entry/exit conditions, failure handling, handoff state.
  • Headless launch template (templates/headless-launch-template.md) — template for generating bash launch scripts that execute sessions in parallel waves using claude -p.
  • Failure recovery per step — plan template now includes On failure: (revert/retry/skip/escalate) and Checkpoint: (git commit) fields for every implementation step.
  • Headless readiness dimension in plan-critic — new 9th review dimension checking for On failure clauses, Checkpoint fields, and circuit breakers. Weighted at 0.15 in the quality score.

Changed

  • Plan-critic scoring rebalanced: 6 dimensions (was 5), weights adjusted to accommodate headless readiness
  • Plan template step format extended with On failure and Checkpoint fields
  • Planning-orchestrator Phase 5 updated with failure recovery generation requirements
  • CLAUDE.md updated with new agent, modes, and state paths

[1.0.0] - 2026-04-06

Added

  • --quick mode — skips exploration agent swarm. Runs interview → lightweight Glob/Grep scan → planning → adversarial review. For when the developer knows the codebase and needs structure, not cartography.
  • --export mode — generates shareable output from an existing plan file. Three formats: pr (PR description), issue (issue comment), markdown (clean plan without internal metadata).
  • task-finder three-tier categorization — findings categorized as Must-change (must be modified), Must-respect (contract that must not break), or Reference (context/reuse). Replaces flat file list.
  • Adaptive interview depth — interview adapts to answer quality. Detailed answers trigger fewer, more targeted questions. Short/uncertain answers trigger simpler questions with offered alternatives.
  • Complete plugin.json metadata — author, homepage, repository, license, keywords added.
  • README badges — version, license, and platform badges.
  • Known limitations section in README — IaC projects (Terraform, Helm, Pulumi, CDK) get reduced value from exploration agents.
  • Forgejo issue templates — bug report and feature request YAML templates.
  • CONTRIBUTING.md — rewritten for honest solo-project model.

Changed

  • plugin.json version bumped to 1.0.0
  • Command header updated to Ultraplan Local v1.0
  • Orchestrator accepts mode: quick in prompt for lightweight scanning path

[0.4.0] - 2026-04-06

Added

  • 3 new agents for information-complete planning:
    • task-finder — dedicated agent for finding task-relevant files, functions, types, and reuse candidates. Replaces inline Explore agent.
    • git-historian — analyzes git log, blame, active branches, code ownership, and hot files for planning context.
    • spec-reviewer — reviews spec quality (completeness, consistency, testability, scope clarity) before exploration begins. New Phase 1b/4b.
  • Plan scoring — plan-critic produces a quantitative quality score (0-100) across 5 weighted dimensions with letter grades (A-D) and verdicts (APPROVE/REVISE/REPLAN).
  • No-placeholder rule — plan-critic flags TBD, TODO, vague instructions, and underspecified steps as unconditional blockers. 3+ blockers = REPLAN regardless of score.
  • [ASSUMPTION] marking — planning-orchestrator marks all unverifiable claims and warns when >3 assumptions exist.

Changed

  • All agents run for all codebase sizes. Small codebases get the same 6 core agents as large ones. Agent turns scale down for small codebases instead of dropping agents entirely.
  • Phase 4b (spec review) added before exploration in both command and orchestrator.
  • Orchestrator Phase 2 agent table expanded: 6 always + 1 conditional + 1 medium-only.
  • Plan-critic review checklist expanded with no-placeholder checks (section 7) and scoring output.
  • Orchestrator rules updated with assumption-marking and no-placeholder requirements.

[0.3.0] - 2026-04-05

Added

  • planning-orchestrator agent — dedicated background agent (background: true) that handles Phases 4-10 autonomously. Replaces generic background agent spawning with a purpose-built orchestrator running on Opus with maxTurns: 50.
  • effort and maxTurns on all agents — fine-grained cost and depth control:
    • Exploration agents: effort: medium, maxTurns: 15-20
    • Review agents (plan-critic, scope-guardian): effort: high, maxTurns: 10
    • Research-scout: effort: medium, maxTurns: 10
  • Plugin settings.json — default configuration for mode, research, agent counts, interview limits, and team settings. Users can override in their own settings.
  • Worktree isolation for Agent Teams — team members use isolation: "worktree" to prevent file conflicts during parallel implementation
  • Session tracking (Phase 12) — writes JSONL records to ${CLAUDE_PLUGIN_DATA}/ultraplan-stats.jsonl with task metadata, agent counts, review verdicts, and outcomes
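
Appending one JSONL record per run might look like the Python sketch below. The record fields and the `timestamp` key are hypothetical; only the file name and the CLAUDE_PLUGIN_DATA location come from the entry above.

```python
import json
import os
from datetime import datetime, timezone
from pathlib import Path
from typing import Dict, Optional


def append_stats(record: Dict, data_dir: Optional[str] = None) -> Path:
    """Append one JSON object as a single line to ultraplan-stats.jsonl."""
    base = Path(data_dir or os.environ["CLAUDE_PLUGIN_DATA"])
    base.mkdir(parents=True, exist_ok=True)
    path = base / "ultraplan-stats.jsonl"
    entry = {"timestamp": datetime.now(timezone.utc).isoformat(), **record}
    with path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(entry) + "\n")
    return path
```

One object per line keeps the file appendable without rewriting earlier records.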

Changed

  • Phase 3 now launches the planning-orchestrator agent instead of a generic background agent
  • Agent Team implementation uses worktree isolation by default

[0.2.0] - 2026-04-05

Added

  • Interview phase — iterative requirements gathering with AskUserQuestion before exploration. Produces a spec file that feeds into planning.
  • 7 specialized agents in agents/ directory:
    • architecture-mapper — deep architecture analysis, anti-patterns, smell detection
    • dependency-tracer — import-chain following, data-flow analysis, side-effect catching
    • test-strategist — test strategy design based on existing patterns
    • risk-assessor — threat modeling, edge cases, failure modes
    • plan-critic — dedicated adversarial reviewer with hardcoded critical perspective
    • scope-guardian — scope creep and scope gap detection
    • research-scout — external research via WebSearch/Tavily for unfamiliar technologies
  • External research capability — research-scout agent searches documentation, known issues, and best practices when the task involves external/unfamiliar technology
  • Background mode — default mode runs interview in foreground, then plans in background. User is notified when done.
  • Spec-driven mode (--spec) — skip interview, provide a pre-written spec file, plan entirely in background
  • Foreground mode (--fg) — all phases in foreground, blocks session (v0.1.0 behavior)
  • Agent Team support — when plan has 3+ independent steps, offers parallel implementation via Agent Teams
  • Spec template in templates/spec-template.md
  • Research Sources section in plan template for citing external research
  • Dual adversarial review — plan-critic and scope-guardian run in parallel

Changed

  • Exploration agents replaced with named specialized agents from agents/ directory
  • Agent count scales with codebase: 3 (small), 5 (medium), 7 (large)
  • Plan template extended with Research Sources and external tech fields
  • Handoff phase supports "execute with team" option
  • Command workflow expanded from 9 to 11 phases

[0.1.0] - 2026-04-05

Added

  • Initial release
  • /ultraplan slash command with 6-phase workflow
  • Parallel Sonnet exploration (3 agents: architecture, task-relevant, conventions)
  • Opus-driven plan generation from structured template
  • Plan refinement loop with execute/save handoff
  • Plan template with context, analysis, steps, alternatives, risks, verification
  • Cross-platform support (Mac, Linux, Windows) — pure markdown, no scripts