feat(ultraplan-local): v2.1.0 — dynamic quality-gated interview

Replace hardcoded Q1-Q8 in /ultrabrief-local with a section-driven
completeness loop (Phase 3) and a draft/review/revise loop with
brief-reviewer as stop-gate (Phase 4). Quality drives the interview,
not a question counter.

brief-reviewer now emits a machine-readable JSON block with per-dimension
scores (1-5) and detail arrays alongside the existing prose report;
planning-orchestrator continues to consume the prose verdict unchanged.

Phase 4 gate: all dimensions >= 4 AND research_plan = 5. On fail, a
targeted follow-up is generated from the weakest dimension's detail
field and the draft is re-reviewed. Max 3 review iterations bound cost;
exhaustion writes brief.md with brief_quality: partial and an explicit
Brief Quality section. Force-stop surfaces per-dimension findings before
the user chooses continue or partial.

Not breaking. /ultrabrief-local [--quick] <task> interface unchanged.
--quick now means compact start with escalation, not a max-N cap.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This commit is contained in:
Kjell Tore Guttormsen 2026-04-18 09:43:43 +02:00
commit 1634197853
8 changed files with 501 additions and 120 deletions

View file

@ -84,17 +84,18 @@ Or opt into auto-mode in `/ultrabrief-local` — it will launch research and pla
## `/ultrabrief-local` — Brief
Interactive requirements-gathering command. Runs a focused interview (3-8 questions depending on mode and adaptive depth) and produces a **task brief** with an explicit research plan. Optionally orchestrates the rest of the pipeline.
Interactive requirements-gathering command. Runs a **dynamic, quality-gated interview** and produces a **task brief** with an explicit research plan. Optionally orchestrates the rest of the pipeline.
### How it works
### How it works (v2.1 — quality-gated)
1. **Parse mode**`default` (5-8 questions) or `--quick` (3-4 questions)
1. **Parse mode**`default` (dynamic; probes until quality gates pass) or `--quick` (starts compact, still escalates if gates fail)
2. **Create project directory**`.claude/projects/{YYYY-MM-DD}-{slug}/` with `research/` subdirectory
3. **Interview** — one question at a time via `AskUserQuestion`, starting with Intent → Goal → Success Criteria, then optional Non-Goals, Constraints, Preferences, NFRs, Prior Attempts
4. **Identify research topics** — probe for unfamiliar tech, version upgrades, security decisions, architectural choices
5. **Write brief**`{project_dir}/brief.md` with all sections and a copy-paste-ready research plan
6. **Auto-orchestration opt-in** — user chooses manual (default) or auto (Claude-managed research + plan in foreground)
7. **Stats tracking** — append to `ultrabrief-stats.jsonl`
3. **Phase 3 — Completeness loop** — a section-driven loop (not a fixed question list) picks the next question from the section with the weakest signal (Intent → Goal → Success Criteria → Research Plan first, then optional sections). Required sections must reach an initial-signal gate before Phase 3 exits. No hard cap on question count.
4. **Identify research topics** — inline during Phase 3; probe for unfamiliar tech, version upgrades, security-sensitive decisions, architectural choices. Topics get a research question, scope, confidence, and cost.
5. **Phase 4 — Draft, review, revise** — draft the brief in memory, write to `brief.md.draft`, launch `brief-reviewer` as a stop-gate. The reviewer scores five dimensions (completeness, consistency, testability, scope clarity, research plan) 15 and returns machine-readable JSON. Gate passes when all ≥ 4 and research plan = 5. On fail, a targeted follow-up is generated from the weakest dimension's detail field and the draft is re-reviewed. Max 3 review iterations.
6. **Finalize** — rename draft to `brief.md` on pass, or write with `brief_quality: partial` + a `## Brief Quality` section if the cap is hit or the user force-stops.
7. **Auto-orchestration opt-in** — user chooses manual (default) or auto (Claude-managed research + plan in foreground)
8. **Stats tracking** — append to `ultrabrief-stats.jsonl` (includes `review_iterations` and `brief_quality`)
Output: `.claude/projects/{YYYY-MM-DD}-{slug}/brief.md`
@ -102,11 +103,17 @@ Output: `.claude/projects/{YYYY-MM-DD}-{slug}/brief.md`
| Mode | Usage | Behavior |
|------|-------|----------|
| **Default** | `/ultrabrief-local <task>` | Full interview (5-8 questions) |
| **Quick** | `/ultrabrief-local --quick <task>` | Short interview (3-4 questions) |
| **Default** | `/ultrabrief-local <task>` | Dynamic interview until quality gates pass. No question cap. |
| **Quick** | `/ultrabrief-local --quick <task>` | Starts compact (optional sections get at most one probe), still escalates on weak required sections or failed review gate. |
`/ultrabrief-local` is **always interactive**. There is no foreground/background mode — the interview requires user input.
### Force-stop
If you say "stop" or "enough" during Phase 4, the current review findings are surfaced with per-dimension scores and you choose:
- **Answer one more follow-up** — the loop continues.
- **Stop now (accept partial brief)** — the brief is finalized with `brief_quality: partial` and a `## Brief Quality` section listing the failing dimensions. Downstream planning will treat these as reduced-confidence areas.
### What the brief contains
- **Intent** — why this matters, motivation, user need (load-bearing)