ktg-plugin-marketplace/plugins/ultraplan-local/commands/ultrabrief-local.md
Kjell Tore Guttormsen 7e2d9e151e feat(ultraplan-local): M1 — profile recommendation flow in ultrabrief
Adds the profile recommendation step to /ultrabrief-local Phase 4. The
brief stays universal (same questions, same template); the new step is
purely a processing-decision layer that records which profile downstream
commands should apply.

What lands:
- agents/profile-recommender.md — new sonnet agent that scores available
  profiles against the finalized brief (keyword + NFR-signal matching,
  axis bumps, hallucination gate that forbids inventing profile names).
  Emits a fenced JSON block with ranked entries.
- templates/ultrabrief-template.md — frontmatter gains
  recommended_profile, profile_match, profile_rationale (default values
  applied when only `default` is available — true at M1).
- commands/ultrabrief-local.md — Phase 4 gains Step 4h with explicit
  branches: short-circuit when only `default` exists; AskUserQuestion
  confirmation when top score ≥ 0.7; explicit fallback message when below
  threshold; manual selection sub-question on user override. Persists the
  three frontmatter fields to brief.md after user confirmation. JSON
  parser failure falls back to `default` with `profile_match: fallback`
  rather than blocking — silent fallback is the worst outcome, but a
  *visible* fallback is acceptable.
- scripts/profile-loader.mjs — adds selectRecommendation(ranked, opts) +
  RECOMMENDATION_THRESHOLD=0.7 export. Single source of truth for the
  threshold logic so the command spec and the helper agree.
- scripts/profile-loader.test.mjs — 10 new tests for selectRecommendation
  (default-only, empty/malformed input, above/below threshold, custom
  threshold, max-by-score, missing fields). Total now 36/36.
- README.md / CLAUDE.md / marketplace landing — docs reflect M0 + M1
  shipped, M2 + M3 still pending.

In practice nothing changes for users at M1 because only `default` is
available — Step 4h takes the short-circuit path and writes
`profile_match: default-only`. M2 ships the additional profiles that
make the recommender meaningful.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-30 14:21:54 +02:00

31 KiB
Raw Blame History

name description argument-hint model allowed-tools
ultrabrief-local Interactive interview that produces a task brief with explicit research plan. Feeds /ultraresearch-local and /ultraplan-local. Optionally orchestrates the full pipeline end-to-end. [--quick] <task description> opus Agent, Read, Glob, Grep, Write, Edit, Bash, AskUserQuestion

Ultrabrief Local v2.1

Interactive requirements-gathering command. Produces a task brief — a structured markdown file that declares intent, goal, constraints, and an explicit research plan with copy-paste-ready /ultraresearch-local commands.

Pipeline position:

/ultrabrief-local  →  brief.md                         (this command)
/ultraresearch-local --project <dir>  →  research/*.md
/ultraplan-local --project <dir>  →  plan.md
/ultraexecute-local --project <dir>  →  execution

The brief is the contract between the user's intent and /ultraplan-local. Every decision the plan makes must trace back to content in the brief.

This command is always interactive. There is no background mode — the interview requires user input. After the brief is written, the command optionally orchestrates the rest of the pipeline (research + plan) in foreground if the user opts in.

Phase 1 — Parse mode and validate input

Parse $ARGUMENTS:

  1. If arguments start with --quick: set mode = quick. The interview starts more compactly (fewer opening probes per section) but still escalates automatically if quality gates fail. There is no hard cap on question count — quality drives the loop, not a counter. Strip the flag; remainder is the task description.

  2. Otherwise: mode = default. Interview probes each section until the completeness gate (Phase 3) and brief-review gate (Phase 4) both pass.

If no task description is provided, output usage and stop:

Usage: /ultrabrief-local <task description>
       /ultrabrief-local --quick <task description>

Modes:
  default       Dynamic interview until quality gates pass — brief with research plan
  --quick       Compact start; still escalates on weak sections — brief with research plan

Examples:
  /ultrabrief-local Add user authentication with JWT tokens
  /ultrabrief-local --quick Add rate limiting to the API
  /ultrabrief-local Migrate from Express to Fastify

Report:

Mode: {default | quick}
Task: {task description}

Phase 2 — Generate slug and create project directory

Generate a slug from the task description: first 3-4 meaningful words, lowercase, hyphens. Example: "Migrate from Express to Fastify" → fastify-migration.

Set today's date as YYYY-MM-DD (UTC).

Create the project directory:

PROJECT_DIR=".claude/projects/{YYYY-MM-DD}-{slug}"
mkdir -p "$PROJECT_DIR/research"

Report:

Project directory: .claude/projects/{YYYY-MM-DD}-{slug}/

If the directory already exists and is non-empty, warn and ask:

"Directory {path} exists. Overwrite, reuse (keep existing files), or pick new slug?"

Use AskUserQuestion with three options. If "pick new slug", ask for a new slug and restart Phase 2.

Phase 3 — Completeness loop

Phase 3 is a section-driven completeness loop. Instead of a numbered question list, maintain an internal state of brief sections and keep asking until every required section has substantive content. Quality drives the loop — there is no hard cap on question count.

Use AskUserQuestion for every question. Ask one question at a time. Never dump multiple questions.

Internal state

Track this structure in memory as the loop runs:

state = {
  intent:           { content: "", probes: 0 },           # required
  goal:             { content: "", probes: 0 },           # required
  success_criteria: { content: [], probes: 0 },           # required
  research_plan:    { topics: [], probes: 0 },            # required
  non_goals:        { content: [], probes: 0 },           # optional
  constraints:      { content: [], probes: 0 },           # optional
  preferences:      { content: [], probes: 0 },           # optional
  nfrs:             { content: [], probes: 0 },           # optional
  prior_attempts:   { content: "", probes: 0 },           # optional
  question_history: []                                    # list of questions asked
}

content is raw user answers merged; probes is how many times this section has been asked; question_history prevents re-asking the same variant twice.

Required sections (initial-signal gate)

Four sections MUST have substantive content before exiting Phase 3:

  1. Intent — full sentence or paragraph (not a single word or phrase)
  2. Goal — full sentence or paragraph
  3. Success Criteria — at least one concrete, testable item
  4. Research Plan — either ≥ 1 topic probed, OR the user has explicitly confirmed "no external research needed"

"Substantive" means: non-empty, not a trivial one-word reply, not "I don't know" without a recorded assumption. The strict falsifiability check happens in Phase 4 (brief-review gate); Phase 3 is just the initial-signal bar.

Optional sections (Non-Goals, Constraints, Preferences, NFRs, Prior Attempts) do not gate exit. If they remain empty after the required sections pass, they will be recorded as "Not discussed — no constraints assumed" in Phase 4's draft.

Question bank (per section)

Pick the next question from the section's bank based on content and probes. Wording must stay conversational — only the selection is section-driven, not the tone.

Intent (required):

  • Anchor (probes=0, content empty): "Why are we doing this? What is the motivation, the user need, or the strategic context behind the task?"
  • Follow-up (probes≥1, content present but shallow): "What happens if we do nothing? Who is affected?"
  • Sharpen (user mentioned a symptom): "You mentioned {X}. Is {X} the symptom or the underlying cause?"

Goal (required):

  • Anchor: "Describe the end state in 13 sentences — specific enough to disagree with."
  • Follow-up: "How would you recognize this is done when looking at the UI / API / codebase?"

Success Criteria (required):

  • Anchor: "How do we verify it is actually done? List 24 concrete, testable conditions — commands to run, observations, or metrics."
  • Sharpen (criterion is vague): "'{quoted criterion}' is subjective. Which command, observation, or metric would prove this is met?"
  • Quantify (performance/quality claim): "You mentioned it should be {fast/reliable/secure}. What number or threshold counts as success?"

Research Plan (required, strictest):

  • Anchor (no topics yet): "Are there technologies, libraries, or decisions in this task you do not have solid current knowledge of? Examples might be library choice, a protocol, or a security pattern."
  • Per-topic sharpen (topic exists but incomplete): "For topic '{title}': which parts of the plan depend on the answer? What confidence level do you need — high, medium, or low?"
  • Scope question: "Is '{topic}' answerable from the existing codebase, from external docs, or both?"
  • Confirm none (user refuses all topics): "Confirming: no external research needed — you already know everything the plan will depend on?"

Non-Goals (optional):

  • Anchor: "What is explicitly NOT in scope? This prevents scope-guardian from flagging gaps for things we deliberately don't do."

Constraints (optional):

  • Anchor: "Technical, time, or resource constraints the plan must respect? Dependencies, compatibility, deadlines, or budget."
  • Sharpen: "You mentioned {deadline / budget / compatibility}. Is it hard or guidance?"

Preferences (optional):

  • Anchor: "Preferences for libraries, patterns, or architectural style?"

NFRs (optional):

  • Anchor: "Performance, security, accessibility, or scalability targets? Quantified wherever possible."

Prior Attempts (optional):

  • Anchor: "Has this been attempted before? What worked or failed?"

Selection rule

On each loop iteration:

  1. Compute the next section to probe:
    • If any required section is below the initial-signal gate → pick the weakest required section in this priority order: Intent → Goal → Success Criteria → Research Plan.
    • Else if an optional section is clearly missing and likely material to scope (heuristic: the task description hints at constraints or NFRs) → probe it at most once.
    • Else: exit Phase 3.
  2. Within the chosen section, pick the question variant:
    • If probes == 0 and content is empty → Anchor.
    • If content exists but is shallow → Follow-up or Sharpen.
    • If the section is Research Plan and topics exist → iterate per-topic sharpen across incomplete topics.
  3. Ensure the exact question is NOT already in question_history. If it is, pick the next variant or skip to the next weakest section.
  4. Ask via AskUserQuestion. Append question to history. Increment probes.
  5. Record the answer into content. Never overwrite — merge.

Research topic identification

As the user answers Intent, Goal, or Success Criteria, listen for:

  • Unfamiliar technologies — libraries, frameworks, protocols not clearly present in the codebase
  • Version upgrades — migrating to a new major version
  • Security-sensitive decisions — auth, crypto, data handling
  • Architectural choices — pattern X vs Y, library A vs B
  • Unknown integrations — third-party APIs, external services
  • Compliance / legal — GDPR, accessibility, industry regulations

When you hear one, add a candidate topic to research_plan.topics with only a title and why-it-matters. Probe it on the next Research Plan iteration using the per-topic sharpen question to fill in:

  • Research question (must end in ?)
  • Required for plan steps
  • Scope (local / external / both)
  • Confidence needed (high / medium / low)
  • Estimated cost (quick / standard / deep)

If the user says "I know this" to a candidate topic, remove it from the list. Trust the user. If no topics emerge after probing, the user confirms "no external research needed" → research_plan gate passes with 0 topics.

Quick mode adjustments

If mode = quick:

  • For optional sections, cap probes at 1 each. Do not revisit optional sections during Phase 3.
  • Required sections still have no probe cap — quality gate still applies.
  • Prefer Anchor variants over Sharpen on the first pass.

Force-stop path

If the user says "skip", "stop asking", "just proceed", "enough", or similar, break the loop immediately:

  • Mark any required sections still below the initial-signal gate as { incomplete_forced_stop: true } in state.
  • Proceed to Phase 4 with a note that the brief will carry a reduced confidence flag.

Exit condition

Exit Phase 3 when:

  • All four required sections meet the initial-signal gate, OR
  • The user has force-stopped.

Report:

Phase 3 complete: {N} questions asked across {M} sections.
Proceeding to draft and review.

Phase 4 — Draft, review, and revise

Phase 4 runs a draft → brief-reviewer → revise loop. The draft is not written to disk until the brief-review quality gate passes (or the iteration cap is hit). This ensures the brief that reaches /ultraplan-local has already survived a critical review.

Read the brief template first: @${CLAUDE_PLUGIN_ROOT}/templates/ultrabrief-template.md

Loop bound

Maximum 3 review iterations. This bounds cost in the worst case while leaving room for two rounds of targeted follow-ups.

Iteration step-by-step

Step 4a — Draft in memory

Build the brief text from Phase 3 state by filling the template:

  • Frontmatter: populate task, slug, project_dir, research_topics (count of topics), research_status: pending, auto_research: false (will update in Phase 5 if user opts in), interview_turns (total questions asked across Phase 3 + Phase 4), source: interview.
  • Intent: expand the user's motivation into 35 sentences. Load-bearing.
  • Goal: concrete end state.
  • Non-Goals: from state, or "- None explicitly stated" bullet if empty.
  • Constraints / Preferences / NFRs: from state, or "Not discussed — no constraints assumed" note if empty.
  • Success Criteria: falsifiable commands/observations from state.
  • Research Plan: one ### Topic N: {title} section per topic with the full structure from the template. If 0 topics: write the "No external research needed — user confirmed solid knowledge of all plan dependencies" note.
  • Open Questions / Assumptions: from any "I don't know" answers recorded during Phase 3, plus implicit gaps.
  • Prior Attempts: from state, or "None — fresh task."

Step 4b — Write draft to disk

Write the draft to {PROJECT_DIR}/brief.md.draft (not brief.md — the final file is only written after the gate passes).

Step 4c — Launch brief-reviewer

Launch the brief-reviewer agent (foreground, blocking) with the prompt:

"Review this task brief for quality: {PROJECT_DIR}/brief.md.draft. Check completeness, consistency, testability, scope clarity, and research-plan validity. Report findings, verdict, and the required machine-readable JSON block."

Step 4d — Parse JSON scores

Parse the agent's output. Locate the last fenced json block. Extract per-dimension scores:

review = {
  completeness:   { score, gaps },
  consistency:    { score, issues },
  testability:    { score, weak_criteria },
  scope_clarity:  { score, unclear_sections },
  research_plan:  { score, invalid_topics },
  verdict:        "PROCEED | PROCEED_WITH_RISKS | REVISE"
}

JSON fallback: if the JSON block is missing, invalid, or a dimension is missing, treat all dimensions as score: 3 and set the verdict from the prose verdict if present, otherwise PROCEED_WITH_RISKS. Emit an internal note that the reviewer output was degraded. This ensures the loop never deadlocks on a parser error.

Step 4e — Gate evaluation

The gate passes when all of the following are true:

  • completeness.score ≥ 4
  • consistency.score ≥ 4
  • testability.score ≥ 4
  • scope_clarity.score ≥ 4
  • research_plan.score == 5

(Research Plan requires a perfect score because its format is checked mechanically: ends in ?, Required for plan steps filled, scope is one of local | external | both, confidence is high | medium | low. Anything less means at least one topic is malformed and planning will stumble.)

If gate passes:

  1. Move brief.md.draftbrief.md (atomic rename).
  2. Delete the draft file if rename is not possible on the OS; write brief.md fresh.
  3. Break the loop and proceed to Step 4g.

If gate fails AND iteration count < 3:

  1. Identify the weakest dimension (lowest score; tie broken by priority: research_plan > testability > completeness > consistency > scope_clarity).
  2. Generate a targeted follow-up question from the dimension's detail field (gaps / issues / weak_criteria / unclear_sections / invalid_topics). Example generators:
    • completeness.gaps: ["Non-Goals empty, unclear if deliberate"] → "You did not specify anything out-of-scope. Is that deliberate, or are there things we should explicitly exclude?"
    • testability.weak_criteria: ["'system should be fast'"] → "'System should be fast' is not falsifiable. Which metric or threshold proves this is met — e.g., p95 < 200ms, or throughput ≥ X requests/sec?"
    • research_plan.invalid_topics: [{"topic":"JWT","issue":"Required for plan steps empty"}] → "For research topic 'JWT': which plan steps depend on the answer? Give one or two concrete kinds of step (e.g., 'library selection', 'threat model', 'migration strategy')."
  3. Ask via AskUserQuestion. Record the answer into Phase 3 state.
  4. Return to Step 4a with incremented iteration count. The reviewer sees an updated draft, so you MUST re-read the brief and regenerate the review each iteration — do not reuse stale scores.
  5. When launching the reviewer on iteration 2 or 3, include prior questions in the prompt so it does not produce circular follow-ups:

    "Questions already asked during this interview: {list from question_history}. Focus on issues that remain after those answers — do not re-raise gaps that have already been addressed."

If gate fails AND iteration count == 3 (loop exhausted):

  1. Move brief.md.draftbrief.md.
  2. Add brief_quality: partial to the frontmatter (edit the file post-rename — insert the key above the closing ---).
  3. Add a ## Brief Quality section near the end with the failing dimensions and their detail arrays from the final review, formatted:
    ## Brief Quality
    
    Review loop exhausted after 3 iterations. The following dimensions
    did not reach the pass threshold:
    
    - **Research Plan (score 2/5):** Topic 'JWT library' missing
      Required-for-plan-steps field.
    - **Testability (score 3/5):** Success criterion "works correctly"
      is not falsifiable.
    
    Downstream planning will treat these as reduced-confidence areas.
    
  4. Break the loop and proceed to Step 4g.

Step 4f — Force-stop handling

If during any AskUserQuestion in Step 4e the user says "stop", "skip", "enough", "just write it", or similar, do NOT exit the loop immediately. Instead, surface the current review findings in plain text:

Brief-reviewer would flag these issues:
- Research Plan (score 2/5): Topic 'JWT library choice' missing Required-for-plan-steps field.
- Testability (score 3/5): Success criterion "works correctly" is not falsifiable.

Continue anyway? The plan will have lower confidence in these areas.

Then ask via AskUserQuestion:

Option Action
Answer one more follow-up Return to Step 4e with the current weakest-dimension question.
Stop now (accept partial brief) Finalize brief with brief_quality: partial and the ## Brief Quality section (same path as iteration-cap exhaustion). Break loop.

The force-stop path is distinct from a silent iteration cap: the user sees exactly which dimensions are weak and chooses informed.

Step 4g — Finalize

After the loop exits (pass, cap, or force-stop), ensure:

  • brief.md exists at {PROJECT_DIR}/brief.md.
  • brief.md.draft no longer exists.
  • If the loop ended without a clean pass, frontmatter contains brief_quality: partial and a ## Brief Quality section exists.
  • If the loop ended with a clean pass, brief_quality is either absent or set to complete.

Populate the "How to continue" footer with the actual project path and topic questions.

Report:

Brief written: {PROJECT_DIR}/brief.md
Review iterations: {1..3}
Final quality: {complete | partial}
Research topics identified: {N}

Step 4h — Profile recommendation (M1+)

After the brief is finalized on disk, recommend an ultraplan-local profile that fits the brief. The profile drives which exploration/review agents /ultraplan-local will spawn, which catalog filter the architect will use, and which adversarial regime applies. Profiles live in ${CLAUDE_PLUGIN_ROOT}/profiles/; the loader is ${CLAUDE_PLUGIN_ROOT}/scripts/profile-loader.mjs.

The brief stays universal — Step 4h does not change what the brief contains, only which processing profile downstream commands should apply to it.

Step 4h.1 — Discover available profiles

Run:

node ${CLAUDE_PLUGIN_ROOT}/scripts/profile-loader.mjs list

Capture the newline-separated profile names into a variable AVAILABLE_PROFILES. If the loader exits non-zero or returns zero names, skip Step 4h entirely and write recommended_profile: default, profile_match: default-only to the brief frontmatter.

Step 4h.2 — Short-circuit when only default exists

If AVAILABLE_PROFILES == ["default"] (M1 ships only default), do not spawn the recommender or ask the user. Write to the brief frontmatter:

recommended_profile: default
profile_match: default-only
profile_rationale: "Only the default profile is available; recommendation skipped."

Report:

Profile: default (only profile available; recommendation skipped)

Proceed to Phase 5. The rest of Step 4h applies once M2 ships additional profiles.

Step 4h.3 — Build profile manifest for the recommender

For each profile in AVAILABLE_PROFILES, load it:

node ${CLAUDE_PLUGIN_ROOT}/scripts/profile-loader.mjs load <name>

Extract name, description, axes, and triggers from each. Build a JSON array of profile manifests in memory.

Step 4h.4 — Spawn profile-recommender

Launch the profile-recommender agent (foreground, blocking). Prompt:

"Read the finalized brief at {PROJECT_DIR}/brief.md and rank these profiles by fit. Profiles JSON: {paste manifest JSON array}. Output the ranked list in your standard JSON block. Do not invent profile names — only rank what is in the JSON array."

Capture the agent's output. Locate the last fenced json block and parse it. Expected shape:

{
  "ranked": [
    {"name": "<profile>", "score": 0.0, "match_quality": "exact|partial|fallback", "rationale": "..."},
    ...
  ]
}

JSON fallback: if the JSON block is missing, malformed, or empty, treat this as a recommender failure. Write to the brief:

recommended_profile: default
profile_match: fallback
profile_rationale: "profile-recommender output could not be parsed; using default."

Report the failure to the user as plain text and proceed to Phase 5. Do not ask AskUserQuestion in this branch — the silent fallback is the explicit fallback.

Step 4h.5 — Apply recommendation threshold

Use the selectRecommendation() helper logic (see scripts/profile-loader.mjs):

  • If the top-ranked profile has score >= 0.7, that is the recommendation.
  • If the top score is < 0.7, the recommendation is default with match: fallback.

This logic is also exported as a helper for tests; use the same threshold.

Step 4h.6 — Confirm with the user via AskUserQuestion

If the recommended profile is non-default (i.e., a profile scored ≥ 0.7), present:

Question: "Based on the brief, the {recommended} profile fits best.
          Use it for /ultraplan-local?"

Options:
  1. "Use {recommended}" — apply the recommendation. (Recommended)
  2. "Use default" — fall back to the baseline profile.
  3. "Choose another" — pick from the full list of available profiles.

If the user picks option 1: write recommended_profile: {recommended}, profile_match: {match_quality} (from the agent's output), profile_rationale: {rationale} (from the agent's output).

If the user picks option 2: write recommended_profile: default, profile_match: user-override, profile_rationale: "User chose default over the {recommended} recommendation."

If the user picks option 3: present a sub-question listing every profile in AVAILABLE_PROFILES with its description. The user picks one. Write recommended_profile: {chosen}, profile_match: user-override, profile_rationale: "User selected {chosen} manually over the {recommended} recommendation."

Step 4h.7 — Fallback: top score below threshold

If the top score is < 0.7, surface the fallback explicitly:

No profile matched this brief strongly enough for an automatic recommendation.

Available profiles:
  - default: {description}
  - {profile-2}: {description}
  - {profile-N}: {description}

Using fallback: default. You can override with:
  /ultraplan-local --project {PROJECT_DIR} --profile <name>

Then ask via AskUserQuestion whether the user wants to pick a profile now or accept the default fallback:

Question: "No profile matched strongly. Pick one now, or use default?"

Options:
  1. "Use default (fallback)" — write profile_match: fallback. (Recommended)
  2. "Choose a profile manually" — sub-question with full list.

If the user picks option 1: write recommended_profile: default, profile_match: fallback, profile_rationale: "{top-score and reason from agent's output}". If the user picks option 2: same flow as Step 4h.6 option 3, but profile_match: user-override.

Step 4h.8 — Persist to brief frontmatter

Edit {PROJECT_DIR}/brief.md and replace the placeholder values:

recommended_profile: <chosen>
profile_match: <exact|partial|fallback|user-override|default-only>
profile_rationale: "<one sentence>"

Use Edit with exact matches against the template's placeholder values (recommended_profile: default, profile_match: default-only, profile_rationale: "Single profile available; no recommendation made.") to avoid clobbering other frontmatter.

Step 4h.9 — Report

Profile: {chosen} (match: {match_quality}, source: {recommended | user-override | fallback})
Rationale: {rationale}

If the brief did not write the placeholder defaults (older briefs from before M1), insert the three lines below the existing frontmatter — never above ---. Old briefs without the fields stay valid; downstream consumers default to default.

Phase 5 — Auto-orchestration opt-in (if research_topics > 0)

Skip this phase if research_topics = 0. Proceed directly to Phase 6.

Ask the user via AskUserQuestion:

Question: "You have {N} research topic(s). How do you want to proceed?"

Option Description
Manual (default) Print the commands. You run /ultraresearch-local and /ultraplan-local yourself, choosing depth per topic.
Auto (managed by Claude Code) I run all {N} research topics sequentially in foreground, then automatically trigger /ultraplan-local when research completes. This session blocks until the plan is ready.

Manual path (default)

Output:

## Brief complete

Project: {PROJECT_DIR}/
Brief:   {PROJECT_DIR}/brief.md
Research topics: {N}

Next steps (run in order or parallel):

{For each topic:}
  /ultraresearch-local --project {PROJECT_DIR} --external "{topic question}"

Then:
  /ultraplan-local --project {PROJECT_DIR}

Then:
  /ultraexecute-local --project {PROJECT_DIR}

Stop. Do not continue to Phase 6.

Auto path

Set auto_research: true in the brief's frontmatter (edit the file).

Proceed to Phase 6.

Phase 6 — Auto research dispatch (auto path only)

Runs only when user opted into auto mode.

Step 6a — Confirm proceed

Tell the user auto mode will run in foreground and block the session, then confirm via AskUserQuestion:

Question: "Auto mode runs {N} research topic(s) sequentially and then the plan — all in foreground. This session blocks until the plan is ready. Continue?"

Option Action
Continue — auto Proceed.
Cancel — do manual Revert to manual path (print commands, stop).

If cancelled → fall back to manual path output and stop.

Step 6b — Run research topics sequentially (inline)

Set research_status: in_progress in the brief's frontmatter.

For each research topic (index i = 1 .. N), invoke /ultraresearch-local inline in this main-context session:

/ultraresearch-local --project {PROJECT_DIR} {--external | --local | (none)} "{topic i question}"

Pass the scope flag that matches the topic's scope hint. Wait for each invocation to finish writing the research brief at {PROJECT_DIR}/research/{NN}-{topic-slug}.md before moving to the next topic.

Why sequential inline instead of parallel background? Background orchestrator-agents cannot spawn the research swarm — the Claude Code harness does not expose the Agent tool to sub-agents, so a background run silently degrades to single-context reasoning without WebSearch / Tavily / WebFetch / Gemini (see v2.4.0 release notes). Running each research pass inline in main context keeps the swarm intact. For true parallel execution, use claude -p invocations in separate terminal windows.

Step 6c — Verify all briefs landed

After the last topic completes, verify each research brief file exists:

ls -1 {PROJECT_DIR}/research/*.md | wc -l

Expected count: N. If any are missing, report and ask the user how to proceed (retry, skip missing topic, cancel).

Update brief frontmatter: research_status: complete.

Step 6d — Auto-trigger planning (inline foreground)

Invoke the planning command inline in this session:

/ultraplan-local --project {PROJECT_DIR}

The planning pipeline runs all phases (exploration, synthesis, review) in main context. Wait for the plan to be written to {PROJECT_DIR}/plan.md before continuing.

Step 6e — Report completion

When the planning-orchestrator finishes, present:

## Ultrabrief + Ultraresearch + Ultraplan Complete (auto mode)

**Project:** {PROJECT_DIR}/
**Brief:** {PROJECT_DIR}/brief.md
**Research briefs:** {N} in {PROJECT_DIR}/research/
**Plan:** {PROJECT_DIR}/plan.md

### Pipeline summary

| Step | Status |
|------|--------|
| Brief | Complete ({interview_turns} interview turns) |
| Research | Complete ({N} topics, sequential foreground) |
| Plan | Complete ({steps} steps, critic: {verdict}) |

Next:
  /ultraexecute-local --project {PROJECT_DIR}

Or:
  /ultraexecute-local --dry-run --project {PROJECT_DIR}   # preview
  /ultraexecute-local --validate --project {PROJECT_DIR}  # schema check

Phase 7 — Stats tracking

Append one record to ${CLAUDE_PLUGIN_DATA}/ultrabrief-stats.jsonl:

{
  "ts": "{ISO-8601}",
  "task": "{task description (first 100 chars)}",
  "slug": "{slug}",
  "mode": "{default | quick}",
  "interview_turns": {N},
  "review_iterations": {1..3},
  "brief_quality": "{complete | partial}",
  "research_topics": {N},
  "auto_research": {true | false},
  "auto_result": "{completed | cancelled | failed | manual}",
  "project_dir": "{path}"
}

If ${CLAUDE_PLUGIN_DATA} is not set or not writable, skip silently. Never let stats failures block the workflow.

Hard rules

  1. Interactive only. This command requires user input. There is no --fg or background mode — the interview cannot run headless.
  2. Brief is the contract. Every section must have substantive content or an explicit "Not discussed" note. No empty sections.
  3. Intent is load-bearing. Do not accept a one-line intent. Expand with the user until motivation is clear — the plan and every review agent will trace decisions back to this.
  4. Research topics must be answerable. Each topic's research question must be phrased so /ultraresearch-local can answer it. If a topic is too vague, split or reformulate before writing.
  5. Never invent research topics the user did not agree to. Topics come from the interview. If the user says "I know this", respect it.
  6. Project dir is the single source of truth. Every artifact (brief, research briefs, plan, progress) lives in one project directory. Never scatter files across .claude/research/, .claude/plans/, etc.
  7. Auto mode blocks foreground. If the user opts into auto, this session waits for research + planning to complete. Document this in the opt-in question.
  8. Quality gates, not question counts. Phase 3 and Phase 4 are quality-gated loops; do not enforce a hard cap on interview questions. The brief-review gate (Phase 4) caps at 3 review iterations to bound cost, but Phase 3 has no cap — required-section content drives exit.
  9. Never write brief.md while the review gate is still pending. Draft lives in brief.md.draft until the loop terminates. A caller that sees brief.md must be able to trust that Phase 4 finished.
  10. Privacy: never log prompt text, secrets, or credentials.