feat(voyage)!: marketplace handoff — rename plugins/ultraplan-local to plugins/voyage [skip-docs]

Session 5 of voyage-rebrand (V6). Operator-authorized cross-plugin scope. - git mv plugins/ultraplan-local plugins/voyage (rename detected, history preserved) - .claude-plugin/marketplace.json: voyage entry replaces ultraplan-local - CLAUDE.md: voyage row in plugin list, voyage in design-system consumer list - README.md: bulk rename ultra*-local commands -> trek* commands; ultraplan-local refs -> voyage; type discriminators (type: trekbrief/trekreview); session-title pattern (voyage:<command>:<slug>); v4.0.0 release-note paragraph - plugins/voyage/.claude-plugin/plugin.json: homepage/repository URLs point to monorepo voyage path - plugins/voyage/verify.sh: drop URL whitelist exception (no longer needed) Closes voyage-rebrand. bash plugins/voyage/verify.sh PASS 7/7. npm test 361/361.
2026-05-05 15:37:52 +02:00 · 2026-05-05 15:37:52 +02:00 · 7a90d348ad
commit 7a90d348ad
parent 8f1bf9b7b4
149 changed files with 26 additions and 33 deletions
--- a/plugins/voyage/commands/trekbrief.md
+++ b/plugins/voyage/commands/trekbrief.md
@ -0,0 +1,705 @@
+---
+name: trekbrief
+description: Interactive interview that produces a task brief with explicit research plan. Feeds /trekresearch and /trekplan. Optionally orchestrates the full pipeline end-to-end.
+argument-hint: "[--quick] <task description>"
+model: opus
+allowed-tools: Agent, Read, Glob, Grep, Write, Edit, Bash, AskUserQuestion
+---
+
+# Ultrabrief Local v2.1
+
+Interactive requirements-gathering command. Produces a **task brief** — a
+structured markdown file that declares intent, goal, constraints, and an
+**explicit research plan** with copy-paste-ready `/trekresearch` commands.
+
+Pipeline position:
+
+```
+/trekbrief  →  brief.md                         (this command)
+/trekresearch --project <dir>  →  research/*.md
+/trekplan --project <dir>  →  plan.md
+/trekexecute --project <dir>  →  execution
+```
+
+The brief is the contract between the user's intent and `/trekplan`.
+Every decision the plan makes must trace back to content in the brief.
+
+**This command is always interactive.** There is no background mode — the
+interview requires user input. After the brief is written, the command
+optionally orchestrates the rest of the pipeline (research + plan) in
+foreground if the user opts in.
+
+## Phase 1 — Parse mode and validate input
+
+Parse `$ARGUMENTS`:
+
+1. If arguments start with `--quick`: set **mode = quick**. The interview
+   starts more compactly (fewer opening probes per section) but still
+   escalates automatically if quality gates fail. There is no hard cap on
+   question count — quality drives the loop, not a counter. Strip the flag;
+   remainder is the task description.
+
+2. Otherwise: **mode = default**. Interview probes each section until the
+   completeness gate (Phase 3) and brief-review gate (Phase 4) both pass.
+
+3. `--gates` flag (autonomy control, may combine with any mode): when
+   present, set `gates_mode = true`. This re-enables approval pauses at
+   every phase boundary in the downstream pipeline (research, plan,
+   execute) and at every wave in the executor. Default `gates_mode = false`
+   means auto mode runs continuously until the main-merge gate (which is
+   the one boundary that ALWAYS pauses, regardless of `gates_mode`). Strip
+   the flag from `$ARGUMENTS` before further parsing. The flag is consumed
+   by the autonomy-gate state machine via the CLI shim:
+   `node ${CLAUDE_PLUGIN_ROOT}/lib/util/autonomy-gate.mjs --state X --event Y --gates {true|false}`.
+
+If no task description is provided, output usage and stop:
+
+```
+Usage: /trekbrief <task description>
+       /trekbrief --quick <task description>
+
+Modes:
+  default       Dynamic interview until quality gates pass — brief with research plan
+  --quick       Compact start; still escalates on weak sections — brief with research plan
+
+Examples:
+  /trekbrief Add user authentication with JWT tokens
+  /trekbrief --quick Add rate limiting to the API
+  /trekbrief Migrate from Express to Fastify
+```
+
+Report:
+```
+Mode: {default | quick}
+Task: {task description}
+```
+
+## Phase 2 — Generate slug and create project directory
+
+Generate a slug from the task description: first 3-4 meaningful words,
+lowercase, hyphens. Example: "Migrate from Express to Fastify" → `fastify-migration`.
+
+Set today's date as `YYYY-MM-DD` (UTC).
+
+Create the project directory:
+
+```bash
+PROJECT_DIR=".claude/projects/{YYYY-MM-DD}-{slug}"
+mkdir -p "$PROJECT_DIR/research"
+```
+
+Report:
+```
+Project directory: .claude/projects/{YYYY-MM-DD}-{slug}/
+```
+
+If the directory already exists and is non-empty, warn and ask:
+> "Directory {path} exists. Overwrite, reuse (keep existing files), or pick new slug?"
+
+Use `AskUserQuestion` with three options. If "pick new slug", ask for a
+new slug and restart Phase 2.
+
+## Phase 3 — Completeness loop
+
+Phase 3 is a **section-driven completeness loop**. Instead of a numbered
+question list, maintain an internal state of brief sections and keep asking
+until every required section has substantive content. Quality drives the
+loop — there is no hard cap on question count.
+
+Use `AskUserQuestion` for every question. **Ask one question at a time.**
+Never dump multiple questions.
+
+### Internal state
+
+Track this structure in memory as the loop runs:
+
+```
+state = {
+  intent:           { content: "", probes: 0 },           # required
+  goal:             { content: "", probes: 0 },           # required
+  success_criteria: { content: [], probes: 0 },           # required
+  research_plan:    { topics: [], probes: 0 },            # required
+  non_goals:        { content: [], probes: 0 },           # optional
+  constraints:      { content: [], probes: 0 },           # optional
+  preferences:      { content: [], probes: 0 },           # optional
+  nfrs:             { content: [], probes: 0 },           # optional
+  prior_attempts:   { content: "", probes: 0 },           # optional
+  question_history: []                                    # list of questions asked
+}
+```
+
+`content` is raw user answers merged; `probes` is how many times this
+section has been asked; `question_history` prevents re-asking the same
+variant twice.
+
+### Required sections (initial-signal gate)
+
+Four sections MUST have substantive content before exiting Phase 3:
+
+1. **Intent** — full sentence or paragraph (not a single word or phrase)
+2. **Goal** — full sentence or paragraph
+3. **Success Criteria** — at least one concrete, testable item
+4. **Research Plan** — either ≥ 1 topic probed, OR the user has explicitly
+   confirmed "no external research needed"
+
+"Substantive" means: non-empty, not a trivial one-word reply, not
+"I don't know" without a recorded assumption. The strict falsifiability
+check happens in Phase 4 (brief-review gate); Phase 3 is just the
+initial-signal bar.
+
+Optional sections (Non-Goals, Constraints, Preferences, NFRs, Prior
+Attempts) do not gate exit. If they remain empty after the required
+sections pass, they will be recorded as "Not discussed — no constraints
+assumed" in Phase 4's draft.
+
+### Question bank (per section)
+
+Pick the next question from the section's bank based on `content` and
+`probes`. Wording must stay conversational — only the *selection* is
+section-driven, not the tone.
+
+**Intent** (required):
+- _Anchor_ (probes=0, content empty): "Why are we doing this? What is the
+  motivation, the user need, or the strategic context behind the task?"
+- _Follow-up_ (probes≥1, content present but shallow): "What happens if
+  we do nothing? Who is affected?"
+- _Sharpen_ (user mentioned a symptom): "You mentioned {X}. Is {X} the
+  symptom or the underlying cause?"
+
+**Goal** (required):
+- _Anchor_: "Describe the end state in 1–3 sentences — specific enough to
+  disagree with."
+- _Follow-up_: "How would you recognize this is done when looking at
+  the UI / API / codebase?"
+
+**Success Criteria** (required):
+- _Anchor_: "How do we verify it is actually done? List 2–4 concrete,
+  testable conditions — commands to run, observations, or metrics."
+- _Sharpen_ (criterion is vague): "'{quoted criterion}' is subjective.
+  Which command, observation, or metric would prove this is met?"
+- _Quantify_ (performance/quality claim): "You mentioned it should be
+  {fast/reliable/secure}. What number or threshold counts as success?"
+
+**Research Plan** (required, strictest):
+- _Anchor_ (no topics yet): "Are there technologies, libraries, or
+  decisions in this task you do not have solid current knowledge of?
+  Examples might be library choice, a protocol, or a security pattern."
+- _Per-topic sharpen_ (topic exists but incomplete): "For topic
+  '{title}': which parts of the plan depend on the answer? What
+  confidence level do you need — high, medium, or low?"
+- _Scope question_: "Is '{topic}' answerable from the existing codebase,
+  from external docs, or both?"
+- _Confirm none_ (user refuses all topics): "Confirming: no external
+  research needed — you already know everything the plan will depend on?"
+
+**Non-Goals** (optional):
+- _Anchor_: "What is explicitly NOT in scope? This prevents scope-guardian
+  from flagging gaps for things we deliberately don't do."
+
+**Constraints** (optional):
+- _Anchor_: "Technical, time, or resource constraints the plan must
+  respect? Dependencies, compatibility, deadlines, or budget."
+- _Sharpen_: "You mentioned {deadline / budget / compatibility}. Is it
+  hard or guidance?"
+
+**Preferences** (optional):
+- _Anchor_: "Preferences for libraries, patterns, or architectural style?"
+
+**NFRs** (optional):
+- _Anchor_: "Performance, security, accessibility, or scalability targets?
+  Quantified wherever possible."
+
+**Prior Attempts** (optional):
+- _Anchor_: "Has this been attempted before? What worked or failed?"
+
+### Selection rule
+
+On each loop iteration:
+
+1. Compute the next section to probe:
+   - If any required section is below the initial-signal gate → pick the
+     weakest required section in this priority order:
+     Intent → Goal → Success Criteria → Research Plan.
+   - Else if an optional section is clearly missing and likely material
+     to scope (heuristic: the task description hints at constraints or
+     NFRs) → probe it at most once.
+   - Else: exit Phase 3.
+2. Within the chosen section, pick the question variant:
+   - If `probes == 0` and content is empty → _Anchor_.
+   - If content exists but is shallow → _Follow-up_ or _Sharpen_.
+   - If the section is Research Plan and topics exist → iterate per-topic
+     sharpen across incomplete topics.
+3. Ensure the exact question is NOT already in `question_history`. If it
+   is, pick the next variant or skip to the next weakest section.
+4. Ask via `AskUserQuestion`. Append question to history. Increment probes.
+5. Record the answer into `content`. Never overwrite — merge.
+
+### Research topic identification
+
+As the user answers Intent, Goal, or Success Criteria, listen for:
+
+- **Unfamiliar technologies** — libraries, frameworks, protocols not
+  clearly present in the codebase
+- **Version upgrades** — migrating to a new major version
+- **Security-sensitive decisions** — auth, crypto, data handling
+- **Architectural choices** — pattern X vs Y, library A vs B
+- **Unknown integrations** — third-party APIs, external services
+- **Compliance / legal** — GDPR, accessibility, industry regulations
+
+When you hear one, add a *candidate* topic to `research_plan.topics` with
+only a title and why-it-matters. Probe it on the next Research Plan
+iteration using the per-topic sharpen question to fill in:
+- Research question (must end in `?`)
+- Required for plan steps
+- Scope (local / external / both)
+- Confidence needed (high / medium / low)
+- Estimated cost (quick / standard / deep)
+
+If the user says "I know this" to a candidate topic, remove it from the
+list. Trust the user. If no topics emerge after probing, the user confirms
+"no external research needed" → `research_plan` gate passes with 0 topics.
+
+### Quick mode adjustments
+
+If **mode = quick**:
+- For optional sections, cap probes at 1 each. Do not revisit optional
+  sections during Phase 3.
+- Required sections still have no probe cap — quality gate still applies.
+- Prefer _Anchor_ variants over _Sharpen_ on the first pass.
+
+### Force-stop path
+
+If the user says "skip", "stop asking", "just proceed", "enough", or
+similar, break the loop immediately:
+- Mark any required sections still below the initial-signal gate as
+  `{ incomplete_forced_stop: true }` in state.
+- Proceed to Phase 4 with a note that the brief will carry a reduced
+  confidence flag.
+
+### Exit condition
+
+Exit Phase 3 when:
+- All four required sections meet the initial-signal gate, OR
+- The user has force-stopped.
+
+Report:
+```
+Phase 3 complete: {N} questions asked across {M} sections.
+Proceeding to draft and review.
+```
+
+## Phase 4 — Draft, review, and revise
+
+Phase 4 runs a **draft → brief-reviewer → revise** loop. The draft is
+not written to disk until the brief-review quality gate passes (or the
+iteration cap is hit). This ensures the brief that reaches `/trekplan`
+has already survived a critical review.
+
+Read the brief template first:
+`@${CLAUDE_PLUGIN_ROOT}/templates/trekbrief-template.md`
+
+### Loop bound
+
+**Maximum 3 review iterations.** This bounds cost in the worst case while
+leaving room for two rounds of targeted follow-ups.
+
+### Iteration step-by-step
+
+**Step 4a — Draft in memory**
+
+Build the brief text from Phase 3 state by filling the template:
+
+- **Frontmatter:** populate `task`, `slug`, `project_dir`, `research_topics`
+  (count of topics), `research_status: pending`, `auto_research: false`
+  (will update in Phase 5 if user opts in), `interview_turns` (total
+  questions asked across Phase 3 + Phase 4), `source: interview`.
+- **Intent:** expand the user's motivation into 3–5 sentences. Load-bearing.
+- **Goal:** concrete end state.
+- **Non-Goals:** from state, or "- None explicitly stated" bullet if empty.
+- **Constraints / Preferences / NFRs:** from state, or "Not discussed — no
+  constraints assumed" note if empty.
+- **Success Criteria:** falsifiable commands/observations from state.
+- **Research Plan:** one `### Topic N: {title}` section per topic with the
+  full structure from the template. If 0 topics: write the "No external
+  research needed — user confirmed solid knowledge of all plan
+  dependencies" note.
+- **Open Questions / Assumptions:** from any `"I don't know"` answers
+  recorded during Phase 3, plus implicit gaps.
+- **Prior Attempts:** from state, or "None — fresh task."
+
+**Step 4b — Write draft to disk**
+
+Write the draft to `{PROJECT_DIR}/brief.md.draft` (not `brief.md` — the
+final file is only written after the gate passes).
+
+**Step 4c — Launch brief-reviewer**
+
+Launch the `brief-reviewer` agent (foreground, blocking) with the prompt:
+
+> "Review this task brief for quality: `{PROJECT_DIR}/brief.md.draft`.
+> Check completeness, consistency, testability, scope clarity, and
+> research-plan validity. Report findings, verdict, and the required
+> machine-readable JSON block."
+
+**Step 4d — Parse JSON scores**
+
+Parse the agent's output. Locate the **last** fenced ```json``` block.
+Extract per-dimension scores:
+
+```
+review = {
+  completeness:   { score, gaps },
+  consistency:    { score, issues },
+  testability:    { score, weak_criteria },
+  scope_clarity:  { score, unclear_sections },
+  research_plan:  { score, invalid_topics },
+  verdict:        "PROCEED | PROCEED_WITH_RISKS | REVISE"
+}
+```
+
+**JSON fallback:** if the JSON block is missing, invalid, or a dimension
+is missing, treat all dimensions as `score: 3` and set the `verdict` from
+the prose verdict if present, otherwise `PROCEED_WITH_RISKS`. Emit an
+internal note that the reviewer output was degraded. This ensures the
+loop never deadlocks on a parser error.
+
+**Step 4e — Gate evaluation**
+
+The gate **passes** when all of the following are true:
+
+- `completeness.score ≥ 4`
+- `consistency.score ≥ 4`
+- `testability.score ≥ 4`
+- `scope_clarity.score ≥ 4`
+- `research_plan.score == 5`
+
+(Research Plan requires a perfect score because its format is checked
+mechanically: ends in `?`, `Required for plan steps` filled, scope is
+one of `local | external | both`, confidence is `high | medium | low`.
+Anything less means at least one topic is malformed and planning will
+stumble.)
+
+**If gate passes:**
+1. Move `brief.md.draft` → `brief.md` (atomic rename).
+2. Delete the draft file if rename is not possible on the OS; write
+   `brief.md` fresh.
+3. Break the loop and proceed to Step 4g.
+
+**If gate fails AND iteration count < 3:**
+1. Identify the weakest dimension (lowest score; tie broken by priority:
+   research_plan > testability > completeness > consistency > scope_clarity).
+2. Generate a targeted follow-up question from the dimension's detail
+   field (gaps / issues / weak_criteria / unclear_sections / invalid_topics).
+   Example generators:
+   - `completeness.gaps: ["Non-Goals empty, unclear if deliberate"]`
+     → "You did not specify anything out-of-scope. Is that deliberate, or
+     are there things we should explicitly exclude?"
+   - `testability.weak_criteria: ["'system should be fast'"]`
+     → "'System should be fast' is not falsifiable. Which metric or
+     threshold proves this is met — e.g., p95 < 200ms, or throughput
+     ≥ X requests/sec?"
+   - `research_plan.invalid_topics: [{"topic":"JWT","issue":"Required for plan steps empty"}]`
+     → "For research topic 'JWT': which plan steps depend on the answer?
+     Give one or two concrete kinds of step (e.g., 'library selection',
+     'threat model', 'migration strategy')."
+3. Ask via `AskUserQuestion`. Record the answer into Phase 3 state.
+4. Return to Step 4a with incremented iteration count. The reviewer sees
+   an updated draft, so you MUST re-read the brief and regenerate the
+   review each iteration — do not reuse stale scores.
+5. When launching the reviewer on iteration 2 or 3, include prior
+   questions in the prompt so it does not produce circular follow-ups:
+   > "Questions already asked during this interview: {list from
+   > question_history}. Focus on issues that remain after those answers —
+   > do not re-raise gaps that have already been addressed."
+
+**If gate fails AND iteration count == 3 (loop exhausted):**
+1. Move `brief.md.draft` → `brief.md`.
+2. Add `brief_quality: partial` to the frontmatter (edit the file
+   post-rename — insert the key above the closing `---`).
+3. Add a `## Brief Quality` section near the end with the failing
+   dimensions and their `detail` arrays from the final review, formatted:
+   ```
+   ## Brief Quality
+
+   Review loop exhausted after 3 iterations. The following dimensions
+   did not reach the pass threshold:
+
+   - **Research Plan (score 2/5):** Topic 'JWT library' missing
+     Required-for-plan-steps field.
+   - **Testability (score 3/5):** Success criterion "works correctly"
+     is not falsifiable.
+
+   Downstream planning will treat these as reduced-confidence areas.
+   ```
+4. Break the loop and proceed to Step 4g.
+
+### Step 4f — Force-stop handling
+
+If during any `AskUserQuestion` in Step 4e the user says "stop", "skip",
+"enough", "just write it", or similar, do NOT exit the loop immediately.
+Instead, surface the current review findings in plain text:
+
+```
+Brief-reviewer would flag these issues:
+- Research Plan (score 2/5): Topic 'JWT library choice' missing Required-for-plan-steps field.
+- Testability (score 3/5): Success criterion "works correctly" is not falsifiable.
+
+Continue anyway? The plan will have lower confidence in these areas.
+```
+
+Then ask via `AskUserQuestion`:
+
+| Option | Action |
+|--------|--------|
+| **Answer one more follow-up** | Return to Step 4e with the current weakest-dimension question. |
+| **Stop now (accept partial brief)** | Finalize brief with `brief_quality: partial` and the `## Brief Quality` section (same path as iteration-cap exhaustion). Break loop. |
+
+The force-stop path is distinct from a silent iteration cap: the user
+sees exactly which dimensions are weak and chooses informed.
+
+### Step 4g — Finalize
+
+After the loop exits (pass, cap, or force-stop), ensure:
+- `brief.md` exists at `{PROJECT_DIR}/brief.md`.
+- `brief.md.draft` no longer exists.
+- If the loop ended without a clean pass, frontmatter contains
+  `brief_quality: partial` and a `## Brief Quality` section exists.
+- If the loop ended with a clean pass, `brief_quality` is either
+  absent or set to `complete`.
+
+Populate the "How to continue" footer with the actual project path and
+topic questions.
+
+**Schema sanity check (since v3.1.0):** before reporting, run the brief
+validator. This catches frontmatter typos and state-machine inconsistencies
+the brief-reviewer rubric does not check (e.g. `research_status: skipped`
+with `research_topics: 3` and no `brief_quality: partial`).
+
+```bash
+node ${CLAUDE_PLUGIN_ROOT}/lib/validators/brief-validator.mjs --json "{PROJECT_DIR}/brief.md"
+```
+
+If the validator returns errors, report them to the user and offer to
+re-enter Phase 4 with the validator's hints in scope. If only warnings,
+note them in the final report.
+
+Report:
+```
+Brief written: {PROJECT_DIR}/brief.md
+Review iterations: {1..3}
+Final quality: {complete | partial}
+Validator: {PASS | warnings(N)}
+Research topics identified: {N}
+```
+
+## Phase 5 — Auto-orchestration opt-in (if research_topics > 0)
+
+**Skip this phase if research_topics = 0.** Proceed directly to Phase 6.
+
+Ask the user via `AskUserQuestion`:
+
+**Question:** "You have {N} research topic(s). How do you want to proceed?"
+
+| Option | Description |
+|--------|-------------|
+| **Manual (default)** | Print the commands. You run `/trekresearch` and `/trekplan` yourself, choosing depth per topic. |
+| **Auto (managed by Claude Code)** | I run all {N} research topics sequentially in foreground, then automatically trigger `/trekplan` when research completes. This session blocks until the plan is ready. |
+
+### Manual path (default)
+
+Output:
+
+```
+## Brief complete
+
+Project: {PROJECT_DIR}/
+Brief:   {PROJECT_DIR}/brief.md
+Research topics: {N}
+
+Next steps (run in order or parallel):
+
+{For each topic:}
+  /trekresearch --project {PROJECT_DIR} --external "{topic question}"
+
+Then:
+  /trekplan --project {PROJECT_DIR}
+
+Then:
+  /trekexecute --project {PROJECT_DIR}
+```
+
+Stop. Do not continue to Phase 6.
+
+### Auto path
+
+Set `auto_research: true` in the brief's frontmatter (edit the file).
+
+Emit the brief-approved lifecycle event so downstream observability sees
+the pipeline kick off (consumed by `lib/stats/event-emit.mjs`):
+
+```bash
+node ${CLAUDE_PLUGIN_ROOT}/lib/stats/event-emit.mjs \
+  --event brief-approved \
+  --payload "{\"project\":\"${PROJECT_DIR}\"}"
+```
+
+If `gates_mode == true`: pause here via `AskUserQuestion` —
+"Auto-mode confirmed. Proceed to research now? (yes/no)". If the user
+answers no, fall back to the manual path output and stop. Otherwise
+proceed to Phase 6.
+
+If `gates_mode == false` (default in auto): proceed directly to Phase 6.
+The chain stops only at the main-merge gate (see `commands/trekexecute.md`
+Phase 8).
+
+Proceed to Phase 6.
+
+## Phase 6 — Auto research dispatch (auto path only)
+
+**Runs only when user opted into auto mode.**
+
+### Step 6a — Confirm proceed
+
+Tell the user auto mode will run in foreground and block the session, then
+confirm via `AskUserQuestion`:
+
+**Question:** "Auto mode runs {N} research topic(s) sequentially and then
+the plan — all in foreground. This session blocks until the plan is ready.
+Continue?"
+
+| Option | Action |
+|--------|--------|
+| **Continue — auto** | Proceed. |
+| **Cancel — do manual** | Revert to manual path (print commands, stop). |
+
+If cancelled → fall back to manual path output and stop.
+
+### Step 6b — Run research topics sequentially (inline)
+
+Set `research_status: in_progress` in the brief's frontmatter.
+
+For each research topic (index i = 1 .. N), invoke `/trekresearch`
+inline in this main-context session:
+
+```
+/trekresearch --project {PROJECT_DIR} {--external | --local | (none)} "{topic i question}"
+```
+
+Pass the scope flag that matches the topic's scope hint. Wait for each
+invocation to finish writing the research brief at
+`{PROJECT_DIR}/research/{NN}-{topic-slug}.md` before moving to the next
+topic.
+
+> **Why sequential inline instead of parallel background?** Background
+> orchestrator-agents cannot spawn the research swarm — the Claude Code
+> harness does not expose the Agent tool to sub-agents, so a background
+> run silently degrades to single-context reasoning without WebSearch /
+> Tavily / WebFetch / Gemini (see v2.4.0 release notes). Running each
+> research pass inline in main context keeps the swarm intact. For true
+> parallel execution, use `claude -p` invocations in separate terminal
+> windows.
+
+### Step 6c — Verify all briefs landed
+
+After the last topic completes, verify each research brief file exists:
+
+```bash
+ls -1 {PROJECT_DIR}/research/*.md | wc -l
+```
+
+Expected count: N. If any are missing, report and ask the user how to
+proceed (retry, skip missing topic, cancel).
+
+Update brief frontmatter: `research_status: complete`.
+
+### Step 6d — Auto-trigger planning (inline foreground)
+
+Invoke the planning command inline in this session:
+
+```
+/trekplan --project {PROJECT_DIR}
+```
+
+The planning pipeline runs all phases (exploration, synthesis, review) in
+main context. Wait for the plan to be written to `{PROJECT_DIR}/plan.md`
+before continuing.
+
+### Step 6e — Report completion
+
+When the planning-orchestrator finishes, present:
+
+```
+## Ultrabrief + Ultraresearch + Voyage Complete (auto mode)
+
+**Project:** {PROJECT_DIR}/
+**Brief:** {PROJECT_DIR}/brief.md
+**Research briefs:** {N} in {PROJECT_DIR}/research/
+**Plan:** {PROJECT_DIR}/plan.md
+
+### Pipeline summary
+
+| Step | Status |
+|------|--------|
+| Brief | Complete ({interview_turns} interview turns) |
+| Research | Complete ({N} topics, sequential foreground) |
+| Plan | Complete ({steps} steps, critic: {verdict}) |
+
+Next:
+  /trekexecute --project {PROJECT_DIR}
+
+Or:
+  /trekexecute --dry-run --project {PROJECT_DIR}   # preview
+  /trekexecute --validate --project {PROJECT_DIR}  # schema check
+```
+
+## Phase 7 — Stats tracking
+
+Append one record to `${CLAUDE_PLUGIN_DATA}/trekbrief-stats.jsonl`:
+
+```json
+{
+  "ts": "{ISO-8601}",
+  "task": "{task description (first 100 chars)}",
+  "slug": "{slug}",
+  "mode": "{default | quick}",
+  "interview_turns": {N},
+  "review_iterations": {1..3},
+  "brief_quality": "{complete | partial}",
+  "research_topics": {N},
+  "auto_research": {true | false},
+  "auto_result": "{completed | cancelled | failed | manual}",
+  "project_dir": "{path}"
+}
+```
+
+If `${CLAUDE_PLUGIN_DATA}` is not set or not writable, skip silently.
+Never let stats failures block the workflow.
+
+## Hard rules
+
+1. **Interactive only.** This command requires user input. There is no
+   `--fg` or background mode — the interview cannot run headless.
+2. **Brief is the contract.** Every section must have substantive content
+   or an explicit "Not discussed" note. No empty sections.
+3. **Intent is load-bearing.** Do not accept a one-line intent. Expand with
+   the user until motivation is clear — the plan and every review agent
+   will trace decisions back to this.
+4. **Research topics must be answerable.** Each topic's research question
+   must be phrased so `/trekresearch` can answer it. If a topic is
+   too vague, split or reformulate before writing.
+5. **Never invent research topics the user did not agree to.** Topics
+   come from the interview. If the user says "I know this", respect it.
+6. **Project dir is the single source of truth.** Every artifact (brief,
+   research briefs, plan, progress) lives in one project directory.
+   Never scatter files across `.claude/research/`, `.claude/plans/`, etc.
+7. **Auto mode blocks foreground.** If the user opts into auto, this
+   session waits for research + planning to complete. Document this in
+   the opt-in question.
+8. **Quality gates, not question counts.** Phase 3 and Phase 4 are
+   quality-gated loops; do not enforce a hard cap on interview questions.
+   The brief-review gate (Phase 4) caps at 3 review iterations to bound
+   cost, but Phase 3 has no cap — required-section content drives exit.
+9. **Never write `brief.md` while the review gate is still pending.**
+   Draft lives in `brief.md.draft` until the loop terminates. A caller
+   that sees `brief.md` must be able to trust that Phase 4 finished.
+10. **Privacy:** never log prompt text, secrets, or credentials.
--- a/plugins/voyage/commands/trekcontinue.md
+++ b/plugins/voyage/commands/trekcontinue.md
@ -0,0 +1,307 @@
+---
+name: trekcontinue
+description: Resume the next session in a multi-session trekplan project. Reads .session-state.local.json and immediately begins the next session.
+argument-hint: "[<project-dir> | --help]"
+model: opus
+---
+
+# Ultracontinue Local v1.0
+
+Zero-friction multi-session resumption. In a fresh Claude Code session, type
+`/trekcontinue` — the command reads the per-project state file
+(`.claude/projects/<project>/.session-state.local.json`), shows a 3-line summary,
+and immediately begins executing the next session.
+
+The state file is the contract. Any session-end mechanism may write it
+(`/trekexecute` Phase 8 / Phase 2.55 / Phase 4, the
+`/trekendsession` helper, or — in the future — `graceful-handoff`).
+This command only reads.
+
+Pipeline position:
+
+```
+/trekplan           →  plan.md
+/trekexecute        →  progress.json + .session-state.local.json
+... session boundary, fresh chat ...
+/trekcontinue             →  reads .session-state.local.json, starts next session
+```
+
+See **Handover 7** in `docs/HANDOVER-CONTRACTS.md` for the full schema.
+
+## Phase 0 — `--help` handling
+
+Parse `$ARGUMENTS` with `parseArgs($ARGUMENTS, 'trekcontinue')` from
+`lib/parsers/arg-parser.mjs`. Dispatch the usage block ONLY when one of these
+two conditions equals exactly true (no substring search, no "contains" check):
+
+- `flags['--help'] === true`, OR
+- `positional[0] === '-h'` (single-dash short form — the parser keeps it as
+  positional because the schema does not declare an alias).
+
+In every other case — including when `$ARGUMENTS` is empty, whitespace-only,
+the literal empty string `""`, or a positional project-dir — fall through to
+Phase 1. Do NOT print the usage block on empty args.
+
+```
+/trekcontinue — Resume the next session in a multi-session trekplan project.
+
+Usage:
+  /trekcontinue                                       # auto-discover state file under cwd
+  /trekcontinue <project-dir>                         # explicit project directory
+  /trekcontinue --cleanup <project-dir>               # dry-run: list stale files
+  /trekcontinue --cleanup --confirm <project-dir>     # actually delete (requires status: completed)
+  /trekcontinue --help                                # this message
+
+Reads .claude/projects/<project>/.session-state.local.json (per-project,
+gitignored). On a valid resumable state, prints a 3-line summary and begins
+executing the next session immediately. No interactive confirmation prompt.
+
+State-file schema (v1):
+  schema_version           1
+  project                  string
+  next_session_brief_path  string (validator soft-checks file existence)
+  next_session_label       string
+  status                   in_progress | partial | failed | stopped | completed
+                           (completed → no further sessions to resume)
+  updated_at               ISO-8601 timestamp
+  (unknown top-level keys are tolerated — forward-compat for graceful-handoff v2.2)
+
+Typical flow:
+  /trekbrief                # produces brief.md
+  /trekplan --project ...   # produces plan.md
+  /trekexecute --project .. # writes session-state on session-end
+  ... (fresh Claude chat) ...
+  /trekcontinue                   # reads session-state, runs next session
+```
+
+## Phase 0.5 — Cleanup mode dispatch
+
+After `parseArgs` has resolved `$ARGUMENTS`, check the parsed `flags`
+object directly (NOT a string contains-check on raw `$ARGUMENTS` — that
+substring pattern was the root cause of Bug 1).
+
+If `flags['--cleanup'] === true`, switch into the terminal cleanup
+flow and do NOT proceed to Phase 1 or any later phase.
+
+**Required positional:** an explicit `<project-dir>` (`positional[0]`).
+There is no "clean all" mode — accidental wholesale deletion would be
+irreversible. If `positional[0]` is missing, empty, or starts with `-`,
+print this usage block to stderr and exit non-zero:
+
+```
+Error: /trekcontinue --cleanup requires <project-dir>.
+Usage:
+  /trekcontinue --cleanup <project-dir>            # dry-run: list stale files
+  /trekcontinue --cleanup --confirm <project-dir>  # actually delete (status: completed)
+```
+
+**Compute mode from parsed flags:**
+
+```
+dryRun  = (flags['--confirm'] !== true)
+confirm = (flags['--confirm'] === true)
+```
+
+**Invoke cleanup inline.** Emit the concrete project-dir path as a literal
+token in the Bash command — never a template placeholder — same
+anti-substitution rule as Phase 2:
+
+```
+node --input-type=module -e "import {cleanupProject} from './lib/util/cleanup.mjs'; const [, dir, mode] = process.argv; const r = cleanupProject(dir, {dryRun: mode !== 'confirm', confirm: mode === 'confirm'}); console.log(JSON.stringify(r, null, 2)); process.exit(r.valid ? 0 : 1)" '<RESOLVED-PROJECT-DIR>' '<MODE>'
+```
+
+Substitute `<RESOLVED-PROJECT-DIR>` with the literal `positional[0]`
+value you have in your working context, and `<MODE>` with either the
+literal string `dryrun` or the literal string `confirm` based on the
+booleans above. The validator emits a `{valid, errors, warnings, parsed}`
+JSON record. Print it to stdout. Exit with the validator's exit code.
+
+**Cleanup is a terminal mode.** It must not fall through to Phase 1/2/3/4.
+Operators who want to resume after cleanup must invoke `/trekcontinue`
+again without `--cleanup`.
+
+## Phase 1 — Resolve project directory
+
+The parsed `positional[0]` from Phase 0 is the explicit project-dir argument,
+when present. Otherwise (empty `$ARGUMENTS` or whitespace-only) auto-discover.
+
+### Step 1.a — Reject `.md` positional argument (SC-2)
+
+If `positional[0]` is non-empty AND ends in `.md`, the user almost certainly
+pasted a `NEXT-SESSION-PROMPT.local.md` path instead of a project directory.
+Print the following diagnostic to stderr and exit non-zero. Do NOT proceed.
+
+```
+Error: expected <project-dir>, got a markdown file path: <positional[0]>
+Did you mean to paste the file path as a project directory?
+Usage: /trekcontinue <project-dir>
+```
+
+### Step 1.b — Auto-discover candidates
+
+When no explicit project-dir was given, enumerate
+`.claude/projects/*/.session-state.local.json` paths with `node -e`
+(NOT shell glob — harness-mode safety) and emit each as one JSON line of
+`{"path": ..., "updated_at": ...}` so Phase 1 can sort numerically:
+
+```bash
+!`node -e "const fs=require('fs'),path=require('path');const root='.claude/projects';if(!fs.existsSync(root))process.exit(0);for(const d of fs.readdirSync(root)){const p=path.join(root,d,'.session-state.local.json');if(!fs.existsSync(p))continue;let u='';try{u=(JSON.parse(fs.readFileSync(p,'utf8'))||{}).updated_at||''}catch(_){};process.stdout.write(JSON.stringify({path:p,updated_at:u})+'\\n');}"`
+```
+
+Sort the emitted candidates by `Date.parse(updated_at)` descending (newest
+first) — numeric comparison, NOT lexicographic string compare. The newest
+resumable state wins.
+
+### Step 1.c — Decision tree
+
+- **0 candidates and no explicit arg:** print SC-2 cold-start message and exit:
+  ```
+  No active multi-session project here.
+  Start with /trekbrief or /trekplan.
+  ```
+- **1 candidate (or explicit non-`.md` arg):** continue to Phase 2 with that path.
+- **>1 candidates and no explicit arg:** with the Date.parse sort applied, the
+  newest resumable state wins automatically and the command continues to Phase 2
+  with that path. (Operators who want a different candidate re-invoke as
+  `/trekcontinue <project-dir>`.)
+
+## Phase 1.5 — Frontmatter consistency check
+
+Bug 3 contract: producers (`/trekexecute`, `/trekendsession`)
+write `NEXT-SESSION-PROMPT.local.md` with YAML frontmatter (`produced_by:`,
+`produced_at:`). Multiple producers may have written candidates in different
+locations; this phase refuses ambiguity before validating the state file.
+
+After resolving the project directory and state-file path, look for two
+`NEXT-SESSION-PROMPT.local.md` candidates:
+
+  a. `<plugin-root>/NEXT-SESSION-PROMPT.local.md` — operator-managed master file
+  b. `<project-dir>/NEXT-SESSION-PROMPT.local.md` — producer-written sibling
+
+**If both exist:**
+
+- Read both via the **Read tool** (NOT Bash — same anti-substitution rule
+  as Phase 2).
+- Invoke the consistency validator with both paths emitted as concrete
+  literal tokens (no template substitution at the Bash boundary):
+
+  ```
+  node lib/validators/next-session-prompt-validator.mjs --json --consistency <RESOLVED-PATH-A> <RESOLVED-PATH-B>
+  ```
+
+  Replace `<RESOLVED-PATH-A>` and `<RESOLVED-PATH-B>` with the two concrete
+  filesystem paths you have in your working context. The validator emits
+  `{valid, errors, warnings}` JSON on stdout.
+
+- **If `valid: false`** (typically `NEXT_SESSION_PROMPT_PRODUCER_MISMATCH`):
+  print the structured `errors[]` (each `[code] message` on its own line),
+  list both candidate paths, and exit non-zero. Do NOT proceed to Phase 2.
+  Resolve the conflict by deleting the stale candidate (run
+  `/trekcontinue --cleanup --confirm <project-dir>` after the
+  current session closes, or remove by hand).
+
+- **If `valid: true` with a `NEXT_SESSION_PROMPT_WALL_CLOCK_DRIFT` warning**
+  (one of the candidates is more than 24h old): print the warning to stderr
+  but continue — long pauses (weekend, vacation) are not failures.
+
+- **If `valid: true` with a `NEXT_SESSION_PROMPT_STALE_IGNORED` warning**
+  (one candidate is older than the state file's `updated_at`): print the
+  warning and continue. The state-anchored check is the primary refusal
+  signal; staleness simply rejects the older candidate.
+
+**If only one exists:** continue to Phase 2. No comparison needed.
+
+**If neither exists:** continue to Phase 2. Legacy projects and first-run
+flows have no NEXT-SESSION-PROMPT files.
+
+## Phase 2 — Validate the state file
+
+Phase 1 resolved a concrete state-file path. That path is a real string in
+your working context — never a template. Phase 2 must read and validate the
+state file without any placeholder substitution.
+
+### Step 2.a — Read the file with the Read tool (no Bash)
+
+Use the **Read tool** on the resolved state-file path from Phase 1. Do NOT use
+Bash for the read. The Read tool is deterministic and not subject to
+shell-substitution errors. Parse the returned JSON body programmatically.
+
+### Step 2.b — Schema-validate the parsed object
+
+Verify the schema by invoking the existing validator CLI shim. Emit the
+resolved absolute path as a literal string token in the Bash command — the
+exact same string you just passed to the Read tool. The validator accepts
+`--json <path>` and prints a `{valid, errors, warnings}` JSON record:
+
+```
+node lib/validators/session-state-validator.mjs --json <RESOLVED-ABSOLUTE-PATH-FROM-PHASE-1>
+```
+
+Replace `<RESOLVED-ABSOLUTE-PATH-FROM-PHASE-1>` with the actual path string at
+the time you issue the Bash call. There is no template engine; the string is
+substituted by you, the model, before the Bash tool sees the command.
+
+**Anti-substitution invariant.** If you ever find yourself about to emit a
+literal angle-bracket placeholder, or any other unresolved variable name, to
+the Bash tool — STOP. The resolved path is a concrete value you already have
+from Phase 1; emit the value, not a placeholder for it.
+
+### Step 2.c — Interpret the result
+
+- **Validator exit code != 0 OR `valid: false` in JSON output:** print the
+  structured `errors[]` (each `[code] message` on its own line) and exit. Do
+  not proceed to narration. Suggest running the validator directly for
+  follow-up: `node lib/validators/session-state-validator.mjs <path>`.
+- **`valid: true` AND any warning has `code: SESSION_STATE_NOT_RESUMABLE`**
+  (i.e. `status: completed`): print "no further sessions to resume; project
+  complete" and exit cleanly.
+- **`valid: true` AND status is one of `in_progress | partial | failed | stopped`:**
+  proceed to Phase 3.
+
+## Phase 3 — Narrate 3-line summary
+
+Print this exact template (using values from the validated `parsed` object):
+
+```
+Project: {project}
+Next session: {next_session_label}
+Brief: {next_session_brief_path}
+```
+
+No interactive confirmation prompt — per the brief NFR ("ingen prompts, men la
+informasjon synes"). The 3-line block is informational only.
+
+## Phase 4 — Begin execution
+
+Read the file at `next_session_brief_path` (it is the brief that the next
+session is supposed to execute — typically the same `brief.md` for
+single-brief multi-session plans, or a session-specific spec for parallel
+session decomposition). Understand the task and begin executing per the
+standard trekplan pipeline. The user did not type a separate "start"
+command — `/trekcontinue` is the start.
+
+If the brief file does not exist (validator emits a warning but does not
+fail), print: `Warning: next_session_brief_path "{path}" does not exist on
+disk. Cannot continue automatically.` and exit. Do not guess.
+
+## Phase 5 — Stats tracking
+
+Append a one-line JSON record to `${CLAUDE_PLUGIN_DATA}/trekcontinue-stats.jsonl`
+if the env var is set; silently skip otherwise.
+
+```json
+{"ts":"<iso-8601>","project":"<project>","next_session_label":"<label>","status":"<status>"}
+```
+
+## Hard rules
+
+- **Idempotent.** Running `/trekcontinue` twice in the same Claude session
+  does not advance state — the writer (Phase 8 / hook / helper) advances state
+  only when a session completes.
+- **Zero secrets in the state file.** Status, paths, labels — never API keys,
+  never user content beyond filenames.
+- **NEVER auto-load via SessionStart.** The command is operator-invoked only.
+  Auto-loading would re-introduce the stale-file risk noted in
+  `feedback_next_session_prompt_manual.md`.
+- **No interactive prompts.** Phases 0–4 must run without `AskUserQuestion`.
+  This keeps the command headless-safe.
--- a/plugins/voyage/commands/trekendsession.md
+++ b/plugins/voyage/commands/trekendsession.md
@ -0,0 +1,172 @@
+---
+name: trekendsession
+description: Mark the current session as complete and write session-state pointing at the next session. Helper for informal multi-session flows.
+argument-hint: "<next-brief-path> <next-label> | --help"
+model: sonnet
+---
+
+# Voyage End-Session Local v1.0
+
+Tiny helper for **informal** multi-session flows (no formal plan with
+Execution Strategy). Writes a `.session-state.local.json` pointing at the
+next session so `/trekcontinue` can resume in a fresh Claude chat.
+
+For formal flows (a plan produced by `/trekplan --project`),
+`/trekexecute` Phase 8 already writes the state file — this helper
+is unnecessary there. Use this command for ad-hoc release runs, manual
+multi-session handovers, or any flow that does not run through
+`/trekexecute`.
+
+Pipeline position:
+
+```
+... session N work ...
+/trekendsession <brief> "<next-label>"   →  writes state
+... session boundary, fresh chat ...
+/trekcontinue                                         →  reads state, starts session N+1
+```
+
+See **Handover 7** in `docs/HANDOVER-CONTRACTS.md` for the schema.
+
+## Phase 0 — `--help` handling
+
+If `$ARGUMENTS` contains `--help` or `-h`, print the usage block below and exit
+cleanly. Do NOT proceed to any further phase.
+
+```
+/trekendsession — Mark current session done; point at next session.
+
+Usage:
+  /trekendsession <next-brief-path> <next-label>
+  /trekendsession --help
+
+Both arguments are REQUIRED. No interactive prompt — headless-safe.
+
+Writes <project-dir>/.session-state.local.json with:
+  schema_version           1
+  project                  <auto-resolved from cwd>
+  next_session_brief_path  <next-brief-path argument>
+  next_session_label       <next-label argument>
+  status                   in_progress
+  updated_at               <now, ISO-8601>
+
+Then validates via lib/validators/session-state-validator.mjs and prints
+the same 3-line narration that /trekcontinue will show in the next session.
+
+Example:
+  /trekendsession .claude/projects/2026-05-01-feature/brief.md "Session 2 of 3"
+```
+
+## Phase 1 — Resolve project directory
+
+Resolve the nearest `.claude/projects/*/brief.md` from cwd (the current working
+directory). Use `node -e` enumeration (NOT shell glob — harness-mode safety):
+
+```bash
+!`node --input-type=module -e "import {existsSync, readdirSync} from 'node:fs'; import {join} from 'node:path'; const root='.claude/projects'; if(!existsSync(root)) process.exit(0); readdirSync(root).filter(d=>existsSync(join(root,d,'brief.md'))).forEach(d=>process.stdout.write(join(root,d)+'\\n'));"`
+```
+
+Decision tree:
+
+- **0 candidates:** print error to stderr — "no `.claude/projects/<dir>/brief.md`
+  found under cwd; cannot determine project directory" — and exit 1. Do NOT
+  fall back to a synthesized path.
+- **1 candidate:** use it as `<project-dir>`. Continue.
+- **>1 candidates:** print all paths and ask the operator to `cd` into the
+  intended project directory before retrying. Exit 1.
+
+## Phase 2 — Required args check (headless-safe)
+
+Read `$ARGUMENTS`. Both `<next-brief-path>` and `<next-label>` are required.
+If either is missing or empty:
+
+```
+Error: missing required args.
+Usage: /trekendsession <next-brief-path> '<next-label>'
+```
+
+Print to stderr and exit 1. **No interactive prompt** — this keeps the helper
+headless-safe (per brief NFR; addresses adversarial-review major #11). If you
+want an interactive flow, use `/trekcontinue --help` to see the full pipeline.
+
+## Phase 3 — Atomically write `.session-state.local.json` + sibling NEXT-SESSION-PROMPT.local.md
+
+Write `<project-dir>/.session-state.local.json` with the schema-v1 object:
+
+```json
+{
+  "schema_version": 1,
+  "project": "<project-dir>",
+  "next_session_brief_path": "<arg 1>",
+  "next_session_label": "<arg 2>",
+  "status": "in_progress",
+  "updated_at": "<now, ISO-8601>"
+}
+```
+
+Use the atomic-write util — write to `<path>.tmp`, then `rename` into place —
+to avoid partial-state on crash. The util is ESM, so invoke via
+`node --input-type=module -e` with an `import` statement (a CommonJS shim
+would throw `ERR_REQUIRE_ESM` on Node 18+ since `atomic-write.mjs` is ESM).
+
+Under `node --input-type=module -e "<script>" arg1 arg2 arg3`, Node sets
+`process.argv[0]` to the node binary path and user args start at
+`process.argv[1]`. Adjust the destructure if your Node version differs.
+
+This phase ALSO writes a sibling `NEXT-SESSION-PROMPT.local.md` in the
+project directory with YAML frontmatter (`produced_by: trekendsession`,
+`produced_at: <ISO-8601>`, `project: <project-dir>`). Both files are written
+in a single ESM block so the writes succeed or fail together:
+
+```bash
+!`node --input-type=module -e "
+import path from 'node:path';
+import { writeFileSync } from 'node:fs';
+import { atomicWriteJson } from './lib/util/atomic-write.mjs';
+const [, dir, brief, label] = process.argv;
+const now = new Date().toISOString();
+const stateObj = { schema_version: 1, project: dir, next_session_brief_path: brief, next_session_label: label, status: 'in_progress', updated_at: now };
+const stateFile = path.join(dir, '.session-state.local.json');
+atomicWriteJson(stateFile, stateObj);
+const promptFile = path.join(dir, 'NEXT-SESSION-PROMPT.local.md');
+const promptBody = '---\\nproduced_by: trekendsession\\nproduced_at: ' + now + '\\nproject: ' + dir + '\\n---\\n\\n# ' + label + '\\n\\nResume via /trekcontinue.\\n';
+writeFileSync(promptFile, promptBody);
+console.log(stateFile);
+console.log(promptFile);
+" '<project-dir>' '<next-brief-path>' '<next-label>'`
+```
+
+## Phase 4 — Validate + narrate
+
+Validate the freshly-written state file:
+
+```bash
+!`node lib/validators/session-state-validator.mjs --json <project-dir>/.session-state.local.json`
+```
+
+If `valid: true`, print the success block matching `/trekcontinue` Phase 3
+narration (SC-8 cross-project consistency — same template both sides):
+
+```
+Session state written: <project-dir>/.session-state.local.json
+
+Project: <project-dir>
+Next session: <next-label>
+Brief: <next-brief-path>
+
+In a fresh Claude session, run /trekcontinue to resume.
+```
+
+If `valid: false`, print the structured `errors[]` and exit 1. Investigate
+before retrying — usually means a bad path or label argument.
+
+## Hard rules
+
+- **Required args, no defaults.** Never invent a brief path or session label.
+  If args are missing, fail loud.
+- **Atomic write only.** Tmp + rename — no partial state files on disk.
+- **Zero secrets.** Status, paths, labels — never API keys, never user content
+  beyond filenames.
+- **NEVER auto-invoke this command.** It is operator-typed only at session-end.
+- **Idempotent within a session.** Running twice with the same args
+  overwrites cleanly (atomic rename); does not double-advance.
--- a/plugins/voyage/commands/trekexecute.md
+++ b/plugins/voyage/commands/trekexecute.md
--- a/plugins/voyage/commands/trekplan.md
+++ b/plugins/voyage/commands/trekplan.md
@ -0,0 +1,855 @@
+---
+name: trekplan
+description: Deep implementation planning from a task brief. Requires --brief or --project. Runs parallel specialized agents, optional external research, and adversarial review.
+argument-hint: "--brief <path> | --project <dir> [--fg | --quick | --research <brief> | --decompose <plan> | --export <fmt> <plan>]"
+model: opus
+allowed-tools: Agent, Read, Glob, Grep, Write, Edit, Bash, AskUserQuestion, TaskCreate, TaskUpdate, TeamCreate, TeamDelete
+---
+
+# Voyage Local v2.0
+
+Deep, multi-phase implementation planning driven by a **task brief**.
+Planning consumes the brief (produced by `/trekbrief`) and any
+research briefs referenced in it, then runs specialized exploration
+agents, synthesis, and adversarial review to produce an executable plan.
+
+**v2.0 is a breaking release.** The interview phase has been extracted
+into `/trekbrief`. This command no longer accepts free-text task
+descriptions — it requires either `--brief <path>` or `--project <dir>`.
+
+Pipeline position:
+
+```
+/trekbrief     →  brief.md
+/trekresearch  →  research/*.md
+/trekplan      →  plan.md            (this command)
+/trekexecute   →  execution
+```
+
+## Phase 1 — Parse mode and validate input
+
+Parse `$ARGUMENTS` for mode flags. Order of precedence:
+
+1. **`--export <format> <plan-path>`** — extract `{format}` (first token after
+   `--export`) and `{plan-path}` (remainder). Valid formats: `pr`, `issue`,
+   `markdown`, `headless`. Set **mode = export**.
+
+   If format is not in the valid set:
+   ```
+   Error: unknown export format '{format}'. Valid: pr, issue, markdown, headless
+   ```
+   If the plan file does not exist:
+   ```
+   Error: plan file not found: {path}
+   ```
+
+2. **`--decompose <plan-path>`** — extract the plan path. Set **mode = decompose**.
+   If the plan file does not exist:
+   ```
+   Error: plan file not found: {path}
+   ```
+
+3. **`--project <dir>`** — extract the project directory path.
+   - Resolve `{dir}` (trim trailing slash).
+   - Derive implicit flags:
+     - `--brief {dir}/brief.md`
+     - Plan destination: `{dir}/plan.md`
+     - Research briefs auto-discovered from `{dir}/research/*.md` (sorted).
+   - If `{dir}` does not exist or `{dir}/brief.md` is missing:
+     ```
+     Error: project directory not initialized. Run /trekbrief to create it.
+     Missing: {dir}/brief.md
+     ```
+   - Set **project_dir = {dir}**, **brief_path = {dir}/brief.md**.
+   - **Validate inputs** (soft mode — warnings do not block, errors do):
+     ```bash
+     # Brief schema sanity check (frontmatter + state machine, soft on body sections)
+     node ${CLAUDE_PLUGIN_ROOT}/lib/validators/brief-validator.mjs --soft --json "{dir}/brief.md"
+
+     # Research briefs (if any) — drift-warn only, none of these block the run
+     [ -d "{dir}/research" ] && \
+       node ${CLAUDE_PLUGIN_ROOT}/lib/validators/research-validator.mjs --soft --dir "{dir}/research" --json
+
+     # Architecture note discovery (EXTERNAL CONTRACT — drift-WARN, never drift-FAIL)
+     node ${CLAUDE_PLUGIN_ROOT}/lib/validators/architecture-discovery.mjs --json "{dir}"
+     ```
+     Each call exits 0 on success or with a structured JSON error report on stderr.
+     Surface any warnings in the user-facing summary at Phase 3, but do not abort.
+   - Set **has_research_brief = true** if `{dir}/research/*.md` matches ≥ 1 file.
+   - Read the architecture-discovery JSON output: set **has_architecture_note = true**
+     if `found == true`. The discovery module emits warnings if the file lives at a
+     non-canonical path (e.g. `architecture-overview.md`); preserve them for the
+     user-facing summary. If set, **architecture_note_path = {result.overview}**.
+     Produced by an external opt-in architect plugin (no longer publicly distributed;
+     the filesystem slot remains available for any compatible producer). Missing file
+     is fine — additive discovery, not required.
+
+4. **`--brief <path>`** — extract the brief path. If the file does not exist:
+   ```
+   Error: brief file not found: {path}
+   ```
+   Set **brief_path = {path}**. Plan destination will be derived in Phase 3
+   from the brief's slug and date (see Phase 3).
+
+5. **`--research <brief.md> [brief2.md] [brief3.md]`** — collect paths after
+   `--research` until the next `--` flag or a token that does not look like a
+   file path. Maximum 3 briefs. Set **has_research_brief = true**. Validate
+   each path exists — if any is missing:
+   ```
+   Error: research brief not found: {path}
+   ```
+   `--research` combines with `--brief`, `--project`, `--fg`, and `--quick`.
+   When combined with `--project`, the explicit `--research` briefs are
+   appended to the auto-discovered ones (deduplicated by path).
+
+6. **`--fg`** — accepted as a no-op alias for backwards compatibility. All
+   phases always run in the main session as of v2.4.0.
+
+7. **`--quick`** — set **mode = quick**. Skip agent swarm; use lightweight
+   Glob/Grep scan and go directly to planning + adversarial review.
+
+8. **`--gates`** — autonomy control. When present, set `gates_mode = true`.
+   Pause for operator confirmation after Phase 5 (exploration), Phase 7
+   (synthesis), and Phase 9 (adversarial review). Default `gates_mode =
+   false` lets phases flow continuously. The flag is consumed by the
+   autonomy-gate state machine via the CLI shim:
+   `node ${CLAUDE_PLUGIN_ROOT}/lib/util/autonomy-gate.mjs --state X --event Y --gates {true|false}`.
+
+9. If neither `--brief` nor `--project` is present after flag parsing,
+   output usage and stop:
+
+```
+Usage: /trekplan --brief <path-to-brief.md>
+       /trekplan --project <project-dir>
+       /trekplan --brief <path> --research <research-brief.md>
+       /trekplan --project <dir> --fg
+       /trekplan --project <dir> --quick
+       /trekplan --export <pr|issue|markdown|headless> <plan-path>
+       /trekplan --decompose <plan-path>
+
+A brief is required. Produce one with /trekbrief first.
+
+Modes:
+  --brief       Plan from a brief file (foreground, v2.4.0+)
+  --project     Plan from a project directory (brief.md + research/ auto-resolved)
+  --research    Add up to 3 extra research briefs as planning context
+  --fg          No-op alias (foreground is the only mode as of v2.4.0)
+  --quick       Skip exploration agent swarm; plan directly
+  --export      Generate shareable output from an existing plan (no new planning)
+  --decompose   Split an existing plan into self-contained headless sessions
+
+Examples:
+  /trekplan --project .claude/projects/2026-04-18-jwt-auth
+  /trekplan --brief .claude/projects/2026-04-18-jwt-auth/brief.md
+  /trekplan --project .claude/projects/2026-04-18-jwt-auth --research extra.md
+  /trekplan --project .claude/projects/2026-04-18-jwt-auth --fg
+  /trekplan --export pr .claude/plans/trekplan-2026-04-06-rate-limiting.md
+  /trekplan --decompose .claude/plans/trekplan-2026-04-06-rate-limiting.md
+
+Migrating from v1.x? See MIGRATION.md in this plugin. The old --spec flag
+and free-text interview mode were removed in v2.0.
+```
+
+Do not continue past this step if no brief was provided.
+
+### Read the brief
+
+Read the brief file and parse its frontmatter. Extract:
+- `task` — one-line task description
+- `slug` — slug for plan filenames
+- `project_dir` — if present, overrides derived project path (optional)
+- `research_topics` — N (used as a sanity check)
+- `research_status` — `pending | in_progress | complete | skipped`
+
+If `research_status == pending` and `research_topics > 0`:
+- Warn the user: "Brief declares {N} research topics but research is still
+  pending. Plan confidence will be lower. Continue anyway?"
+- `AskUserQuestion`: **Continue with low confidence** / **Cancel — run research first**.
+- If cancel: print the research invocations from the brief's "How to continue"
+  section and stop.
+
+Report the detected mode:
+```
+Mode:    {foreground | quick | export | decompose}
+Brief:   {brief_path}
+Project: {project_dir or "-"}
+Research: {N local briefs, M extra via --research}
+```
+
+### When the input is type:trekreview (Handover 6)
+
+The brief input may be a `review.md` produced by `/trekreview`
+instead of a `brief.md` produced by `/trekbrief`. Both files
+share the same handover slot — `type` is the discriminator.
+
+If `fm.type === 'trekreview'`:
+
+  1. Skip the `research_status` gate above (review.md has no
+     `research_topics` and no Research Plan section).
+  2. Extract the `findings` array from the frontmatter — this is the
+     list of 40-char hex finding-IDs the review surfaced.
+  3. Read the body's last fenced ```json``` block to recover the full
+     finding objects (the frontmatter only has IDs; the JSON has the
+     `severity`, `file`, `line`, `rule_key`, `title`, `detail`,
+     `recommended_action` payload).
+  4. Filter findings to severity ∈ `{BLOCKER, MAJOR}`. MINOR and
+     SUGGESTION are skipped for v1.0 plan-input — they are advisory
+     only and would inflate the plan with low-priority churn.
+  5. Treat each remaining finding as a plan goal:
+     - `recommended_action` → step intent
+     - `file` → primary `Files:` target
+     - `id` → goes into the plan's `source_findings:` frontmatter list
+  6. When writing `plan.md`, populate the frontmatter field
+     `source_findings: [<id1>, <id2>, ...]` containing exactly the IDs
+     of the BLOCKER + MAJOR findings consumed. The list provides the
+     audit trail back to `review.md`.
+  7. Use **block-style YAML** for the `source_findings:` list. The
+     frontmatter parser at `lib/util/frontmatter.mjs` does not support
+     flow-style arrays; `source_findings: [a, b]` is broken — use:
+     ```yaml
+     source_findings:
+       - 0123456789abcdef0123456789abcdef01234567
+       - fedcba9876543210fedcba9876543210fedcba98
+     ```
+
+`source_findings:` is **additive and optional** — plans produced from a
+`type: brief` input simply omit the field. No `plan_version` bump is
+required for this addition (backwards compatible).
+
+## Phase 1.5 — Export (runs only when mode = export)
+
+**Skip this phase entirely unless mode = export.**
+
+Read the plan file. Extract these sections from the plan content:
+- Task description (from Context section)
+- Implementation steps (from Implementation Plan section)
+- Risks (from Risks and Mitigations section)
+- Test strategy (from Test Strategy section, if present)
+- Scope estimate (from Estimated Scope section)
+
+### Format: `pr`
+
+Output a markdown block formatted as a PR description:
+
+```
+## Summary
+
+{2–3 sentence summary of what this change does and why}
+
+## Changes
+
+{Bulleted list of implementation steps, one line each}
+
+## Test plan
+
+{Bulleted checklist from test strategy, formatted as - [ ] items}
+
+## Risks
+
+{Risks from plan, abbreviated to 1 line each}
+
+---
+*Generated by trekplan from {plan filename}*
+```
+
+### Format: `issue`
+
+Output a markdown block formatted as an issue comment:
+
+```
+## Implementation plan summary
+
+**Task:** {task description}
+**Plan file:** {plan path}
+**Scope:** {N files, complexity}
+
+### Proposed approach
+{3–5 bullet points from key implementation steps}
+
+### Open questions / risks
+{Top 2–3 risks from plan}
+
+---
+*Generated by trekplan*
+```
+
+### Format: `markdown`
+
+Output the plan content with internal metadata stripped:
+- Remove the "Revisions" section
+- Remove plan-critic and scope-guardian scores/verdicts
+- Remove `[ASSUMPTION]` markers (but keep the surrounding sentence)
+- Keep everything else verbatim
+
+### Format: `headless`
+
+This is a shortcut for `--decompose`. It runs the full session decomposition
+pipeline and is equivalent to `--decompose {plan-path}`. Proceed to
+Phase 1.6 (Decompose) below.
+
+---
+
+After outputting the formatted block (for pr/issue/markdown), say:
+```
+Export complete ({format}). Copy the block above.
+```
+
+Then **stop**. Do not continue to any subsequent phase.
+
+## Phase 1.6 — Decompose (runs only when mode = decompose or export headless)
+
+**Skip this phase entirely unless mode = decompose or export format = headless.**
+
+Read the plan file. Verify it contains an Implementation Plan section with
+numbered steps. If no steps are found, report and stop:
+```
+Error: plan has no implementation steps. Run /trekplan first to generate a plan.
+```
+
+Determine the output directory from the plan slug:
+- Extract the slug from the plan filename (e.g., `trekplan-2026-04-06-auth-refactor` → `auth-refactor`)
+- Output directory: `.claude/trekplan-sessions/{slug}/`
+
+Launch the **session-decomposer** agent:
+
+```
+Plan file: {plan path}
+Plugin root: ${CLAUDE_PLUGIN_ROOT}
+Output directory: .claude/trekplan-sessions/{slug}/
+```
+
+The session-decomposer will:
+1. Parse the plan's steps and their file dependencies
+2. Build a dependency graph between steps
+3. Group steps into sessions of 3–5 steps each
+4. Identify which sessions can run in parallel (waves)
+5. Generate one session spec file per session
+6. Generate a dependency diagram (mermaid)
+7. Generate a launch script (`launch.sh`)
+
+When the session-decomposer completes, present the summary to the user:
+
+```
+## Decomposition Complete
+
+**Master plan:** {plan path}
+**Sessions:** {N} across {W} waves
+**Output:** .claude/trekplan-sessions/{slug}/
+
+### Sessions
+
+| # | Title | Steps | Wave | Parallel |
+|---|-------|-------|------|----------|
+{session table from decomposer}
+
+### Files generated
+
+- Session specs: .claude/trekplan-sessions/{slug}/session-*.md
+- Dependency graph: .claude/trekplan-sessions/{slug}/dependency-graph.md
+- Launch script: .claude/trekplan-sessions/{slug}/launch.sh
+
+You can:
+- Review individual session specs before running
+- Run all sessions: `bash .claude/trekplan-sessions/{slug}/launch.sh`
+- Run a single session: `claude -p "$(cat .claude/trekplan-sessions/{slug}/session-1-*.md)"`
+- Say **"launch"** to start headless execution from here
+```
+
+If the user says **"launch"**: run the launch script via Bash.
+
+Then **stop**. Do not continue to any subsequent phase.
+
+## Phase 2 — (removed in v2.0)
+
+The interview phase has moved to `/trekbrief`. This command no
+longer asks the user any requirements questions — the brief is the
+authoritative input.
+
+## Phase 3 — Destination and context recap (foreground)
+
+Determine the plan destination path:
+- If `project_dir` is set (from `--project` or the brief's `project_dir`
+  frontmatter field): **plan destination = {project_dir}/plan.md**.
+- Otherwise: derive slug and date — if the brief has frontmatter `slug` and
+  `created`, use them; otherwise extract from the brief filename. Destination:
+  `.claude/plans/trekplan-{YYYY-MM-DD}-{slug}.md`.
+
+Collect all research briefs (from `--research` flag and auto-discovered
+`{project_dir}/research/*.md`).
+
+Report to the user:
+
+```
+Planning pipeline running in foreground.
+
+  Brief:   {brief_path}
+  Project: {project_dir or "-"}
+  Plan:    {plan destination}
+  Research briefs: {N}
+  Architecture note: {present | none}
+```
+
+Then continue to the next phase inline.
+
+> **Why foreground?** As of v2.4.0 the planning-orchestrator is no longer
+> spawned as a background agent. The Claude Code harness does not expose the
+> Agent tool to sub-agents, so an orchestrator launched with
+> `run_in_background: true` cannot spawn the documented exploration swarm
+> (`architecture-mapper`, `task-finder`, `plan-critic`, etc.) and silently
+> degrades to single-context reasoning. Running the phases inline in main
+> context keeps the swarm intact. Use `claude -p` in a separate terminal
+> window for long-running headless work.
+
+---
+
+**All remaining phases run inline in the main command context.**
+
+---
+
+## Phase 4 — Codebase sizing
+
+Determine codebase scale to calibrate agent turns (not agent count).
+
+Run via Bash:
+```
+find . -type f \( -name "*.ts" -o -name "*.tsx" -o -name "*.js" -o -name "*.jsx" -o -name "*.py" -o -name "*.go" -o -name "*.rs" -o -name "*.java" -o -name "*.rb" -o -name "*.c" -o -name "*.cpp" -o -name "*.h" -o -name "*.cs" -o -name "*.swift" -o -name "*.kt" -o -name "*.sh" -o -name "*.md" \) -not -path "*/node_modules/*" -not -path "*/.git/*" -not -path "*/vendor/*" -not -path "*/dist/*" -not -path "*/build/*" | wc -l
+```
+
+Classify:
+- **Small** (< 50 files)
+- **Medium** (50–500 files)
+- **Large** (> 500 files)
+
+Report:
+```
+Codebase: {N} source files ({scale}). Deploying exploration agents.
+```
+
+## Phase 4b — Brief review
+
+Launch the **brief-reviewer** agent:
+Prompt: "Review this task brief for quality: {brief_path}. Check completeness,
+consistency, testability, scope clarity, and research-plan validity."
+
+Handle the verdict:
+- **PROCEED** — continue to Phase 5.
+- **PROCEED_WITH_RISKS** — continue, carry flagged risks as `[ASSUMPTION]` in the plan.
+- **REVISE** — present findings and ask the user for clarification
+  (foreground is the only mode). If the user force-stops, carry outstanding
+  findings as `[ASSUMPTION]` entries.
+
+## Phase 5 — Parallel exploration (specialized agents + research)
+
+**If mode = quick:** Do NOT launch any exploration agents. Instead, run a
+lightweight file check:
+- `Glob` for files matching key terms from the brief's task/intent (up to 3 patterns)
+- `Grep` for function/type definitions matching key terms (up to 3 patterns)
+
+Report findings as:
+```
+Quick scan: {N} potentially relevant files found via Glob/Grep.
+No agent swarm — proceeding directly to planning.
+```
+
+Then skip Phase 6 (deep-dives) and proceed to Phase 7 (Synthesis) with only
+the quick-scan results.
+
+---
+
+**All other modes:** Launch exploration agents **in parallel** (all in a single
+message). Use the specialized agents from the `agents/` directory.
+
+**All agents run for all codebase sizes.** Scale `maxTurns` by size (small: halved,
+medium: default, large: default) instead of dropping agents.
+
+| Agent | Small | Medium | Large | Purpose |
+|-------|-------|--------|-------|---------|
+| `architecture-mapper` | Yes | Yes | Yes | Codebase structure, patterns, anti-patterns |
+| `dependency-tracer` | Yes | Yes | Yes | Module connections, data flow, side effects |
+| `risk-assessor` | Yes | Yes | Yes | Risks, edge cases, failure modes |
+| `task-finder` | Yes | Yes | Yes | Task-relevant files, functions, types, reuse candidates |
+| `test-strategist` | Yes | Yes | Yes | Test patterns, coverage gaps, strategy |
+| `git-historian` | Yes | Yes | Yes | Recent changes, ownership, hot files, active branches |
+| `research-scout` | Conditional | Conditional | Conditional | External docs (only when unfamiliar tech detected AND no research brief covers it) |
+| `convention-scanner` | No | Yes | Yes | Coding conventions, naming, style, test patterns |
+
+### Always launch (all codebase sizes):
+
+**architecture-mapper** — full codebase structure, tech stack, patterns, anti-patterns.
+Prompt: "Analyze the architecture of this codebase. The task being planned is: {task}"
+
+**dependency-tracer** — module connections, data flow, side effects for task-relevant code.
+Prompt: "Trace dependencies and data flow relevant to this task: {task}. Focus on modules
+that will be affected by the implementation."
+
+**risk-assessor** — risks, edge cases, failure modes, technical debt near task area.
+Prompt: "Assess risks and failure modes for implementing this task: {task}. Check for
+complexity hotspots, security boundaries, and technical debt in the relevant code."
+
+**task-finder** — all files, functions, types, and interfaces directly related to the task.
+Prompt: "Find all code relevant to this task: {task}. Include existing implementations
+that solve similar problems, API boundaries, database models, configuration files.
+Report file paths and line numbers for every finding."
+
+**test-strategist** — existing test patterns, coverage gaps, test strategy.
+Prompt: "Analyze the test infrastructure and design a test strategy for this task: {task}.
+Discover existing patterns and identify coverage gaps."
+
+**git-historian** — recent changes, code ownership, hot files, active branches.
+Prompt: "Analyze git history relevant to this task: {task}. Report recent changes,
+ownership, hot files, and active branches that may affect planning."
+
+### Launch for medium+ codebases (50+ files):
+
+**Convention Scanner** — use the `convention-scanner` plugin agent (model: "sonnet")
+for medium+ codebases only.
+Provide concrete examples from the codebase, not generic advice."
+
+### Conditional: External research
+
+After reading the brief, determine if the task involves technologies, APIs, or
+libraries that are:
+- Not clearly present in the codebase
+- Being upgraded to a new major version
+- Being used in an unfamiliar way
+
+**Skip research-scout** for any topic already answered by an attached research
+brief. If the brief's `research_status == complete` and all `Research Plan`
+topics have corresponding research files, skip research-scout entirely.
+
+If yes (and not covered by attached briefs): launch **research-scout** in
+parallel with the other agents.
+Prompt: "Research the following technologies for this task: {task}.
+Specific questions: {list specific questions about the technology}.
+Technologies to research: {list}."
+
+If no external technology is involved or all topics are covered by briefs:
+skip research-scout and note:
+"No external research needed — covered by research briefs / well-represented in codebase."
+
+## Phase 6 — Targeted deep-dives
+
+After all Phase 5 agents complete, review their results and identify **knowledge gaps**
+— areas where exploration was too shallow to plan confidently.
+
+Common reasons for deep-dives:
+- A critical function was found but its implementation details are unclear
+- A dependency chain needs tracing to understand side effects
+- A test pattern was identified but the test infrastructure needs more detail
+- A risk was flagged but the actual impact needs verification
+
+For each significant gap, spawn a targeted deep-dive agent (model: "sonnet",
+subagent_type: "Explore") with a narrow, specific brief.
+
+Launch up to 3 deep-dive agents in parallel. If no gaps exist, skip this phase
+and note: "Initial exploration was sufficient — no deep-dives needed."
+
+## Phase 7 — Synthesis
+
+After all agents complete (initial + deep-dives + research), synthesize:
+
+1. Read all agent results carefully
+2. Identify overlaps and contradictions between agents
+3. Build a mental model of the codebase architecture
+4. Catalog reusable code: existing functions, utilities, patterns
+5. Integrate research findings with codebase analysis
+6. Note remaining gaps — things you cannot determine from code or research
+   (these become assumptions in the plan, marked explicitly)
+7. For each finding, track whether it came from **codebase analysis** or
+   **external research** — the plan must distinguish these sources
+
+Do NOT write this synthesis to disk. It is internal working context only.
+
+## Phase 8 — Deep planning
+
+> **Schema-drift defense (sealed inline so this contract survives even when
+> `agents/planning-orchestrator.md` is not implicitly loaded by Opus 4.7).**
+>
+> The plan you write MUST satisfy these regexes. The executor parses with
+> strict regex matching; any deviation breaks parsing and forces a re-plan.
+>
+> ```
+> STEP_HEADING_REGEX     = /^### Step (\d+):\s+(.+?)\s*$/m
+> FORBIDDEN_HEADING_REGEX = /^(?:##|###) (?:Fase|Phase|Stage|Steg) \d+/m
+> ```
+>
+> **FORBIDDEN headings** (parser rejects these — do not emit them under
+> Implementation Plan):
+> - `## Fase 1`, `### Fase 1` — Norwegian narrative format
+> - `## Phase 1`, `### Phase 1` — narrative phase format
+> - `## Stage 1`, `### Stage 1` — narrative stage format
+> - `## Steg 1`, `### Steg 1` — Norwegian step word
+> - `### 1.` or `### 1)` — numbered without "Step"
+> - `### Step 1 —` (em-dash instead of colon)
+> - Any heading that doesn't match `STEP_HEADING_REGEX`
+>
+> **REQUIRED step shape** — copy this canonical example verbatim, substituting
+> file paths, descriptions, and patterns. Preserve the exact heading format,
+> bullet field names, and Manifest YAML structure. Do not invent new field
+> names. Do not skip fields. Do not nest steps under sub-headings.
+>
+> ````markdown
+> ### Step 1: Add JWT verification middleware
+>
+> - **Files:** `src/middleware/jwt.ts`
+> - **Changes:** Create new middleware function `verifyJWT(req, res, next)` that reads `Authorization: Bearer <token>` header, verifies signature with `process.env.JWT_SECRET`, attaches decoded payload to `req.user`, and returns 401 on invalid/missing token. (new file)
+> - **Reuses:** `jsonwebtoken.verify()` (already in package.json), pattern from `src/middleware/cors.ts`
+> - **Test first:**
+>   - File: `src/middleware/jwt.test.ts` (new)
+>   - Verifies: valid token attaches user; invalid token returns 401; missing header returns 401
+>   - Pattern: `src/middleware/cors.test.ts` (follow this style)
+> - **Verify:** `npm test -- jwt.test.ts` → expected: `3 passing`
+> - **On failure:** revert — `git checkout -- src/middleware/jwt.ts src/middleware/jwt.test.ts`
+> - **Checkpoint:** `git commit -m "feat(auth): add JWT verification middleware"`
+> - **Manifest:**
+>   ```yaml
+>   manifest:
+>     expected_paths:
+>       - src/middleware/jwt.ts
+>       - src/middleware/jwt.test.ts
+>     min_file_count: 2
+>     commit_message_pattern: "^feat\\(auth\\): add JWT verification middleware$"
+>     bash_syntax_check: []
+>     forbidden_paths:
+>       - src/middleware/cors.ts
+>     must_contain:
+>       - path: src/middleware/jwt.ts
+>         pattern: "verifyJWT"
+>   ```
+> ````
+>
+> **Validator self-check (mandatory after writing `plan.md`):** run
+> `node ${CLAUDE_PLUGIN_ROOT}/lib/validators/plan-validator.mjs --strict --json {plan_path}`
+> and re-revise the plan if it fails. The validator is the source of truth for
+> heading shape, manifest presence, and required-field coverage. If
+> `${CLAUDE_PLUGIN_ROOT}` is unset (rare in practice), fall back to the
+> equivalent path under your validators cache or the repo's `lib/validators/`.
+
+Read the brief file (from `--brief` or `--project`).
+Read the plan template: @${CLAUDE_PLUGIN_ROOT}/templates/plan-template.md
+
+Write the plan following the template structure. The plan MUST include:
+
+### Required sections
+
+1. **Context** — Why this change is needed. Use the brief's **Intent** verbatim
+   or tightly paraphrased. The plan's motivation must trace directly to the brief.
+2. **Codebase Analysis** — Tech stack, patterns, relevant files, reusable code,
+   external tech researched. Every file path must be real (verified during exploration).
+3. **Research Sources** — If any research briefs or research-scout was used: table
+   of technologies, sources, findings, and confidence levels. Omit if none.
+4. **Implementation Plan** — Ordered steps. Each step specifies:
+   - Exact files to modify or create (with paths)
+   - What changes to make and why
+   - Which existing code to reuse
+   - Dependencies on other steps
+   - Whether the step is based on codebase analysis or external research
+   - **On failure:** — recovery action (revert/retry/skip/escalate)
+   - **Checkpoint:** — git commit command after success
+10. **Execution Strategy** — For plans with > 5 steps: group steps into sessions
+    (3–5 steps each), organize sessions into waves (parallel where independent),
+    specify scope fences per session. Omit for plans with ≤ 5 steps.
+5. **Alternatives Considered** — At least one alternative approach with
+   pros/cons and reason for rejection.
+6. **Risks and Mitigations** — From the risk-assessor findings and the brief's
+   open questions. What could go wrong and how to handle it.
+7. **Test Strategy** — From the test-strategist findings (if available).
+   What tests to write and which patterns to follow.
+8. **Verification** — Reuse the brief's **Success Criteria** as the baseline.
+   Each criterion must be an executable command or observable condition.
+9. **Estimated Scope** — File counts and complexity rating.
+
+### Quality standards
+
+- Every file path in the plan must exist in the codebase (or be explicitly
+  marked as "new file to create")
+- Every "reuses" reference must point to a real function/pattern found during
+  exploration
+- Steps must be ordered by dependency (not by file path or importance)
+- Verification criteria must be concrete and executable
+- The plan must be implementable by someone who has not seen the exploration
+  results — it must stand on its own
+- Research-based decisions must cite their source
+- Every implementation decision must be traceable to a brief section (Intent,
+  Goal, Constraint, Preference, NFR, or Success Criterion)
+
+### Write the plan
+
+Use the plan destination computed in Phase 3:
+- `--project` mode: `{project_dir}/plan.md`
+- `--brief` mode: `.claude/plans/trekplan-{YYYY-MM-DD}-{slug}.md`
+
+Create the parent directory if it does not exist.
+
+## Phase 9 — Adversarial review
+
+Launch two review agents **in parallel — emit both Agent tool calls in a
+single assistant message turn** (same pattern as Phase 5 exploration). They
+have zero data dependencies; serializing them wastes 30–60 seconds per run.
+
+**plan-critic** — adversarial review of the plan.
+Prompt: "Review this implementation plan for the task: {task}.
+Plan file: {plan path}. Read it and find every problem — missing steps,
+wrong ordering, fragile assumptions, missing error handling, scope creep,
+underspecified steps. Rate each finding as blocker, major, or minor.
+Write the structured JSON output to `/tmp/plan-critic-out.json` so the
+dedup helper can merge with scope-guardian's findings."
+
+**scope-guardian** — scope alignment check.
+Prompt: "Check this implementation plan against the brief.
+Task: {task}. Brief file: {brief_path}. Plan file: {plan path}.
+Find scope creep (plan does more than the brief requires) and scope gaps
+(plan misses brief requirements). Check that referenced files and functions
+exist. Verify that every Success Criterion in the brief is covered by the
+plan's Verification section. Write structured JSON output to
+`/tmp/scope-guardian-out.json`."
+
+After both complete, run an inline dedup pass:
+
+```bash
+node ${CLAUDE_PLUGIN_ROOT}/lib/review/plan-review-dedup.mjs \
+  --plan-critic /tmp/plan-critic-out.json \
+  --scope-guardian /tmp/scope-guardian-out.json \
+  > /tmp/plan-review-merged.json
+```
+
+The merged array attributes each finding to `[plan-critic, scope-guardian]`
+when both reviewers raised the same issue (exact match on
+`file:line:rule_key`, or Jaccard ≥ 0.7 on text tokens). Revise the plan
+once for the merged set, not twice for the duplicates. Source: research/05
+R1 + R2.
+
+After both complete:
+- If **blockers** are found: revise the plan to address them. Add a "Revisions"
+  note at the bottom of the plan listing what changed and why.
+- If only **major** issues: revise to address them. Add revisions note.
+- If only **minor** issues or clean: proceed without changes. Note the
+  review result in the plan.
+
+## Phase 10 — Present and refine
+
+Present a summary to the user:
+
+```
+## Voyage Complete
+
+**Task:** {task description}
+**Mode:** {foreground | quick}
+**Brief:** {brief_path}
+**Project:** {project_dir or "-"}
+**Plan:** {plan_path}
+**Exploration:** {N} agents deployed ({N} specialized + {N} deep-dives + {research status})
+**Scope:** {N} files to modify, {N} to create — {complexity}
+
+### Key decisions
+- {Decision 1 and rationale}
+- {Decision 2 and rationale}
+
+### Implementation steps ({N} total)
+1. {Step 1 summary}
+2. {Step 2 summary}
+...
+
+### Research findings
+{Summary of external research + attached research briefs, or "No external research used."}
+
+### Adversarial review
+**Plan critic:** {Summary — blockers/majors/minors found, how addressed}
+**Scope guardian:** {Summary — creep/gaps found, how addressed}
+
+You can:
+- Ask questions or request changes to refine the plan
+- Say **"execute"** to start implementing
+- Say **"execute with team"** to implement with parallel Agent Team (if eligible)
+- Say **"save"** to keep the plan for later
+```
+
+If the user asks questions or requests changes:
+- Update the plan file in-place
+- Show what changed
+- Re-present the summary
+
+## Phase 11 — Handoff
+
+### "save" / "later" / "done"
+
+Confirm the plan and brief file locations and exit.
+
+### "execute" / "go" / "start"
+
+Begin implementing the plan step by step in this session. Follow the plan exactly.
+Mark each step complete as you go.
+
+### "execute with team" / "team"
+
+Before creating a team, verify eligibility:
+1. Count implementation steps that are **independent** (no dependency on each other)
+   AND touch **different files/modules**
+2. If fewer than 3 independent steps: inform the user and fall back to sequential
+   execution. "The plan has fewer than 3 independent steps — sequential execution
+   is more efficient."
+
+If eligible:
+1. Present the proposed team split: which steps go to which team member
+2. Ask for confirmation: "Create Agent Team with {N} members? (yes/no)"
+3. If confirmed: create the team with `TeamCreate`, assign step clusters to
+   each member. Use `isolation: "worktree"` on each team member agent so they
+   work in isolated git worktrees — this prevents file conflicts during parallel
+   implementation. Coordinate execution and clean up with `TeamDelete` when done.
+4. If `TeamCreate` fails (tool not available): fall back to sequential execution
+   and notify the user
+
+## Phase 12 — Session tracking
+
+After the plan is presented (Phase 10) or after handoff (Phase 11), write a
+session record to `${CLAUDE_PLUGIN_DATA}/trekplan-stats.jsonl` (create the file
+if it does not exist).
+
+Record format (one JSON line):
+```json
+{
+  "ts": "{ISO-8601 timestamp}",
+  "task": "{task description (first 100 chars)}",
+  "mode": "{default|fg|quick}",
+  "slug": "{plan slug}",
+  "brief_path": "{brief_path}",
+  "project_dir": "{project_dir or null}",
+  "codebase_size": "{small|medium|large}",
+  "codebase_files": {N},
+  "agents_deployed": {N},
+  "deep_dives": {N},
+  "research_briefs_used": {N},
+  "research_scout_used": {true|false},
+  "critic_verdict": "{BLOCK|REVISE|PASS}",
+  "guardian_verdict": "{ALIGNED|CREEP|GAP|MIXED}",
+  "outcome": "{execute|execute_team|save|refine}"
+}
+```
+
+If `${CLAUDE_PLUGIN_DATA}` is not set or not writable, skip tracking silently.
+Never let tracking failures block the main workflow.
+
+## Hard rules
+
+- **Brief-driven**: Every plan decision must trace back to a section of the
+  brief (Intent, Goal, Constraint, Preference, NFR, Success Criterion). If a
+  step has no brief basis, it is scope creep — flag it or remove it.
+- **No interview**: Never ask the user requirements questions. If the brief is
+  inadequate, stop and ask the user to run `/trekbrief` again.
+- **Scope**: Only explore the current working directory and its subdirectories.
+  Never read files outside the repo (no ~/.env, no credentials, no other repos).
+- **Cost**: Sonnet for all agents (exploration, deep-dives, research, critics).
+  Opus only runs in the main thread for synthesis and planning.
+- **Privacy**: Never log, store, or repeat file contents that look like
+  secrets, tokens, or credentials. Never log prompt text.
+- **No premature execution**: Do not modify any project files until the user
+  explicitly approves the plan.
+- **Plan stands alone**: The plan file must be understandable without access
+  to the exploration results. Include all necessary context.
+- **Honesty**: If exploration reveals the task is trivial (single file, obvious
+  change), say so. Do not inflate the plan to justify the process. Suggest
+  the user just implements it directly.
+- **Adaptive**: Never spawn more agents than the codebase warrants. A 10-file
+  project does not need 7 exploration agents. Scale down.
+- **Research transparency**: Always distinguish codebase-derived decisions from
+  research-derived decisions in the plan.
--- a/plugins/voyage/commands/trekresearch.md
+++ b/plugins/voyage/commands/trekresearch.md
@ -0,0 +1,427 @@
+---
+name: trekresearch
+description: Deep research combining local codebase analysis with external knowledge, producing structured research briefs with triangulation and confidence ratings
+argument-hint: "[--project <dir>] [--quick | --local | --external | --fg] <research question>"
+model: opus
+allowed-tools: Agent, Read, Glob, Grep, Write, Edit, Bash, AskUserQuestion, WebSearch, WebFetch, mcp__tavily__tavily_search, mcp__tavily__tavily_research
+---
+
+# Ultraresearch Local v1.0
+
+Deep, multi-phase research that combines local codebase analysis with external
+knowledge. Uses specialized agent swarms to investigate multiple dimensions in
+parallel, then triangulates findings to produce insights that neither local nor
+external research could provide alone.
+
+**Design principle: Context Engineering** — build the right context by orchestrating
+specialized agents, each seeing only what they need. The value is in triangulation
+(cross-checking local vs. external) and synthesis (insights from combining both).
+
+**Pipeline integration:** Research briefs feed into trekplan via `--research`:
+```
+/trekresearch <question> → brief → /trekplan --research <brief> <task>
+```
+
+## Phase 1 — Parse mode and validate input
+
+Parse `$ARGUMENTS` for mode flags. Flags can appear in any order before the
+research question. Collect all flags first, then treat the remainder as the
+research question.
+
+Supported flags:
+
+1. `--quick` — lightweight research, no agent swarm. The command itself does
+   3-5 targeted searches inline. Set **mode = quick**.
+
+2. `--local` — only codebase research. Skip external agents and gemini bridge.
+   Set **scope = local**.
+
+3. `--external` — only external research. Skip codebase analysis agents.
+   Set **scope = external**.
+
+4. `--fg` — accepted as a no-op alias for backwards compatibility. Execution
+   is always foreground as of v2.4.0. Set **execution = foreground** (the
+   only mode).
+
+5. `--project <dir>` — attach this research to an trekbrief project folder.
+   The brief will be written to `{dir}/research/{NN}-{slug}.md` (auto-incremented
+   index) instead of the default `.claude/research/` path. Set **project_dir = {dir}**.
+
+   If `{dir}` does not exist:
+   ```
+   Error: project directory not found: {dir}
+   Run /trekbrief first to create it.
+   ```
+   Create `{dir}/research/` if it does not already exist.
+
+6. `--gates` — autonomy control. When present, set `gates_mode = true`. The
+   research command will pause after each topic completes ("Topic N
+   complete. Proceed to topic N+1? (yes/no)"). Default `gates_mode = false`
+   means topics run continuously. The flag is consumed by the autonomy-gate
+   state machine via the CLI shim:
+   `node ${CLAUDE_PLUGIN_ROOT}/lib/util/autonomy-gate.mjs --state X --event Y --gates {true|false}`.
+
+Flags can be combined:
+- `--local` — local-only research
+- `--external --quick` — external-only, lightweight
+- `--project <dir> --external` — attach external research to a project
+- `--quick` alone implies both local and external (lightweight)
+
+Defaults: **scope = both**, **execution = foreground** (only mode as of
+v2.4.0), **project_dir = none**.
+
+After stripping flags, the remaining text is the **research question**.
+
+If no research question is provided, output usage and stop:
+
+```
+Usage: /trekresearch <research question>
+       /trekresearch --quick <research question>
+       /trekresearch --local <research question>
+       /trekresearch --external <research question>
+       /trekresearch --fg <research question>
+       /trekresearch --project <dir> [--external|--local|--quick|--fg] <research question>
+
+Modes:
+  default       Interview → foreground research (local + external) → brief
+  --quick       Interview (short) → inline research (no agent swarm)
+  --local       Only codebase analysis agents (skip external + Gemini)
+  --external    Only external research agents (skip codebase analysis)
+  --fg          No-op alias (foreground is the only mode as of v2.4.0)
+  --project     Write brief into an trekbrief project folder (auto-indexed)
+
+Flags can be combined: --local, --external --quick, --project <dir> --external
+
+Examples:
+  /trekresearch Should we migrate from Express to Fastify?
+  /trekresearch --quick What auth libraries are popular for Node.js?
+  /trekresearch --local How is error handling structured in this codebase?
+  /trekresearch --external What are the security implications of using Redis for sessions?
+  /trekresearch --fg --local What patterns does this codebase use for database access?
+  /trekresearch --project .claude/projects/2026-04-18-jwt-auth --external What JWT library is best for Node.js?
+```
+
+Do not continue past this step if no question was provided.
+
+Report the detected mode:
+```
+Mode: {default | quick}, Scope: {both | local | external}, Execution: foreground
+Project: {project_dir or "-"}
+Question: {research question}
+```
+
+### Compute brief destination
+
+If **project_dir is set**:
+- Scan `{project_dir}/research/` for existing files matching `NN-*.md`.
+- Find the highest existing index; set `N = highest + 1`. If no files exist, `N = 1`.
+- Zero-pad to 2 digits: `01`, `02`, ...
+- Brief destination: `{project_dir}/research/{NN}-{slug}.md`
+
+If **project_dir is not set**:
+- Brief destination: `.claude/research/trekresearch-{YYYY-MM-DD}-{slug}.md`
+
+Store as `brief_destination` for use in later phases.
+
+## Phase 2 — Research interview
+
+Use `AskUserQuestion` to clarify the research question. Ask **one question at a time**.
+
+The interview is shorter than trekplan's (2-4 questions, not 3-8) because research
+is more focused than planning.
+
+### Interview flow
+
+**Start with the research question itself.** If the user provided a clear, specific
+question, you may skip directly to follow-ups.
+
+**Core questions (pick 2-4 based on clarity of initial question):**
+
+1. **Decision context:** "What decision does this research feed? Are you evaluating
+   options, investigating feasibility, or building understanding?"
+   *Skip if the question itself makes this obvious.*
+
+2. **Dimensions:** "Are there specific aspects you care about most? (e.g., performance,
+   security, migration cost, team learning curve)"
+   *Skip if the question is narrow enough that dimensions are obvious.*
+
+3. **Prior knowledge:** "What do you already know about this topic? What have you
+   tried or ruled out?"
+   *Always useful — prevents redundant research.*
+
+4. **Constraints:** "Are there constraints that should guide the research?
+   (e.g., must be open-source, must support X, budget limitations)"
+   *Skip if no constraints are apparent.*
+
+**Rules:**
+- If the user says "just research it", "skip", or similar — stop interviewing.
+  Use the research question as-is.
+- For `--quick` mode: ask 1-2 questions maximum.
+- Never ask about things you can discover from the codebase.
+
+### Determine research dimensions
+
+Based on the interview, identify 3-8 research dimensions. These are the facets
+of the question that will be investigated in parallel. Examples:
+
+- "Should we use Redis?" → dimensions: performance, reliability, operational
+  complexity, security, cost, team familiarity
+- "How should we handle auth?" → dimensions: standards compliance, implementation
+  complexity, library ecosystem, security posture, scalability
+
+Report dimensions:
+```
+Research dimensions identified:
+1. {Dimension 1}
+2. {Dimension 2}
+...
+```
+
+## Phase 3 — Slug and destination (foreground)
+
+Generate a slug from the research question (first 3-4 meaningful words,
+lowercase, hyphens). Confirm the `brief_destination` computed in Phase 1.
+
+Report to the user:
+
+```
+Research pipeline running in foreground.
+
+  Question: {research question}
+  Dimensions: {N} identified
+  Scope: {both | local | external}
+  Project: {project_dir or "-"}
+  Brief:   {brief_destination}
+```
+
+Then continue to the next phase inline.
+
+> **Why foreground?** As of v2.4.0 the research-orchestrator is no longer
+> spawned as a background agent. The Claude Code harness does not expose the
+> Agent tool to sub-agents, so an orchestrator launched with
+> `run_in_background: true` cannot spawn the documented research swarm
+> (`docs-researcher`, `community-researcher`, etc.) and silently degrades to
+> single-context reasoning without WebSearch / Tavily / WebFetch / Gemini.
+> Running the phases inline in main context keeps the swarm intact. Use
+> `claude -p` in a separate terminal window for long-running headless work.
+
+---
+
+**All remaining phases run inline in the main command context.**
+
+---
+
+## Phase 3.5 — Quick mode (inline research)
+
+**Skip this phase entirely unless mode = quick.**
+
+For quick mode, do NOT launch an agent swarm. Instead, do lightweight research
+directly using available tools.
+
+### Quick local research (if scope includes local)
+
+- `Glob` for files matching key terms from the research question (up to 3 patterns)
+- `Grep` for relevant definitions, patterns, or usage (up to 5 patterns)
+- Read the 2-3 most relevant files found
+
+### Quick external research (if scope includes external)
+
+Use available search tools directly (in this priority order):
+1. `mcp__tavily__tavily_search` — if available, use for 2-3 targeted queries
+2. `WebSearch` — fallback for 2-3 targeted queries
+3. `WebFetch` — fetch 1-2 specific pages if URLs were found
+
+### Quick synthesis
+
+Synthesize findings inline. Write a lightweight research brief to the destination
+path, following the research-brief-template but with shorter sections and fewer
+dimensions.
+
+Skip to Phase 8 (stats tracking) after writing the brief.
+
+## Phase 4 — Parallel research (agent swarm)
+
+**Determine which agents to launch based on scope:**
+
+### Local agents (scope = both or local)
+
+Reuse existing plugin agents with research-focused prompts. These agents are
+designed for planning, but work equally well for research when prompted differently.
+
+| Agent | Purpose in research context |
+|-------|----------------------------|
+| `architecture-mapper` | How the architecture relates to the research question |
+| `dependency-tracer` | Dependencies and integrations relevant to the topic |
+| `task-finder` | Existing code that relates to the research question |
+| `git-historian` | Recent changes and ownership relevant to the topic |
+| `convention-scanner` | Coding patterns relevant to evaluating options |
+
+For each local agent, prompt with the research question, NOT a task description:
+
+- architecture-mapper: "Analyze the architecture relevant to this research question:
+  {question}. Focus on how {topic} relates to current patterns and constraints."
+- dependency-tracer: "Trace dependencies relevant to this research question: {question}.
+  Identify which modules would be affected by {topic}."
+- task-finder: "Find existing code relevant to this research question: {question}.
+  Look for prior implementations, patterns, or utilities related to {topic}."
+- git-historian: "Analyze git history relevant to this research question: {question}.
+  Who owns the relevant code? What has changed recently in related areas?"
+- convention-scanner: "Discover coding conventions relevant to evaluating {question}.
+  What patterns would a solution need to follow?"
+
+### External agents (scope = both or external)
+
+Launch the new research-specialized agents:
+
+| Agent | Purpose |
+|-------|---------|
+| `docs-researcher` | Official documentation, RFCs, vendor docs |
+| `community-researcher` | Real-world experience, issues, blog posts |
+| `security-researcher` | CVEs, audit history, supply chain risks |
+| `contrarian-researcher` | Counter-evidence, overlooked alternatives |
+
+For each external agent, pass: the research question, specific dimensions to
+investigate, and any context from the interview.
+
+### Bridge agent (scope = both or external, if enabled)
+
+Launch `gemini-bridge` with the research question. Do NOT include findings from
+other agents — the value of Gemini is independence.
+
+### Launch rules
+
+- Launch ALL selected agents **in parallel** in a single message
+- Use model: "sonnet" for all sub-agents (the orchestrator runs on Opus)
+- Scale maxTurns by codebase size for local agents (same as trekplan):
+  small = halved, medium/large = default
+- convention-scanner: medium+ codebases only (50+ files)
+
+## Phase 5 — Targeted follow-ups
+
+Review all agent results. Identify knowledge gaps — areas where findings are
+thin, contradictory, or missing.
+
+For each significant gap, launch a targeted follow-up agent (model: "sonnet")
+with a narrow, specific brief. Maximum 2 follow-ups.
+
+If no gaps exist, skip: "Initial research sufficient — no follow-ups needed."
+
+## Phase 6 — Triangulation
+
+This is the KEY phase that makes trekresearch more than aggregation.
+
+For each research dimension:
+
+1. **Collect** — gather relevant findings from local AND external agents
+2. **Compare** — do local findings agree with external findings?
+3. **Flag contradictions** — where they disagree, present both sides with evidence
+4. **Cross-validate** — use codebase facts to validate external claims:
+   - External says "library X is fast" → local shows the codebase already uses
+     a similar pattern that could benchmark against
+   - External says "pattern Y is best practice" → local shows the codebase uses
+     pattern Z which conflicts
+5. **Rate confidence** per dimension:
+   - **high** — multiple authoritative sources agree, local evidence confirms
+   - **medium** — good sources but limited cross-validation
+   - **low** — single source, limited evidence
+   - **contradictory** — credible sources actively disagree
+
+Compute overall confidence as a weighted average (0.0-1.0) based on dimension
+confidence levels and their relative importance.
+
+## Phase 7 — Synthesis and brief writing
+
+Read the research brief template:
+@${CLAUDE_PLUGIN_ROOT}/templates/research-brief-template.md
+
+Write the research brief following the template. Key rules:
+
+1. **Executive Summary** — 3 sentences. Answer, confidence, key caveat.
+2. **Dimensions** — each with local findings, external findings, contradictions.
+3. **Synthesis** — NOT a summary. NEW insights from triangulation.
+4. **Open Questions** — what remains unresolved and why.
+5. **Recommendation** — only if decision-relevant. Omit for exploratory research.
+6. **Sources** — every claim traced to URL or codebase path.
+
+Generate the slug from the research question (first 3-4 meaningful words).
+Write the brief to the `brief_destination` computed in Phase 1:
+- With `--project`: `{project_dir}/research/{NN}-{slug}.md`
+- Without `--project`: `.claude/research/trekresearch-{YYYY-MM-DD}-{slug}.md`
+
+Create the parent directory if it does not exist.
+
+## Phase 8 — Present and track
+
+Present a summary to the user:
+
+```
+## Ultraresearch Complete
+
+**Question:** {research question}
+**Mode:** {default | quick}, Scope: {both | local | external}
+**Brief:** {brief_destination}
+**Project:** {project_dir or "-"}
+**Confidence:** {overall confidence 0.0-1.0}
+**Dimensions:** {N} researched
+**Agents:** {N} local + {N} external + {gemini: used | unavailable | skipped}
+
+### Key Findings
+- {Finding 1}
+- {Finding 2}
+- {Finding 3}
+
+### Contradictions Found
+- {Contradiction 1, or "None — findings are consistent across sources."}
+
+### Open Questions
+- {Question 1, or "None — all dimensions adequately covered."}
+
+You can:
+- Read the full brief at {brief_destination}
+- If `--project` was used: run `/trekplan --project {project_dir}` when all research topics are complete
+- Otherwise: `/trekplan --research {brief_destination} --brief <your-brief.md>`
+- Ask follow-up questions about specific findings
+```
+
+### Stats tracking
+
+Write a session record to `${CLAUDE_PLUGIN_DATA}/trekresearch-stats.jsonl`
+(create the file if it does not exist).
+
+Record format (one JSON line):
+```json
+{
+  "ts": "{ISO-8601 timestamp}",
+  "question": "{research question (first 100 chars)}",
+  "mode": "{default|quick}",
+  "scope": "{both|local|external}",
+  "slug": "{brief slug}",
+  "project_dir": "{project_dir or null}",
+  "brief_path": "{brief_destination}",
+  "dimensions": {N},
+  "agents_local": {N},
+  "agents_external": {N},
+  "gemini_used": {true|false},
+  "confidence": {0.0-1.0},
+  "contradictions": {N},
+  "open_questions": {N}
+}
+```
+
+If `${CLAUDE_PLUGIN_DATA}` is not set or not writable, skip tracking silently.
+
+## Hard rules
+
+- **No planning:** This command produces research briefs, not implementation plans.
+  If the user asks to plan, direct them to `/trekplan --research <brief>`.
+- **Sources required:** Every claim must cite a source. No unsourced findings.
+- **Independence:** Do not pre-bias external agents with local findings or vice versa.
+  Triangulate AFTER independent research.
+- **Graceful degradation:** If MCP tools are unavailable (Tavily, Gemini, MS Learn),
+  proceed with available tools and note limitations in brief metadata.
+- **Cost:** Sonnet for all sub-agents. Opus only in the main command/orchestrator.
+- **Privacy:** Never log secrets, tokens, or credentials.
+- **Honesty:** If the question is trivially answerable, say so. Don't inflate research.
+- **Scope of codebase:** Only analyze the current working directory for local research.
+- **Research transparency:** Clearly distinguish local findings from external findings.
+  Never blend them without attribution.
--- a/plugins/voyage/commands/trekreview.md
+++ b/plugins/voyage/commands/trekreview.md
@ -0,0 +1,340 @@
+---
+name: trekreview
+description: |
+  Independent post-hoc review of delivered code against the brief. Produces
+  review.md with severity-tagged findings (BLOCKER/MAJOR/MINOR/SUGGESTION)
+  per Handover 6 (review → plan).
+argument-hint: "--project <dir> [--since <ref>] [--quick] [--validate] [--dry-run]"
+model: opus
+allowed-tools: Agent, Read, Glob, Grep, Write, Edit, Bash, AskUserQuestion
+---
+
+# Ultrareview Local v1.0
+
+Independent post-hoc review of code delivered by `/trekexecute`
+against the contract in `brief.md`. Produces `review.md` — a structured
+artifact with severity-tagged findings that `/trekplan --brief
+review.md` can consume as plan input (Handover 6).
+
+Pipeline position:
+
+```
+/trekbrief     →  brief.md
+/trekresearch  →  research/*.md
+/trekplan      →  plan.md
+/trekexecute   →  progress.json (+ commits)
+/trekreview    →  review.md            (this command)
+```
+
+The review is **independent**: each reviewer runs without cross-feeding,
+and the coordinator applies BOUNDED operations only. Synthesis-level
+inference across files is forbidden in v1.0 (Judge Agent pattern).
+
+See `agents/review-orchestrator.md` for the canonical workflow this
+command executes inline.
+
+## Phase 1 — Parse mode and validate input
+
+Parse `$ARGUMENTS` via the shared arg-parser:
+
+```bash
+node ${CLAUDE_PLUGIN_ROOT}/lib/parsers/arg-parser.mjs --command trekreview "$@"
+```
+
+The parser recognizes these flags (see `lib/parsers/arg-parser.mjs`
+FLAG_SCHEMA `trekreview` entry):
+
+| Flag | Type | Purpose |
+|------|------|---------|
+| `--project <dir>` | valued | Required. Path to trekplan project folder containing `brief.md`. |
+| `--since <ref>` | valued | Optional. Override "before" SHA for the diff. Validated via `git rev-parse --verify`. |
+| `--quick` | boolean | Skip the brief-conformance pass; run only the code-correctness reviewer; skip the coordinator's reasonableness filter. |
+| `--validate` | boolean | Schema-only check on existing `{project_dir}/review.md`. No LLM calls. |
+| `--dry-run` | boolean | Print the discovered scope and triage map. Skip writes. |
+| `--fg` | boolean | No-op alias (foreground is default). |
+
+Resolution:
+1. If `--project` is missing, print usage and stop:
+   ```
+   Error: --project <dir> is required.
+   Usage: /trekreview --project <dir> [--since <ref>] [--quick] [--validate] [--dry-run]
+   ```
+2. Trim trailing slash from `{dir}`. Set:
+   - `project_dir = {dir}`
+   - `brief_path = {dir}/brief.md`
+   - `review_path = {dir}/review.md`
+3. If `{dir}` does not exist or `{dir}/brief.md` is missing:
+   ```
+   Error: project directory not initialized. Run /trekbrief first.
+   Missing: {dir}/brief.md
+   ```
+
+Set `mode`:
+- `validate` if `--validate` is set (overrides everything else; skip to Phase 8.5).
+- `dry-run` if `--dry-run` is set.
+- `quick` if `--quick` is set.
+- `default` otherwise.
+
+## Phase 2 — Validate brief
+
+Run the brief validator in soft mode — the brief is upstream context, not
+something this command produces, so partial grades are acceptable as long
+as the file is parseable:
+
+```bash
+node ${CLAUDE_PLUGIN_ROOT}/lib/validators/brief-validator.mjs --soft --json "{brief_path}"
+```
+
+Read the JSON output. If `valid: false` AND any error has code
+`BRIEF_MISSING_REQUIRED_FIELD` or `FRONTMATTER_PARSE_ERROR`: stop and
+ask the user to re-run `/trekbrief`. Other soft errors become
+warnings in the review's Executive Summary.
+
+Read the brief frontmatter. Capture for review.md:
+- `task` → review frontmatter `task`
+- `slug` → review frontmatter `slug`
+- `project_dir` → review frontmatter `project_dir` (defaults to the
+  CLI `--project` value when missing)
+
+## Phase 3 — Discover scope SHA range
+
+Determine the "before" SHA that bounds the review:
+
+1. **`--since <ref>` override** — if set, validate via:
+   ```bash
+   git rev-parse --verify "$since_ref"
+   ```
+   On failure: print `Error: --since ref is not a valid git revision: {ref}` and stop.
+   Set `before_sha = $(git rev-parse --verify "$since_ref")`.
+
+2. **Preferred path** — read `{project_dir}/progress.json` if it exists.
+   Extract `session_start_sha`. Validate it via `git rev-parse --verify`.
+   Set `before_sha = session_start_sha`.
+
+3. **Fallback** — no `progress.json`. Use the brief's mtime to find the
+   most recent commit at or before the brief was written:
+   ```bash
+   brief_mtime=$(stat -f %m "{brief_path}")  # macOS; on Linux use stat -c %Y
+   before_sha=$(git log --until="@$brief_mtime" -n 1 --format=%H)
+   ```
+   Emit a clear warning that gets surfaced in the review's Executive
+   Summary: "scope_sha_start unavailable — falling back to brief mtime
+   ({timestamp}). Coverage may include unrelated commits."
+
+Compute the "after" SHA: `after_sha=$(git rev-parse HEAD)`.
+
+Capture working-tree changes (uncommitted at review time):
+```bash
+git diff --name-only "$before_sha".."$after_sha"
+git diff --name-only HEAD       # uncommitted (annotated [uncommitted])
+```
+
+The combined file list is the review scope. Note that the
+`[uncommitted]` annotation is a **brief-level contract** — the brief's
+Assumptions section declares this is allowed; the review surfaces it
+explicitly in the Coverage table.
+
+If the file count is `0`, write a one-line review.md noting "No diff
+between {before_sha} and {after_sha}; nothing to review." Verdict: ALLOW.
+Skip Phases 4–7. Continue to Phase 8 (validate + stats).
+
+## Phase 4 — Triage gate (deterministic path-pattern classifier)
+
+The triage gate is **deterministic** — no LLM judgment. It classifies
+every file from Phase 3 into a treatment bucket:
+
+| Treatment | When |
+|-----------|------|
+| `skip` | Matches `*.lock`, `*.svg`, `dist/**`, `build/**`, `node_modules/**`, OR the file's first 3 lines contain a generated-file marker (`@generated`, `Code generated by`, `DO NOT EDIT`). |
+| `deep-review` | Matches `auth/**`, `crypto/**`, `**/security/**`, `hooks/**`. |
+| `summary-only` | Default treatment for everything else. |
+
+Hard refuse-with-suggestion gates — use `AskUserQuestion`:
+
+```
+if (reviewed_files_count > 100) → ask user
+if (estimated_diff_tokens > 100000) → ask user
+```
+
+Token estimation: `wc -c "$diff_file" / 4` (rough proxy). Use
+`AskUserQuestion` with the prompt:
+
+> The diff under review is large (`{N}` files / `~{T}` tokens). Continue
+> with the full scope, narrow with `--since <closer-ref>`, or stop?
+
+Options:
+1. **Continue** — proceed at this scope.
+2. **Narrow** — print suggested `git log --oneline {before}..HEAD` so the
+   user can pick a closer ref, then stop.
+3. **Stop** — cancel.
+
+Record the treatment for every file. Files marked `skip` MUST appear in
+the Coverage section of `review.md` — never silently drop them. Silent
+drops are `COVERAGE_SILENT_SKIP` (MAJOR) per the rule catalogue.
+
+If `mode == dry-run`: print the triage map and exit.
+
+## Phase 5 — Launch parallel reviewers
+
+Launch two reviewer agents **in parallel** via the Agent tool — one
+message, multiple tool calls.
+
+Reviewers run independently. Do NOT pre-feed findings between them.
+
+| Agent | Mode-gated | Purpose |
+|-------|------------|---------|
+| `brief-conformance-reviewer` | Skipped in `quick` | Trace each Success Criterion + Non-Goal to delivered code. Emits findings tagged with rule_keys from the conformance/scope categories. |
+| `code-correctness-reviewer` | Always runs | 7-dimension code review. Emits findings tagged with rule_keys from the correctness/security/maintenance/tests categories. |
+
+Each reviewer prompt includes:
+- **Diff context** — the unified diff from Phase 3, truncated per file
+  for files marked `summary-only`.
+- **Triage map** — full file list with treatments. Reviewers must
+  respect `skip` decisions.
+- **Brief path** — `{brief_path}` (read on demand; do not inline).
+- **Rule catalogue** — reference to `lib/review/rule-catalogue.mjs`.
+
+Collect each reviewer's trailing JSON block (last fenced `json` block in
+their output). Parse with `JSON.parse()`. On parse error, ask the agent
+to re-emit the JSON only.
+
+In `quick` mode, launch only `code-correctness-reviewer`. The Executive
+Summary will note the brief-conformance pass was skipped.
+
+## Phase 6 — Coordinator dedup + verdict
+
+Launch `review-coordinator` (Agent tool) with the merged findings array
+from Phase 5 plus the triage map, brief metadata, and SHA range.
+
+The coordinator runs the 4-pass process documented in
+`agents/review-coordinator.md`:
+
+1. **Dedup** by `(file, line, rule_key)` triplet.
+2. **HubSpot Judge filters** — Succinctness, Accuracy, Actionability.
+3. **Cloudflare reasonableness** — drop speculative or catalogue-violating
+   findings (skipped in `quick` mode).
+4. **Verdict** — BLOCK / WARN / ALLOW per the threshold table.
+
+The coordinator's output is the full review.md content — frontmatter +
+body sections + trailing JSON block. Do NOT re-run the reviewers based
+on the coordinator's output.
+
+## Phase 7 — Write review.md
+
+Write the coordinator's output verbatim to:
+
+```
+{project_dir}/review.md
+```
+
+Create parent directories if they do not exist. Atomic write pattern:
+write to a temp file, then rename. The frontmatter `findings:` field
+must use **block-style YAML** (one ID per line, `  - ` prefix). The
+parser at `lib/util/frontmatter.mjs` does not support flow-style arrays.
+
+If `mode == dry-run`: skip the write; print the would-be path and the
+first 60 lines of the rendered output.
+
+## Phase 8 — Validate output + stats
+
+Run the strict validator:
+
+```bash
+node ${CLAUDE_PLUGIN_ROOT}/lib/validators/review-validator.mjs --json "{review_path}"
+```
+
+If validation fails:
+- For repairable errors (missing required body section, malformed
+  finding-ID, REVIEW_VERSION_FORMAT warning): repair in place — re-emit
+  the missing section, recompute the finding-ID, fix the version
+  string. Re-validate.
+- For unrepairable errors (REVIEW_WRONG_TYPE, malformed frontmatter):
+  stop and ask the user to re-run; do not silently produce an invalid
+  review.md.
+
+Append a stats line to `${CLAUDE_PLUGIN_DATA}/trekreview-stats.jsonl`
+(create the file if it does not exist):
+
+```json
+{"ts":"{ISO-8601}","slug":"{slug}","verdict":"BLOCK|WARN|ALLOW","counts":{"BLOCKER":N,"MAJOR":N,"MINOR":N,"SUGGESTION":N},"reviewed_files_count":N,"mode":"default|quick|validate|dry-run","duration_ms":N}
+```
+
+If `${CLAUDE_PLUGIN_DATA}` is unset or not writable, skip stats silently.
+Never let stats failures block the main workflow.
+
+## Phase 8.5 — Validate-only mode (`--validate`)
+
+When `mode == validate`:
+1. Skip Phases 3–7 entirely.
+2. Run the strict validator on `{project_dir}/review.md`.
+3. Print a one-line PASS/FAIL summary plus the JSON output on FAIL.
+4. Exit 0 on PASS, 1 on FAIL. Never write to disk. Never call any agent.
+
+## Phase 9 — Present summary
+
+After the write succeeds, print:
+
+```
+## Ultrareview Complete
+
+**Task:** {task}
+**Mode:** {default | quick | dry-run}
+**Brief:** {brief_path}
+**Project:** {project_dir}
+**Review:** {review_path}
+**Scope:** {before_sha}..{after_sha} ({reviewed_files_count} files)
+**Verdict:** {BLOCK | WARN | ALLOW}
+
+### Counts
+- BLOCKER: {N}
+- MAJOR: {N}
+- MINOR: {N}
+- SUGGESTION: {N}
+
+### Top findings
+- [{severity}] {title} ({file}:{line})
+  ...
+{up to 5 highest-severity findings}
+
+You can:
+- Read the full review at {review_path}
+- Feed BLOCKER + MAJOR findings into a follow-up plan:
+    /trekplan --brief {review_path}
+- Re-run with `--quick` for a faster correctness-only pass
+- Re-run with `--since <ref>` to narrow scope
+```
+
+Per **Handover 6**, BLOCKER and MAJOR findings are consumed by
+`/trekplan --brief review.md` to produce a remediation plan. The
+review's frontmatter `findings:` list and the trailing JSON block are
+the contract for that handover (see `docs/HANDOVER-CONTRACTS.md`).
+
+## Hard rules
+
+- **Brief is the contract.** Every finding in the review traces to a
+  brief section via `brief_ref`, except `SCOPE_CREEP_BUILT` (which
+  traces to "no anchor"). Conformance is the conformance reviewer's
+  job — code-correctness findings carry generic anchors like
+  `"NFR — code correctness"`.
+- **Independent reviewers.** Do NOT cross-feed findings between
+  brief-conformance-reviewer and code-correctness-reviewer. The
+  coordinator is the only place where outputs combine.
+- **Bounded coordination.** Synthesis-level inference across files is
+  forbidden in v1.0. The coordinator dedups, filters, and computes the
+  verdict — nothing more.
+- **Triage map respected.** Files marked `skip` MUST appear in the
+  Coverage section. Silent drops are `COVERAGE_SILENT_SKIP` (MAJOR).
+- **Block-style YAML for findings list.** The frontmatter parser does
+  not support flow-style arrays. `findings: [a, b]` is broken; use
+  `findings:\n  - a\n  - b`.
+- **Refuse-with-suggestion above 100 files / 100K tokens.** Never run
+  blind on a giant diff. Use AskUserQuestion to surface the gate.
+- **Cost.** Sonnet for all sub-agents (reviewers + coordinator). Opus
+  only runs in the main /trekreview command thread.
+- **Privacy.** Never log secrets, tokens, or credentials in review.md.
+  Findings citing files with secret-like content must redact the secret
+  in the `detail` field.
+- **Honesty.** If the diff is trivially small or all-skip, say so. Do
+  not pad findings to make the review look thorough.
+- **No production code.** This command never runs production code, never
+  writes to anything outside `{project_dir}` and `${CLAUDE_PLUGIN_DATA}`.