feat(voyage)!: marketplace handoff — rename plugins/ultraplan-local to plugins/voyage [skip-docs]

Session 5 of voyage-rebrand (V6). Operator-authorized cross-plugin scope.

- git mv plugins/ultraplan-local plugins/voyage (rename detected, history preserved)
- .claude-plugin/marketplace.json: voyage entry replaces ultraplan-local
- CLAUDE.md: voyage row in plugin list, voyage in design-system consumer list
- README.md: bulk rename ultra*-local commands -> trek* commands; ultraplan-local refs -> voyage; type discriminators (type: trekbrief/trekreview); session-title pattern (voyage:<command>:<slug>); v4.0.0 release-note paragraph
- plugins/voyage/.claude-plugin/plugin.json: homepage/repository URLs point to monorepo voyage path
- plugins/voyage/verify.sh: drop URL whitelist exception (no longer needed)

Closes voyage-rebrand. bash plugins/voyage/verify.sh PASS 7/7. npm test 361/361.
This commit is contained in:
Kjell Tore Guttormsen 2026-05-05 15:37:52 +02:00
commit 7a90d348ad
149 changed files with 26 additions and 33 deletions

View file

@ -0,0 +1,705 @@
---
name: trekbrief
description: Interactive interview that produces a task brief with explicit research plan. Feeds /trekresearch and /trekplan. Optionally orchestrates the full pipeline end-to-end.
argument-hint: "[--quick] <task description>"
model: opus
allowed-tools: Agent, Read, Glob, Grep, Write, Edit, Bash, AskUserQuestion
---
# Ultrabrief Local v2.1
Interactive requirements-gathering command. Produces a **task brief** — a
structured markdown file that declares intent, goal, constraints, and an
**explicit research plan** with copy-paste-ready `/trekresearch` commands.
Pipeline position:
```
/trekbrief → brief.md (this command)
/trekresearch --project <dir> → research/*.md
/trekplan --project <dir> → plan.md
/trekexecute --project <dir> → execution
```
The brief is the contract between the user's intent and `/trekplan`.
Every decision the plan makes must trace back to content in the brief.
**This command is always interactive.** There is no background mode — the
interview requires user input. After the brief is written, the command
optionally orchestrates the rest of the pipeline (research + plan) in
foreground if the user opts in.
## Phase 1 — Parse mode and validate input
Parse `$ARGUMENTS`:
1. If arguments start with `--quick`: set **mode = quick**. The interview
starts more compactly (fewer opening probes per section) but still
escalates automatically if quality gates fail. There is no hard cap on
question count — quality drives the loop, not a counter. Strip the flag;
remainder is the task description.
2. Otherwise: **mode = default**. Interview probes each section until the
completeness gate (Phase 3) and brief-review gate (Phase 4) both pass.
3. `--gates` flag (autonomy control, may combine with any mode): when
present, set `gates_mode = true`. This re-enables approval pauses at
every phase boundary in the downstream pipeline (research, plan,
execute) and at every wave in the executor. Default `gates_mode = false`
means auto mode runs continuously until the main-merge gate (which is
the one boundary that ALWAYS pauses, regardless of `gates_mode`). Strip
the flag from `$ARGUMENTS` before further parsing. The flag is consumed
by the autonomy-gate state machine via the CLI shim:
`node ${CLAUDE_PLUGIN_ROOT}/lib/util/autonomy-gate.mjs --state X --event Y --gates {true|false}`.
If no task description is provided, output usage and stop:
```
Usage: /trekbrief <task description>
/trekbrief --quick <task description>
Modes:
default Dynamic interview until quality gates pass — brief with research plan
--quick Compact start; still escalates on weak sections — brief with research plan
Examples:
/trekbrief Add user authentication with JWT tokens
/trekbrief --quick Add rate limiting to the API
/trekbrief Migrate from Express to Fastify
```
Report:
```
Mode: {default | quick}
Task: {task description}
```
## Phase 2 — Generate slug and create project directory
Generate a slug from the task description: first 3-4 meaningful words,
lowercase, hyphens. Example: "Migrate from Express to Fastify" → `fastify-migration`.
Set today's date as `YYYY-MM-DD` (UTC).
Create the project directory:
```bash
PROJECT_DIR=".claude/projects/{YYYY-MM-DD}-{slug}"
mkdir -p "$PROJECT_DIR/research"
```
Report:
```
Project directory: .claude/projects/{YYYY-MM-DD}-{slug}/
```
If the directory already exists and is non-empty, warn and ask:
> "Directory {path} exists. Overwrite, reuse (keep existing files), or pick new slug?"
Use `AskUserQuestion` with three options. If "pick new slug", ask for a
new slug and restart Phase 2.
## Phase 3 — Completeness loop
Phase 3 is a **section-driven completeness loop**. Instead of a numbered
question list, maintain an internal state of brief sections and keep asking
until every required section has substantive content. Quality drives the
loop — there is no hard cap on question count.
Use `AskUserQuestion` for every question. **Ask one question at a time.**
Never dump multiple questions.
### Internal state
Track this structure in memory as the loop runs:
```
state = {
intent: { content: "", probes: 0 }, # required
goal: { content: "", probes: 0 }, # required
success_criteria: { content: [], probes: 0 }, # required
research_plan: { topics: [], probes: 0 }, # required
non_goals: { content: [], probes: 0 }, # optional
constraints: { content: [], probes: 0 }, # optional
preferences: { content: [], probes: 0 }, # optional
nfrs: { content: [], probes: 0 }, # optional
prior_attempts: { content: "", probes: 0 }, # optional
question_history: [] # list of questions asked
}
```
`content` is raw user answers merged; `probes` is how many times this
section has been asked; `question_history` prevents re-asking the same
variant twice.
### Required sections (initial-signal gate)
Four sections MUST have substantive content before exiting Phase 3:
1. **Intent** — full sentence or paragraph (not a single word or phrase)
2. **Goal** — full sentence or paragraph
3. **Success Criteria** — at least one concrete, testable item
4. **Research Plan** — either ≥ 1 topic probed, OR the user has explicitly
confirmed "no external research needed"
"Substantive" means: non-empty, not a trivial one-word reply, not
"I don't know" without a recorded assumption. The strict falsifiability
check happens in Phase 4 (brief-review gate); Phase 3 is just the
initial-signal bar.
Optional sections (Non-Goals, Constraints, Preferences, NFRs, Prior
Attempts) do not gate exit. If they remain empty after the required
sections pass, they will be recorded as "Not discussed — no constraints
assumed" in Phase 4's draft.
### Question bank (per section)
Pick the next question from the section's bank based on `content` and
`probes`. Wording must stay conversational — only the *selection* is
section-driven, not the tone.
**Intent** (required):
- _Anchor_ (probes=0, content empty): "Why are we doing this? What is the
motivation, the user need, or the strategic context behind the task?"
- _Follow-up_ (probes≥1, content present but shallow): "What happens if
we do nothing? Who is affected?"
- _Sharpen_ (user mentioned a symptom): "You mentioned {X}. Is {X} the
symptom or the underlying cause?"
**Goal** (required):
- _Anchor_: "Describe the end state in 13 sentences — specific enough to
disagree with."
- _Follow-up_: "How would you recognize this is done when looking at
the UI / API / codebase?"
**Success Criteria** (required):
- _Anchor_: "How do we verify it is actually done? List 24 concrete,
testable conditions — commands to run, observations, or metrics."
- _Sharpen_ (criterion is vague): "'{quoted criterion}' is subjective.
Which command, observation, or metric would prove this is met?"
- _Quantify_ (performance/quality claim): "You mentioned it should be
{fast/reliable/secure}. What number or threshold counts as success?"
**Research Plan** (required, strictest):
- _Anchor_ (no topics yet): "Are there technologies, libraries, or
decisions in this task you do not have solid current knowledge of?
Examples might be library choice, a protocol, or a security pattern."
- _Per-topic sharpen_ (topic exists but incomplete): "For topic
'{title}': which parts of the plan depend on the answer? What
confidence level do you need — high, medium, or low?"
- _Scope question_: "Is '{topic}' answerable from the existing codebase,
from external docs, or both?"
- _Confirm none_ (user refuses all topics): "Confirming: no external
research needed — you already know everything the plan will depend on?"
**Non-Goals** (optional):
- _Anchor_: "What is explicitly NOT in scope? This prevents scope-guardian
from flagging gaps for things we deliberately don't do."
**Constraints** (optional):
- _Anchor_: "Technical, time, or resource constraints the plan must
respect? Dependencies, compatibility, deadlines, or budget."
- _Sharpen_: "You mentioned {deadline / budget / compatibility}. Is it
hard or guidance?"
**Preferences** (optional):
- _Anchor_: "Preferences for libraries, patterns, or architectural style?"
**NFRs** (optional):
- _Anchor_: "Performance, security, accessibility, or scalability targets?
Quantified wherever possible."
**Prior Attempts** (optional):
- _Anchor_: "Has this been attempted before? What worked or failed?"
### Selection rule
On each loop iteration:
1. Compute the next section to probe:
- If any required section is below the initial-signal gate → pick the
weakest required section in this priority order:
Intent → Goal → Success Criteria → Research Plan.
- Else if an optional section is clearly missing and likely material
to scope (heuristic: the task description hints at constraints or
NFRs) → probe it at most once.
- Else: exit Phase 3.
2. Within the chosen section, pick the question variant:
- If `probes == 0` and content is empty → _Anchor_.
- If content exists but is shallow → _Follow-up_ or _Sharpen_.
- If the section is Research Plan and topics exist → iterate per-topic
sharpen across incomplete topics.
3. Ensure the exact question is NOT already in `question_history`. If it
is, pick the next variant or skip to the next weakest section.
4. Ask via `AskUserQuestion`. Append question to history. Increment probes.
5. Record the answer into `content`. Never overwrite — merge.
### Research topic identification
As the user answers Intent, Goal, or Success Criteria, listen for:
- **Unfamiliar technologies** — libraries, frameworks, protocols not
clearly present in the codebase
- **Version upgrades** — migrating to a new major version
- **Security-sensitive decisions** — auth, crypto, data handling
- **Architectural choices** — pattern X vs Y, library A vs B
- **Unknown integrations** — third-party APIs, external services
- **Compliance / legal** — GDPR, accessibility, industry regulations
When you hear one, add a *candidate* topic to `research_plan.topics` with
only a title and why-it-matters. Probe it on the next Research Plan
iteration using the per-topic sharpen question to fill in:
- Research question (must end in `?`)
- Required for plan steps
- Scope (local / external / both)
- Confidence needed (high / medium / low)
- Estimated cost (quick / standard / deep)
If the user says "I know this" to a candidate topic, remove it from the
list. Trust the user. If no topics emerge after probing, the user confirms
"no external research needed" → `research_plan` gate passes with 0 topics.
### Quick mode adjustments
If **mode = quick**:
- For optional sections, cap probes at 1 each. Do not revisit optional
sections during Phase 3.
- Required sections still have no probe cap — quality gate still applies.
- Prefer _Anchor_ variants over _Sharpen_ on the first pass.
### Force-stop path
If the user says "skip", "stop asking", "just proceed", "enough", or
similar, break the loop immediately:
- Mark any required sections still below the initial-signal gate as
`{ incomplete_forced_stop: true }` in state.
- Proceed to Phase 4 with a note that the brief will carry a reduced
confidence flag.
### Exit condition
Exit Phase 3 when:
- All four required sections meet the initial-signal gate, OR
- The user has force-stopped.
Report:
```
Phase 3 complete: {N} questions asked across {M} sections.
Proceeding to draft and review.
```
## Phase 4 — Draft, review, and revise
Phase 4 runs a **draft → brief-reviewer → revise** loop. The draft is
not written to disk until the brief-review quality gate passes (or the
iteration cap is hit). This ensures the brief that reaches `/trekplan`
has already survived a critical review.
Read the brief template first:
`@${CLAUDE_PLUGIN_ROOT}/templates/trekbrief-template.md`
### Loop bound
**Maximum 3 review iterations.** This bounds cost in the worst case while
leaving room for two rounds of targeted follow-ups.
### Iteration step-by-step
**Step 4a — Draft in memory**
Build the brief text from Phase 3 state by filling the template:
- **Frontmatter:** populate `task`, `slug`, `project_dir`, `research_topics`
(count of topics), `research_status: pending`, `auto_research: false`
(will update in Phase 5 if user opts in), `interview_turns` (total
questions asked across Phase 3 + Phase 4), `source: interview`.
- **Intent:** expand the user's motivation into 35 sentences. Load-bearing.
- **Goal:** concrete end state.
- **Non-Goals:** from state, or "- None explicitly stated" bullet if empty.
- **Constraints / Preferences / NFRs:** from state, or "Not discussed — no
constraints assumed" note if empty.
- **Success Criteria:** falsifiable commands/observations from state.
- **Research Plan:** one `### Topic N: {title}` section per topic with the
full structure from the template. If 0 topics: write the "No external
research needed — user confirmed solid knowledge of all plan
dependencies" note.
- **Open Questions / Assumptions:** from any `"I don't know"` answers
recorded during Phase 3, plus implicit gaps.
- **Prior Attempts:** from state, or "None — fresh task."
**Step 4b — Write draft to disk**
Write the draft to `{PROJECT_DIR}/brief.md.draft` (not `brief.md` — the
final file is only written after the gate passes).
**Step 4c — Launch brief-reviewer**
Launch the `brief-reviewer` agent (foreground, blocking) with the prompt:
> "Review this task brief for quality: `{PROJECT_DIR}/brief.md.draft`.
> Check completeness, consistency, testability, scope clarity, and
> research-plan validity. Report findings, verdict, and the required
> machine-readable JSON block."
**Step 4d — Parse JSON scores**
Parse the agent's output. Locate the **last** fenced ```json``` block.
Extract per-dimension scores:
```
review = {
completeness: { score, gaps },
consistency: { score, issues },
testability: { score, weak_criteria },
scope_clarity: { score, unclear_sections },
research_plan: { score, invalid_topics },
verdict: "PROCEED | PROCEED_WITH_RISKS | REVISE"
}
```
**JSON fallback:** if the JSON block is missing, invalid, or a dimension
is missing, treat all dimensions as `score: 3` and set the `verdict` from
the prose verdict if present, otherwise `PROCEED_WITH_RISKS`. Emit an
internal note that the reviewer output was degraded. This ensures the
loop never deadlocks on a parser error.
**Step 4e — Gate evaluation**
The gate **passes** when all of the following are true:
- `completeness.score ≥ 4`
- `consistency.score ≥ 4`
- `testability.score ≥ 4`
- `scope_clarity.score ≥ 4`
- `research_plan.score == 5`
(Research Plan requires a perfect score because its format is checked
mechanically: ends in `?`, `Required for plan steps` filled, scope is
one of `local | external | both`, confidence is `high | medium | low`.
Anything less means at least one topic is malformed and planning will
stumble.)
**If gate passes:**
1. Move `brief.md.draft``brief.md` (atomic rename).
2. Delete the draft file if rename is not possible on the OS; write
`brief.md` fresh.
3. Break the loop and proceed to Step 4g.
**If gate fails AND iteration count < 3:**
1. Identify the weakest dimension (lowest score; tie broken by priority:
research_plan > testability > completeness > consistency > scope_clarity).
2. Generate a targeted follow-up question from the dimension's detail
field (gaps / issues / weak_criteria / unclear_sections / invalid_topics).
Example generators:
- `completeness.gaps: ["Non-Goals empty, unclear if deliberate"]`
→ "You did not specify anything out-of-scope. Is that deliberate, or
are there things we should explicitly exclude?"
- `testability.weak_criteria: ["'system should be fast'"]`
→ "'System should be fast' is not falsifiable. Which metric or
threshold proves this is met — e.g., p95 < 200ms, or throughput
≥ X requests/sec?"
- `research_plan.invalid_topics: [{"topic":"JWT","issue":"Required for plan steps empty"}]`
→ "For research topic 'JWT': which plan steps depend on the answer?
Give one or two concrete kinds of step (e.g., 'library selection',
'threat model', 'migration strategy')."
3. Ask via `AskUserQuestion`. Record the answer into Phase 3 state.
4. Return to Step 4a with incremented iteration count. The reviewer sees
an updated draft, so you MUST re-read the brief and regenerate the
review each iteration — do not reuse stale scores.
5. When launching the reviewer on iteration 2 or 3, include prior
questions in the prompt so it does not produce circular follow-ups:
> "Questions already asked during this interview: {list from
> question_history}. Focus on issues that remain after those answers —
> do not re-raise gaps that have already been addressed."
**If gate fails AND iteration count == 3 (loop exhausted):**
1. Move `brief.md.draft``brief.md`.
2. Add `brief_quality: partial` to the frontmatter (edit the file
post-rename — insert the key above the closing `---`).
3. Add a `## Brief Quality` section near the end with the failing
dimensions and their `detail` arrays from the final review, formatted:
```
## Brief Quality
Review loop exhausted after 3 iterations. The following dimensions
did not reach the pass threshold:
- **Research Plan (score 2/5):** Topic 'JWT library' missing
Required-for-plan-steps field.
- **Testability (score 3/5):** Success criterion "works correctly"
is not falsifiable.
Downstream planning will treat these as reduced-confidence areas.
```
4. Break the loop and proceed to Step 4g.
### Step 4f — Force-stop handling
If during any `AskUserQuestion` in Step 4e the user says "stop", "skip",
"enough", "just write it", or similar, do NOT exit the loop immediately.
Instead, surface the current review findings in plain text:
```
Brief-reviewer would flag these issues:
- Research Plan (score 2/5): Topic 'JWT library choice' missing Required-for-plan-steps field.
- Testability (score 3/5): Success criterion "works correctly" is not falsifiable.
Continue anyway? The plan will have lower confidence in these areas.
```
Then ask via `AskUserQuestion`:
| Option | Action |
|--------|--------|
| **Answer one more follow-up** | Return to Step 4e with the current weakest-dimension question. |
| **Stop now (accept partial brief)** | Finalize brief with `brief_quality: partial` and the `## Brief Quality` section (same path as iteration-cap exhaustion). Break loop. |
The force-stop path is distinct from a silent iteration cap: the user
sees exactly which dimensions are weak and chooses informed.
### Step 4g — Finalize
After the loop exits (pass, cap, or force-stop), ensure:
- `brief.md` exists at `{PROJECT_DIR}/brief.md`.
- `brief.md.draft` no longer exists.
- If the loop ended without a clean pass, frontmatter contains
`brief_quality: partial` and a `## Brief Quality` section exists.
- If the loop ended with a clean pass, `brief_quality` is either
absent or set to `complete`.
Populate the "How to continue" footer with the actual project path and
topic questions.
**Schema sanity check (since v3.1.0):** before reporting, run the brief
validator. This catches frontmatter typos and state-machine inconsistencies
the brief-reviewer rubric does not check (e.g. `research_status: skipped`
with `research_topics: 3` and no `brief_quality: partial`).
```bash
node ${CLAUDE_PLUGIN_ROOT}/lib/validators/brief-validator.mjs --json "{PROJECT_DIR}/brief.md"
```
If the validator returns errors, report them to the user and offer to
re-enter Phase 4 with the validator's hints in scope. If only warnings,
note them in the final report.
Report:
```
Brief written: {PROJECT_DIR}/brief.md
Review iterations: {1..3}
Final quality: {complete | partial}
Validator: {PASS | warnings(N)}
Research topics identified: {N}
```
## Phase 5 — Auto-orchestration opt-in (if research_topics > 0)
**Skip this phase if research_topics = 0.** Proceed directly to Phase 6.
Ask the user via `AskUserQuestion`:
**Question:** "You have {N} research topic(s). How do you want to proceed?"
| Option | Description |
|--------|-------------|
| **Manual (default)** | Print the commands. You run `/trekresearch` and `/trekplan` yourself, choosing depth per topic. |
| **Auto (managed by Claude Code)** | I run all {N} research topics sequentially in foreground, then automatically trigger `/trekplan` when research completes. This session blocks until the plan is ready. |
### Manual path (default)
Output:
```
## Brief complete
Project: {PROJECT_DIR}/
Brief: {PROJECT_DIR}/brief.md
Research topics: {N}
Next steps (run in order or parallel):
{For each topic:}
/trekresearch --project {PROJECT_DIR} --external "{topic question}"
Then:
/trekplan --project {PROJECT_DIR}
Then:
/trekexecute --project {PROJECT_DIR}
```
Stop. Do not continue to Phase 6.
### Auto path
Set `auto_research: true` in the brief's frontmatter (edit the file).
Emit the brief-approved lifecycle event so downstream observability sees
the pipeline kick off (consumed by `lib/stats/event-emit.mjs`):
```bash
node ${CLAUDE_PLUGIN_ROOT}/lib/stats/event-emit.mjs \
--event brief-approved \
--payload "{\"project\":\"${PROJECT_DIR}\"}"
```
If `gates_mode == true`: pause here via `AskUserQuestion`
"Auto-mode confirmed. Proceed to research now? (yes/no)". If the user
answers no, fall back to the manual path output and stop. Otherwise
proceed to Phase 6.
If `gates_mode == false` (default in auto): proceed directly to Phase 6.
The chain stops only at the main-merge gate (see `commands/trekexecute.md`
Phase 8).
Proceed to Phase 6.
## Phase 6 — Auto research dispatch (auto path only)
**Runs only when user opted into auto mode.**
### Step 6a — Confirm proceed
Tell the user auto mode will run in foreground and block the session, then
confirm via `AskUserQuestion`:
**Question:** "Auto mode runs {N} research topic(s) sequentially and then
the plan — all in foreground. This session blocks until the plan is ready.
Continue?"
| Option | Action |
|--------|--------|
| **Continue — auto** | Proceed. |
| **Cancel — do manual** | Revert to manual path (print commands, stop). |
If cancelled → fall back to manual path output and stop.
### Step 6b — Run research topics sequentially (inline)
Set `research_status: in_progress` in the brief's frontmatter.
For each research topic (index i = 1 .. N), invoke `/trekresearch`
inline in this main-context session:
```
/trekresearch --project {PROJECT_DIR} {--external | --local | (none)} "{topic i question}"
```
Pass the scope flag that matches the topic's scope hint. Wait for each
invocation to finish writing the research brief at
`{PROJECT_DIR}/research/{NN}-{topic-slug}.md` before moving to the next
topic.
> **Why sequential inline instead of parallel background?** Background
> orchestrator-agents cannot spawn the research swarm — the Claude Code
> harness does not expose the Agent tool to sub-agents, so a background
> run silently degrades to single-context reasoning without WebSearch /
> Tavily / WebFetch / Gemini (see v2.4.0 release notes). Running each
> research pass inline in main context keeps the swarm intact. For true
> parallel execution, use `claude -p` invocations in separate terminal
> windows.
### Step 6c — Verify all briefs landed
After the last topic completes, verify each research brief file exists:
```bash
ls -1 {PROJECT_DIR}/research/*.md | wc -l
```
Expected count: N. If any are missing, report and ask the user how to
proceed (retry, skip missing topic, cancel).
Update brief frontmatter: `research_status: complete`.
### Step 6d — Auto-trigger planning (inline foreground)
Invoke the planning command inline in this session:
```
/trekplan --project {PROJECT_DIR}
```
The planning pipeline runs all phases (exploration, synthesis, review) in
main context. Wait for the plan to be written to `{PROJECT_DIR}/plan.md`
before continuing.
### Step 6e — Report completion
When the planning-orchestrator finishes, present:
```
## Ultrabrief + Ultraresearch + Voyage Complete (auto mode)
**Project:** {PROJECT_DIR}/
**Brief:** {PROJECT_DIR}/brief.md
**Research briefs:** {N} in {PROJECT_DIR}/research/
**Plan:** {PROJECT_DIR}/plan.md
### Pipeline summary
| Step | Status |
|------|--------|
| Brief | Complete ({interview_turns} interview turns) |
| Research | Complete ({N} topics, sequential foreground) |
| Plan | Complete ({steps} steps, critic: {verdict}) |
Next:
/trekexecute --project {PROJECT_DIR}
Or:
/trekexecute --dry-run --project {PROJECT_DIR} # preview
/trekexecute --validate --project {PROJECT_DIR} # schema check
```
## Phase 7 — Stats tracking
Append one record to `${CLAUDE_PLUGIN_DATA}/trekbrief-stats.jsonl`:
```json
{
"ts": "{ISO-8601}",
"task": "{task description (first 100 chars)}",
"slug": "{slug}",
"mode": "{default | quick}",
"interview_turns": {N},
"review_iterations": {1..3},
"brief_quality": "{complete | partial}",
"research_topics": {N},
"auto_research": {true | false},
"auto_result": "{completed | cancelled | failed | manual}",
"project_dir": "{path}"
}
```
If `${CLAUDE_PLUGIN_DATA}` is not set or not writable, skip silently.
Never let stats failures block the workflow.
## Hard rules
1. **Interactive only.** This command requires user input. There is no
`--fg` or background mode — the interview cannot run headless.
2. **Brief is the contract.** Every section must have substantive content
or an explicit "Not discussed" note. No empty sections.
3. **Intent is load-bearing.** Do not accept a one-line intent. Expand with
the user until motivation is clear — the plan and every review agent
will trace decisions back to this.
4. **Research topics must be answerable.** Each topic's research question
must be phrased so `/trekresearch` can answer it. If a topic is
too vague, split or reformulate before writing.
5. **Never invent research topics the user did not agree to.** Topics
come from the interview. If the user says "I know this", respect it.
6. **Project dir is the single source of truth.** Every artifact (brief,
research briefs, plan, progress) lives in one project directory.
Never scatter files across `.claude/research/`, `.claude/plans/`, etc.
7. **Auto mode blocks foreground.** If the user opts into auto, this
session waits for research + planning to complete. Document this in
the opt-in question.
8. **Quality gates, not question counts.** Phase 3 and Phase 4 are
quality-gated loops; do not enforce a hard cap on interview questions.
The brief-review gate (Phase 4) caps at 3 review iterations to bound
cost, but Phase 3 has no cap — required-section content drives exit.
9. **Never write `brief.md` while the review gate is still pending.**
Draft lives in `brief.md.draft` until the loop terminates. A caller
that sees `brief.md` must be able to trust that Phase 4 finished.
10. **Privacy:** never log prompt text, secrets, or credentials.

View file

@ -0,0 +1,307 @@
---
name: trekcontinue
description: Resume the next session in a multi-session trekplan project. Reads .session-state.local.json and immediately begins the next session.
argument-hint: "[<project-dir> | --help]"
model: opus
---
# Ultracontinue Local v1.0
Zero-friction multi-session resumption. In a fresh Claude Code session, type
`/trekcontinue` — the command reads the per-project state file
(`.claude/projects/<project>/.session-state.local.json`), shows a 3-line summary,
and immediately begins executing the next session.
The state file is the contract. Any session-end mechanism may write it
(`/trekexecute` Phase 8 / Phase 2.55 / Phase 4, the
`/trekendsession` helper, or — in the future — `graceful-handoff`).
This command only reads.
Pipeline position:
```
/trekplan → plan.md
/trekexecute → progress.json + .session-state.local.json
... session boundary, fresh chat ...
/trekcontinue → reads .session-state.local.json, starts next session
```
See **Handover 7** in `docs/HANDOVER-CONTRACTS.md` for the full schema.
## Phase 0 — `--help` handling
Parse `$ARGUMENTS` with `parseArgs($ARGUMENTS, 'trekcontinue')` from
`lib/parsers/arg-parser.mjs`. Dispatch the usage block ONLY when one of these
two conditions equals exactly true (no substring search, no "contains" check):
- `flags['--help'] === true`, OR
- `positional[0] === '-h'` (single-dash short form — the parser keeps it as
positional because the schema does not declare an alias).
In every other case — including when `$ARGUMENTS` is empty, whitespace-only,
the literal empty string `""`, or a positional project-dir — fall through to
Phase 1. Do NOT print the usage block on empty args.
```
/trekcontinue — Resume the next session in a multi-session trekplan project.
Usage:
/trekcontinue # auto-discover state file under cwd
/trekcontinue <project-dir> # explicit project directory
/trekcontinue --cleanup <project-dir> # dry-run: list stale files
/trekcontinue --cleanup --confirm <project-dir> # actually delete (requires status: completed)
/trekcontinue --help # this message
Reads .claude/projects/<project>/.session-state.local.json (per-project,
gitignored). On a valid resumable state, prints a 3-line summary and begins
executing the next session immediately. No interactive confirmation prompt.
State-file schema (v1):
schema_version 1
project string
next_session_brief_path string (validator soft-checks file existence)
next_session_label string
status in_progress | partial | failed | stopped | completed
(completed → no further sessions to resume)
updated_at ISO-8601 timestamp
(unknown top-level keys are tolerated — forward-compat for graceful-handoff v2.2)
Typical flow:
/trekbrief # produces brief.md
/trekplan --project ... # produces plan.md
/trekexecute --project .. # writes session-state on session-end
... (fresh Claude chat) ...
/trekcontinue # reads session-state, runs next session
```
## Phase 0.5 — Cleanup mode dispatch
After `parseArgs` has resolved `$ARGUMENTS`, check the parsed `flags`
object directly (NOT a string contains-check on raw `$ARGUMENTS` — that
substring pattern was the root cause of Bug 1).
If `flags['--cleanup'] === true`, switch into the terminal cleanup
flow and do NOT proceed to Phase 1 or any later phase.
**Required positional:** an explicit `<project-dir>` (`positional[0]`).
There is no "clean all" mode — accidental wholesale deletion would be
irreversible. If `positional[0]` is missing, empty, or starts with `-`,
print this usage block to stderr and exit non-zero:
```
Error: /trekcontinue --cleanup requires <project-dir>.
Usage:
/trekcontinue --cleanup <project-dir> # dry-run: list stale files
/trekcontinue --cleanup --confirm <project-dir> # actually delete (status: completed)
```
**Compute mode from parsed flags:**
```
dryRun = (flags['--confirm'] !== true)
confirm = (flags['--confirm'] === true)
```
**Invoke cleanup inline.** Emit the concrete project-dir path as a literal
token in the Bash command — never a template placeholder — same
anti-substitution rule as Phase 2:
```
node --input-type=module -e "import {cleanupProject} from './lib/util/cleanup.mjs'; const [, dir, mode] = process.argv; const r = cleanupProject(dir, {dryRun: mode !== 'confirm', confirm: mode === 'confirm'}); console.log(JSON.stringify(r, null, 2)); process.exit(r.valid ? 0 : 1)" '<RESOLVED-PROJECT-DIR>' '<MODE>'
```
Substitute `<RESOLVED-PROJECT-DIR>` with the literal `positional[0]`
value you have in your working context, and `<MODE>` with either the
literal string `dryrun` or the literal string `confirm` based on the
booleans above. The validator emits a `{valid, errors, warnings, parsed}`
JSON record. Print it to stdout. Exit with the validator's exit code.
**Cleanup is a terminal mode.** It must not fall through to Phase 1/2/3/4.
Operators who want to resume after cleanup must invoke `/trekcontinue`
again without `--cleanup`.
## Phase 1 — Resolve project directory
The parsed `positional[0]` from Phase 0 is the explicit project-dir argument,
when present. Otherwise (empty `$ARGUMENTS` or whitespace-only) auto-discover.
### Step 1.a — Reject `.md` positional argument (SC-2)
If `positional[0]` is non-empty AND ends in `.md`, the user almost certainly
pasted a `NEXT-SESSION-PROMPT.local.md` path instead of a project directory.
Print the following diagnostic to stderr and exit non-zero. Do NOT proceed.
```
Error: expected <project-dir>, got a markdown file path: <positional[0]>
Did you mean to paste the file path as a project directory?
Usage: /trekcontinue <project-dir>
```
### Step 1.b — Auto-discover candidates
When no explicit project-dir was given, enumerate
`.claude/projects/*/.session-state.local.json` paths with `node -e`
(NOT shell glob — harness-mode safety) and emit each as one JSON line of
`{"path": ..., "updated_at": ...}` so Phase 1 can sort numerically:
```bash
!`node -e "const fs=require('fs'),path=require('path');const root='.claude/projects';if(!fs.existsSync(root))process.exit(0);for(const d of fs.readdirSync(root)){const p=path.join(root,d,'.session-state.local.json');if(!fs.existsSync(p))continue;let u='';try{u=(JSON.parse(fs.readFileSync(p,'utf8'))||{}).updated_at||''}catch(_){};process.stdout.write(JSON.stringify({path:p,updated_at:u})+'\\n');}"`
```
Sort the emitted candidates by `Date.parse(updated_at)` descending (newest
first) — numeric comparison, NOT lexicographic string compare. The newest
resumable state wins.
### Step 1.c — Decision tree
- **0 candidates and no explicit arg:** print SC-2 cold-start message and exit:
```
No active multi-session project here.
Start with /trekbrief or /trekplan.
```
- **1 candidate (or explicit non-`.md` arg):** continue to Phase 2 with that path.
- **>1 candidates and no explicit arg:** with the Date.parse sort applied, the
newest resumable state wins automatically and the command continues to Phase 2
with that path. (Operators who want a different candidate re-invoke as
`/trekcontinue <project-dir>`.)
## Phase 1.5 — Frontmatter consistency check
Bug 3 contract: producers (`/trekexecute`, `/trekendsession`)
write `NEXT-SESSION-PROMPT.local.md` with YAML frontmatter (`produced_by:`,
`produced_at:`). Multiple producers may have written candidates in different
locations; this phase refuses ambiguity before validating the state file.
After resolving the project directory and state-file path, look for two
`NEXT-SESSION-PROMPT.local.md` candidates:
a. `<plugin-root>/NEXT-SESSION-PROMPT.local.md` — operator-managed master file
b. `<project-dir>/NEXT-SESSION-PROMPT.local.md` — producer-written sibling
**If both exist:**
- Read both via the **Read tool** (NOT Bash — same anti-substitution rule
as Phase 2).
- Invoke the consistency validator with both paths emitted as concrete
literal tokens (no template substitution at the Bash boundary):
```
node lib/validators/next-session-prompt-validator.mjs --json --consistency <RESOLVED-PATH-A> <RESOLVED-PATH-B>
```
Replace `<RESOLVED-PATH-A>` and `<RESOLVED-PATH-B>` with the two concrete
filesystem paths you have in your working context. The validator emits
`{valid, errors, warnings}` JSON on stdout.
- **If `valid: false`** (typically `NEXT_SESSION_PROMPT_PRODUCER_MISMATCH`):
print the structured `errors[]` (each `[code] message` on its own line),
list both candidate paths, and exit non-zero. Do NOT proceed to Phase 2.
Resolve the conflict by deleting the stale candidate (run
`/trekcontinue --cleanup --confirm <project-dir>` after the
current session closes, or remove by hand).
- **If `valid: true` with a `NEXT_SESSION_PROMPT_WALL_CLOCK_DRIFT` warning**
(one of the candidates is more than 24h old): print the warning to stderr
but continue — long pauses (weekend, vacation) are not failures.
- **If `valid: true` with a `NEXT_SESSION_PROMPT_STALE_IGNORED` warning**
(one candidate is older than the state file's `updated_at`): print the
warning and continue. The state-anchored check is the primary refusal
signal; staleness simply rejects the older candidate.
**If only one exists:** continue to Phase 2. No comparison needed.
**If neither exists:** continue to Phase 2. Legacy projects and first-run
flows have no NEXT-SESSION-PROMPT files.
## Phase 2 — Validate the state file
Phase 1 resolved a concrete state-file path. That path is a real string in
your working context — never a template. Phase 2 must read and validate the
state file without any placeholder substitution.
### Step 2.a — Read the file with the Read tool (no Bash)
Use the **Read tool** on the resolved state-file path from Phase 1. Do NOT use
Bash for the read. The Read tool is deterministic and not subject to
shell-substitution errors. Parse the returned JSON body programmatically.
### Step 2.b — Schema-validate the parsed object
Verify the schema by invoking the existing validator CLI shim. Emit the
resolved absolute path as a literal string token in the Bash command — the
exact same string you just passed to the Read tool. The validator accepts
`--json <path>` and prints a `{valid, errors, warnings}` JSON record:
```
node lib/validators/session-state-validator.mjs --json <RESOLVED-ABSOLUTE-PATH-FROM-PHASE-1>
```
Replace `<RESOLVED-ABSOLUTE-PATH-FROM-PHASE-1>` with the actual path string at
the time you issue the Bash call. There is no template engine; the string is
substituted by you, the model, before the Bash tool sees the command.
**Anti-substitution invariant.** If you ever find yourself about to emit a
literal angle-bracket placeholder, or any other unresolved variable name, to
the Bash tool — STOP. The resolved path is a concrete value you already have
from Phase 1; emit the value, not a placeholder for it.
### Step 2.c — Interpret the result
- **Validator exit code != 0 OR `valid: false` in JSON output:** print the
structured `errors[]` (each `[code] message` on its own line) and exit. Do
not proceed to narration. Suggest running the validator directly for
follow-up: `node lib/validators/session-state-validator.mjs <path>`.
- **`valid: true` AND any warning has `code: SESSION_STATE_NOT_RESUMABLE`**
(i.e. `status: completed`): print "no further sessions to resume; project
complete" and exit cleanly.
- **`valid: true` AND status is one of `in_progress | partial | failed | stopped`:**
proceed to Phase 3.
## Phase 3 — Narrate 3-line summary
Print this exact template (using values from the validated `parsed` object):
```
Project: {project}
Next session: {next_session_label}
Brief: {next_session_brief_path}
```
No interactive confirmation prompt — per the brief NFR ("ingen prompts, men la
informasjon synes"). The 3-line block is informational only.
## Phase 4 — Begin execution
Read the file at `next_session_brief_path` (it is the brief that the next
session is supposed to execute — typically the same `brief.md` for
single-brief multi-session plans, or a session-specific spec for parallel
session decomposition). Understand the task and begin executing per the
standard trekplan pipeline. The user did not type a separate "start"
command — `/trekcontinue` is the start.
If the brief file does not exist (validator emits a warning but does not
fail), print: `Warning: next_session_brief_path "{path}" does not exist on
disk. Cannot continue automatically.` and exit. Do not guess.
## Phase 5 — Stats tracking
Append a one-line JSON record to `${CLAUDE_PLUGIN_DATA}/trekcontinue-stats.jsonl`
if the env var is set; silently skip otherwise.
```json
{"ts":"<iso-8601>","project":"<project>","next_session_label":"<label>","status":"<status>"}
```
## Hard rules
- **Idempotent.** Running `/trekcontinue` twice in the same Claude session
does not advance state — the writer (Phase 8 / hook / helper) advances state
only when a session completes.
- **Zero secrets in the state file.** Status, paths, labels — never API keys,
never user content beyond filenames.
- **NEVER auto-load via SessionStart.** The command is operator-invoked only.
Auto-loading would re-introduce the stale-file risk noted in
`feedback_next_session_prompt_manual.md`.
- **No interactive prompts.** Phases 04 must run without `AskUserQuestion`.
This keeps the command headless-safe.

View file

@ -0,0 +1,172 @@
---
name: trekendsession
description: Mark the current session as complete and write session-state pointing at the next session. Helper for informal multi-session flows.
argument-hint: "<next-brief-path> <next-label> | --help"
model: sonnet
---
# Voyage End-Session Local v1.0
Tiny helper for **informal** multi-session flows (no formal plan with
Execution Strategy). Writes a `.session-state.local.json` pointing at the
next session so `/trekcontinue` can resume in a fresh Claude chat.
For formal flows (a plan produced by `/trekplan --project`),
`/trekexecute` Phase 8 already writes the state file — this helper
is unnecessary there. Use this command for ad-hoc release runs, manual
multi-session handovers, or any flow that does not run through
`/trekexecute`.
Pipeline position:
```
... session N work ...
/trekendsession <brief> "<next-label>" → writes state
... session boundary, fresh chat ...
/trekcontinue → reads state, starts session N+1
```
See **Handover 7** in `docs/HANDOVER-CONTRACTS.md` for the schema.
## Phase 0 — `--help` handling
If `$ARGUMENTS` contains `--help` or `-h`, print the usage block below and exit
cleanly. Do NOT proceed to any further phase.
```
/trekendsession — Mark current session done; point at next session.
Usage:
/trekendsession <next-brief-path> <next-label>
/trekendsession --help
Both arguments are REQUIRED. No interactive prompt — headless-safe.
Writes <project-dir>/.session-state.local.json with:
schema_version 1
project <auto-resolved from cwd>
next_session_brief_path <next-brief-path argument>
next_session_label <next-label argument>
status in_progress
updated_at <now, ISO-8601>
Then validates via lib/validators/session-state-validator.mjs and prints
the same 3-line narration that /trekcontinue will show in the next session.
Example:
/trekendsession .claude/projects/2026-05-01-feature/brief.md "Session 2 of 3"
```
## Phase 1 — Resolve project directory
Resolve the nearest `.claude/projects/*/brief.md` from cwd (the current working
directory). Use `node -e` enumeration (NOT shell glob — harness-mode safety):
```bash
!`node --input-type=module -e "import {existsSync, readdirSync} from 'node:fs'; import {join} from 'node:path'; const root='.claude/projects'; if(!existsSync(root)) process.exit(0); readdirSync(root).filter(d=>existsSync(join(root,d,'brief.md'))).forEach(d=>process.stdout.write(join(root,d)+'\\n'));"`
```
Decision tree:
- **0 candidates:** print error to stderr — "no `.claude/projects/<dir>/brief.md`
found under cwd; cannot determine project directory" — and exit 1. Do NOT
fall back to a synthesized path.
- **1 candidate:** use it as `<project-dir>`. Continue.
- **>1 candidates:** print all paths and ask the operator to `cd` into the
intended project directory before retrying. Exit 1.
## Phase 2 — Required args check (headless-safe)
Read `$ARGUMENTS`. Both `<next-brief-path>` and `<next-label>` are required.
If either is missing or empty:
```
Error: missing required args.
Usage: /trekendsession <next-brief-path> '<next-label>'
```
Print to stderr and exit 1. **No interactive prompt** — this keeps the helper
headless-safe (per brief NFR; addresses adversarial-review major #11). If you
want an interactive flow, use `/trekcontinue --help` to see the full pipeline.
## Phase 3 — Atomically write `.session-state.local.json` + sibling NEXT-SESSION-PROMPT.local.md
Write `<project-dir>/.session-state.local.json` with the schema-v1 object:
```json
{
"schema_version": 1,
"project": "<project-dir>",
"next_session_brief_path": "<arg 1>",
"next_session_label": "<arg 2>",
"status": "in_progress",
"updated_at": "<now, ISO-8601>"
}
```
Use the atomic-write util — write to `<path>.tmp`, then `rename` into place —
to avoid partial-state on crash. The util is ESM, so invoke via
`node --input-type=module -e` with an `import` statement (a CommonJS shim
would throw `ERR_REQUIRE_ESM` on Node 18+ since `atomic-write.mjs` is ESM).
Under `node --input-type=module -e "<script>" arg1 arg2 arg3`, Node sets
`process.argv[0]` to the node binary path and user args start at
`process.argv[1]`. Adjust the destructure if your Node version differs.
This phase ALSO writes a sibling `NEXT-SESSION-PROMPT.local.md` in the
project directory with YAML frontmatter (`produced_by: trekendsession`,
`produced_at: <ISO-8601>`, `project: <project-dir>`). Both files are written
in a single ESM block so the writes succeed or fail together:
```bash
!`node --input-type=module -e "
import path from 'node:path';
import { writeFileSync } from 'node:fs';
import { atomicWriteJson } from './lib/util/atomic-write.mjs';
const [, dir, brief, label] = process.argv;
const now = new Date().toISOString();
const stateObj = { schema_version: 1, project: dir, next_session_brief_path: brief, next_session_label: label, status: 'in_progress', updated_at: now };
const stateFile = path.join(dir, '.session-state.local.json');
atomicWriteJson(stateFile, stateObj);
const promptFile = path.join(dir, 'NEXT-SESSION-PROMPT.local.md');
const promptBody = '---\\nproduced_by: trekendsession\\nproduced_at: ' + now + '\\nproject: ' + dir + '\\n---\\n\\n# ' + label + '\\n\\nResume via /trekcontinue.\\n';
writeFileSync(promptFile, promptBody);
console.log(stateFile);
console.log(promptFile);
" '<project-dir>' '<next-brief-path>' '<next-label>'`
```
## Phase 4 — Validate + narrate
Validate the freshly-written state file:
```bash
!`node lib/validators/session-state-validator.mjs --json <project-dir>/.session-state.local.json`
```
If `valid: true`, print the success block matching `/trekcontinue` Phase 3
narration (SC-8 cross-project consistency — same template both sides):
```
Session state written: <project-dir>/.session-state.local.json
Project: <project-dir>
Next session: <next-label>
Brief: <next-brief-path>
In a fresh Claude session, run /trekcontinue to resume.
```
If `valid: false`, print the structured `errors[]` and exit 1. Investigate
before retrying — usually means a bad path or label argument.
## Hard rules
- **Required args, no defaults.** Never invent a brief path or session label.
If args are missing, fail loud.
- **Atomic write only.** Tmp + rename — no partial state files on disk.
- **Zero secrets.** Status, paths, labels — never API keys, never user content
beyond filenames.
- **NEVER auto-invoke this command.** It is operator-typed only at session-end.
- **Idempotent within a session.** Running twice with the same args
overwrites cleanly (atomic rename); does not double-advance.

File diff suppressed because it is too large Load diff

View file

@ -0,0 +1,855 @@
---
name: trekplan
description: Deep implementation planning from a task brief. Requires --brief or --project. Runs parallel specialized agents, optional external research, and adversarial review.
argument-hint: "--brief <path> | --project <dir> [--fg | --quick | --research <brief> | --decompose <plan> | --export <fmt> <plan>]"
model: opus
allowed-tools: Agent, Read, Glob, Grep, Write, Edit, Bash, AskUserQuestion, TaskCreate, TaskUpdate, TeamCreate, TeamDelete
---
# Voyage Local v2.0
Deep, multi-phase implementation planning driven by a **task brief**.
Planning consumes the brief (produced by `/trekbrief`) and any
research briefs referenced in it, then runs specialized exploration
agents, synthesis, and adversarial review to produce an executable plan.
**v2.0 is a breaking release.** The interview phase has been extracted
into `/trekbrief`. This command no longer accepts free-text task
descriptions — it requires either `--brief <path>` or `--project <dir>`.
Pipeline position:
```
/trekbrief → brief.md
/trekresearch → research/*.md
/trekplan → plan.md (this command)
/trekexecute → execution
```
## Phase 1 — Parse mode and validate input
Parse `$ARGUMENTS` for mode flags. Order of precedence:
1. **`--export <format> <plan-path>`** — extract `{format}` (first token after
`--export`) and `{plan-path}` (remainder). Valid formats: `pr`, `issue`,
`markdown`, `headless`. Set **mode = export**.
If format is not in the valid set:
```
Error: unknown export format '{format}'. Valid: pr, issue, markdown, headless
```
If the plan file does not exist:
```
Error: plan file not found: {path}
```
2. **`--decompose <plan-path>`** — extract the plan path. Set **mode = decompose**.
If the plan file does not exist:
```
Error: plan file not found: {path}
```
3. **`--project <dir>`** — extract the project directory path.
- Resolve `{dir}` (trim trailing slash).
- Derive implicit flags:
- `--brief {dir}/brief.md`
- Plan destination: `{dir}/plan.md`
- Research briefs auto-discovered from `{dir}/research/*.md` (sorted).
- If `{dir}` does not exist or `{dir}/brief.md` is missing:
```
Error: project directory not initialized. Run /trekbrief to create it.
Missing: {dir}/brief.md
```
- Set **project_dir = {dir}**, **brief_path = {dir}/brief.md**.
- **Validate inputs** (soft mode — warnings do not block, errors do):
```bash
# Brief schema sanity check (frontmatter + state machine, soft on body sections)
node ${CLAUDE_PLUGIN_ROOT}/lib/validators/brief-validator.mjs --soft --json "{dir}/brief.md"
# Research briefs (if any) — drift-warn only, none of these block the run
[ -d "{dir}/research" ] && \
node ${CLAUDE_PLUGIN_ROOT}/lib/validators/research-validator.mjs --soft --dir "{dir}/research" --json
# Architecture note discovery (EXTERNAL CONTRACT — drift-WARN, never drift-FAIL)
node ${CLAUDE_PLUGIN_ROOT}/lib/validators/architecture-discovery.mjs --json "{dir}"
```
Each call exits 0 on success or with a structured JSON error report on stderr.
Surface any warnings in the user-facing summary at Phase 3, but do not abort.
- Set **has_research_brief = true** if `{dir}/research/*.md` matches ≥ 1 file.
- Read the architecture-discovery JSON output: set **has_architecture_note = true**
if `found == true`. The discovery module emits warnings if the file lives at a
non-canonical path (e.g. `architecture-overview.md`); preserve them for the
user-facing summary. If set, **architecture_note_path = {result.overview}**.
Produced by an external opt-in architect plugin (no longer publicly distributed;
the filesystem slot remains available for any compatible producer). Missing file
is fine — additive discovery, not required.
4. **`--brief <path>`** — extract the brief path. If the file does not exist:
```
Error: brief file not found: {path}
```
Set **brief_path = {path}**. Plan destination will be derived in Phase 3
from the brief's slug and date (see Phase 3).
5. **`--research <brief.md> [brief2.md] [brief3.md]`** — collect paths after
`--research` until the next `--` flag or a token that does not look like a
file path. Maximum 3 briefs. Set **has_research_brief = true**. Validate
each path exists — if any is missing:
```
Error: research brief not found: {path}
```
`--research` combines with `--brief`, `--project`, `--fg`, and `--quick`.
When combined with `--project`, the explicit `--research` briefs are
appended to the auto-discovered ones (deduplicated by path).
6. **`--fg`** — accepted as a no-op alias for backwards compatibility. All
phases always run in the main session as of v2.4.0.
7. **`--quick`** — set **mode = quick**. Skip agent swarm; use lightweight
Glob/Grep scan and go directly to planning + adversarial review.
8. **`--gates`** — autonomy control. When present, set `gates_mode = true`.
Pause for operator confirmation after Phase 5 (exploration), Phase 7
(synthesis), and Phase 9 (adversarial review). Default `gates_mode =
false` lets phases flow continuously. The flag is consumed by the
autonomy-gate state machine via the CLI shim:
`node ${CLAUDE_PLUGIN_ROOT}/lib/util/autonomy-gate.mjs --state X --event Y --gates {true|false}`.
9. If neither `--brief` nor `--project` is present after flag parsing,
output usage and stop:
```
Usage: /trekplan --brief <path-to-brief.md>
/trekplan --project <project-dir>
/trekplan --brief <path> --research <research-brief.md>
/trekplan --project <dir> --fg
/trekplan --project <dir> --quick
/trekplan --export <pr|issue|markdown|headless> <plan-path>
/trekplan --decompose <plan-path>
A brief is required. Produce one with /trekbrief first.
Modes:
--brief Plan from a brief file (foreground, v2.4.0+)
--project Plan from a project directory (brief.md + research/ auto-resolved)
--research Add up to 3 extra research briefs as planning context
--fg No-op alias (foreground is the only mode as of v2.4.0)
--quick Skip exploration agent swarm; plan directly
--export Generate shareable output from an existing plan (no new planning)
--decompose Split an existing plan into self-contained headless sessions
Examples:
/trekplan --project .claude/projects/2026-04-18-jwt-auth
/trekplan --brief .claude/projects/2026-04-18-jwt-auth/brief.md
/trekplan --project .claude/projects/2026-04-18-jwt-auth --research extra.md
/trekplan --project .claude/projects/2026-04-18-jwt-auth --fg
/trekplan --export pr .claude/plans/trekplan-2026-04-06-rate-limiting.md
/trekplan --decompose .claude/plans/trekplan-2026-04-06-rate-limiting.md
Migrating from v1.x? See MIGRATION.md in this plugin. The old --spec flag
and free-text interview mode were removed in v2.0.
```
Do not continue past this step if no brief was provided.
### Read the brief
Read the brief file and parse its frontmatter. Extract:
- `task` — one-line task description
- `slug` — slug for plan filenames
- `project_dir` — if present, overrides derived project path (optional)
- `research_topics` — N (used as a sanity check)
- `research_status``pending | in_progress | complete | skipped`
If `research_status == pending` and `research_topics > 0`:
- Warn the user: "Brief declares {N} research topics but research is still
pending. Plan confidence will be lower. Continue anyway?"
- `AskUserQuestion`: **Continue with low confidence** / **Cancel — run research first**.
- If cancel: print the research invocations from the brief's "How to continue"
section and stop.
Report the detected mode:
```
Mode: {foreground | quick | export | decompose}
Brief: {brief_path}
Project: {project_dir or "-"}
Research: {N local briefs, M extra via --research}
```
### When the input is type:trekreview (Handover 6)
The brief input may be a `review.md` produced by `/trekreview`
instead of a `brief.md` produced by `/trekbrief`. Both files
share the same handover slot — `type` is the discriminator.
If `fm.type === 'trekreview'`:
1. Skip the `research_status` gate above (review.md has no
`research_topics` and no Research Plan section).
2. Extract the `findings` array from the frontmatter — this is the
list of 40-char hex finding-IDs the review surfaced.
3. Read the body's last fenced ```json``` block to recover the full
finding objects (the frontmatter only has IDs; the JSON has the
`severity`, `file`, `line`, `rule_key`, `title`, `detail`,
`recommended_action` payload).
4. Filter findings to severity ∈ `{BLOCKER, MAJOR}`. MINOR and
SUGGESTION are skipped for v1.0 plan-input — they are advisory
only and would inflate the plan with low-priority churn.
5. Treat each remaining finding as a plan goal:
- `recommended_action` → step intent
- `file` → primary `Files:` target
- `id` → goes into the plan's `source_findings:` frontmatter list
6. When writing `plan.md`, populate the frontmatter field
`source_findings: [<id1>, <id2>, ...]` containing exactly the IDs
of the BLOCKER + MAJOR findings consumed. The list provides the
audit trail back to `review.md`.
7. Use **block-style YAML** for the `source_findings:` list. The
frontmatter parser at `lib/util/frontmatter.mjs` does not support
flow-style arrays; `source_findings: [a, b]` is broken — use:
```yaml
source_findings:
- 0123456789abcdef0123456789abcdef01234567
- fedcba9876543210fedcba9876543210fedcba98
```
`source_findings:` is **additive and optional** — plans produced from a
`type: brief` input simply omit the field. No `plan_version` bump is
required for this addition (backwards compatible).
## Phase 1.5 — Export (runs only when mode = export)
**Skip this phase entirely unless mode = export.**
Read the plan file. Extract these sections from the plan content:
- Task description (from Context section)
- Implementation steps (from Implementation Plan section)
- Risks (from Risks and Mitigations section)
- Test strategy (from Test Strategy section, if present)
- Scope estimate (from Estimated Scope section)
### Format: `pr`
Output a markdown block formatted as a PR description:
```
## Summary
{23 sentence summary of what this change does and why}
## Changes
{Bulleted list of implementation steps, one line each}
## Test plan
{Bulleted checklist from test strategy, formatted as - [ ] items}
## Risks
{Risks from plan, abbreviated to 1 line each}
---
*Generated by trekplan from {plan filename}*
```
### Format: `issue`
Output a markdown block formatted as an issue comment:
```
## Implementation plan summary
**Task:** {task description}
**Plan file:** {plan path}
**Scope:** {N files, complexity}
### Proposed approach
{35 bullet points from key implementation steps}
### Open questions / risks
{Top 23 risks from plan}
---
*Generated by trekplan*
```
### Format: `markdown`
Output the plan content with internal metadata stripped:
- Remove the "Revisions" section
- Remove plan-critic and scope-guardian scores/verdicts
- Remove `[ASSUMPTION]` markers (but keep the surrounding sentence)
- Keep everything else verbatim
### Format: `headless`
This is a shortcut for `--decompose`. It runs the full session decomposition
pipeline and is equivalent to `--decompose {plan-path}`. Proceed to
Phase 1.6 (Decompose) below.
---
After outputting the formatted block (for pr/issue/markdown), say:
```
Export complete ({format}). Copy the block above.
```
Then **stop**. Do not continue to any subsequent phase.
## Phase 1.6 — Decompose (runs only when mode = decompose or export headless)
**Skip this phase entirely unless mode = decompose or export format = headless.**
Read the plan file. Verify it contains an Implementation Plan section with
numbered steps. If no steps are found, report and stop:
```
Error: plan has no implementation steps. Run /trekplan first to generate a plan.
```
Determine the output directory from the plan slug:
- Extract the slug from the plan filename (e.g., `trekplan-2026-04-06-auth-refactor``auth-refactor`)
- Output directory: `.claude/trekplan-sessions/{slug}/`
Launch the **session-decomposer** agent:
```
Plan file: {plan path}
Plugin root: ${CLAUDE_PLUGIN_ROOT}
Output directory: .claude/trekplan-sessions/{slug}/
```
The session-decomposer will:
1. Parse the plan's steps and their file dependencies
2. Build a dependency graph between steps
3. Group steps into sessions of 35 steps each
4. Identify which sessions can run in parallel (waves)
5. Generate one session spec file per session
6. Generate a dependency diagram (mermaid)
7. Generate a launch script (`launch.sh`)
When the session-decomposer completes, present the summary to the user:
```
## Decomposition Complete
**Master plan:** {plan path}
**Sessions:** {N} across {W} waves
**Output:** .claude/trekplan-sessions/{slug}/
### Sessions
| # | Title | Steps | Wave | Parallel |
|---|-------|-------|------|----------|
{session table from decomposer}
### Files generated
- Session specs: .claude/trekplan-sessions/{slug}/session-*.md
- Dependency graph: .claude/trekplan-sessions/{slug}/dependency-graph.md
- Launch script: .claude/trekplan-sessions/{slug}/launch.sh
You can:
- Review individual session specs before running
- Run all sessions: `bash .claude/trekplan-sessions/{slug}/launch.sh`
- Run a single session: `claude -p "$(cat .claude/trekplan-sessions/{slug}/session-1-*.md)"`
- Say **"launch"** to start headless execution from here
```
If the user says **"launch"**: run the launch script via Bash.
Then **stop**. Do not continue to any subsequent phase.
## Phase 2 — (removed in v2.0)
The interview phase has moved to `/trekbrief`. This command no
longer asks the user any requirements questions — the brief is the
authoritative input.
## Phase 3 — Destination and context recap (foreground)
Determine the plan destination path:
- If `project_dir` is set (from `--project` or the brief's `project_dir`
frontmatter field): **plan destination = {project_dir}/plan.md**.
- Otherwise: derive slug and date — if the brief has frontmatter `slug` and
`created`, use them; otherwise extract from the brief filename. Destination:
`.claude/plans/trekplan-{YYYY-MM-DD}-{slug}.md`.
Collect all research briefs (from `--research` flag and auto-discovered
`{project_dir}/research/*.md`).
Report to the user:
```
Planning pipeline running in foreground.
Brief: {brief_path}
Project: {project_dir or "-"}
Plan: {plan destination}
Research briefs: {N}
Architecture note: {present | none}
```
Then continue to the next phase inline.
> **Why foreground?** As of v2.4.0 the planning-orchestrator is no longer
> spawned as a background agent. The Claude Code harness does not expose the
> Agent tool to sub-agents, so an orchestrator launched with
> `run_in_background: true` cannot spawn the documented exploration swarm
> (`architecture-mapper`, `task-finder`, `plan-critic`, etc.) and silently
> degrades to single-context reasoning. Running the phases inline in main
> context keeps the swarm intact. Use `claude -p` in a separate terminal
> window for long-running headless work.
---
**All remaining phases run inline in the main command context.**
---
## Phase 4 — Codebase sizing
Determine codebase scale to calibrate agent turns (not agent count).
Run via Bash:
```
find . -type f \( -name "*.ts" -o -name "*.tsx" -o -name "*.js" -o -name "*.jsx" -o -name "*.py" -o -name "*.go" -o -name "*.rs" -o -name "*.java" -o -name "*.rb" -o -name "*.c" -o -name "*.cpp" -o -name "*.h" -o -name "*.cs" -o -name "*.swift" -o -name "*.kt" -o -name "*.sh" -o -name "*.md" \) -not -path "*/node_modules/*" -not -path "*/.git/*" -not -path "*/vendor/*" -not -path "*/dist/*" -not -path "*/build/*" | wc -l
```
Classify:
- **Small** (< 50 files)
- **Medium** (50500 files)
- **Large** (> 500 files)
Report:
```
Codebase: {N} source files ({scale}). Deploying exploration agents.
```
## Phase 4b — Brief review
Launch the **brief-reviewer** agent:
Prompt: "Review this task brief for quality: {brief_path}. Check completeness,
consistency, testability, scope clarity, and research-plan validity."
Handle the verdict:
- **PROCEED** — continue to Phase 5.
- **PROCEED_WITH_RISKS** — continue, carry flagged risks as `[ASSUMPTION]` in the plan.
- **REVISE** — present findings and ask the user for clarification
(foreground is the only mode). If the user force-stops, carry outstanding
findings as `[ASSUMPTION]` entries.
## Phase 5 — Parallel exploration (specialized agents + research)
**If mode = quick:** Do NOT launch any exploration agents. Instead, run a
lightweight file check:
- `Glob` for files matching key terms from the brief's task/intent (up to 3 patterns)
- `Grep` for function/type definitions matching key terms (up to 3 patterns)
Report findings as:
```
Quick scan: {N} potentially relevant files found via Glob/Grep.
No agent swarm — proceeding directly to planning.
```
Then skip Phase 6 (deep-dives) and proceed to Phase 7 (Synthesis) with only
the quick-scan results.
---
**All other modes:** Launch exploration agents **in parallel** (all in a single
message). Use the specialized agents from the `agents/` directory.
**All agents run for all codebase sizes.** Scale `maxTurns` by size (small: halved,
medium: default, large: default) instead of dropping agents.
| Agent | Small | Medium | Large | Purpose |
|-------|-------|--------|-------|---------|
| `architecture-mapper` | Yes | Yes | Yes | Codebase structure, patterns, anti-patterns |
| `dependency-tracer` | Yes | Yes | Yes | Module connections, data flow, side effects |
| `risk-assessor` | Yes | Yes | Yes | Risks, edge cases, failure modes |
| `task-finder` | Yes | Yes | Yes | Task-relevant files, functions, types, reuse candidates |
| `test-strategist` | Yes | Yes | Yes | Test patterns, coverage gaps, strategy |
| `git-historian` | Yes | Yes | Yes | Recent changes, ownership, hot files, active branches |
| `research-scout` | Conditional | Conditional | Conditional | External docs (only when unfamiliar tech detected AND no research brief covers it) |
| `convention-scanner` | No | Yes | Yes | Coding conventions, naming, style, test patterns |
### Always launch (all codebase sizes):
**architecture-mapper** — full codebase structure, tech stack, patterns, anti-patterns.
Prompt: "Analyze the architecture of this codebase. The task being planned is: {task}"
**dependency-tracer** — module connections, data flow, side effects for task-relevant code.
Prompt: "Trace dependencies and data flow relevant to this task: {task}. Focus on modules
that will be affected by the implementation."
**risk-assessor** — risks, edge cases, failure modes, technical debt near task area.
Prompt: "Assess risks and failure modes for implementing this task: {task}. Check for
complexity hotspots, security boundaries, and technical debt in the relevant code."
**task-finder** — all files, functions, types, and interfaces directly related to the task.
Prompt: "Find all code relevant to this task: {task}. Include existing implementations
that solve similar problems, API boundaries, database models, configuration files.
Report file paths and line numbers for every finding."
**test-strategist** — existing test patterns, coverage gaps, test strategy.
Prompt: "Analyze the test infrastructure and design a test strategy for this task: {task}.
Discover existing patterns and identify coverage gaps."
**git-historian** — recent changes, code ownership, hot files, active branches.
Prompt: "Analyze git history relevant to this task: {task}. Report recent changes,
ownership, hot files, and active branches that may affect planning."
### Launch for medium+ codebases (50+ files):
**Convention Scanner** — use the `convention-scanner` plugin agent (model: "sonnet")
for medium+ codebases only.
Provide concrete examples from the codebase, not generic advice."
### Conditional: External research
After reading the brief, determine if the task involves technologies, APIs, or
libraries that are:
- Not clearly present in the codebase
- Being upgraded to a new major version
- Being used in an unfamiliar way
**Skip research-scout** for any topic already answered by an attached research
brief. If the brief's `research_status == complete` and all `Research Plan`
topics have corresponding research files, skip research-scout entirely.
If yes (and not covered by attached briefs): launch **research-scout** in
parallel with the other agents.
Prompt: "Research the following technologies for this task: {task}.
Specific questions: {list specific questions about the technology}.
Technologies to research: {list}."
If no external technology is involved or all topics are covered by briefs:
skip research-scout and note:
"No external research needed — covered by research briefs / well-represented in codebase."
## Phase 6 — Targeted deep-dives
After all Phase 5 agents complete, review their results and identify **knowledge gaps**
— areas where exploration was too shallow to plan confidently.
Common reasons for deep-dives:
- A critical function was found but its implementation details are unclear
- A dependency chain needs tracing to understand side effects
- A test pattern was identified but the test infrastructure needs more detail
- A risk was flagged but the actual impact needs verification
For each significant gap, spawn a targeted deep-dive agent (model: "sonnet",
subagent_type: "Explore") with a narrow, specific brief.
Launch up to 3 deep-dive agents in parallel. If no gaps exist, skip this phase
and note: "Initial exploration was sufficient — no deep-dives needed."
## Phase 7 — Synthesis
After all agents complete (initial + deep-dives + research), synthesize:
1. Read all agent results carefully
2. Identify overlaps and contradictions between agents
3. Build a mental model of the codebase architecture
4. Catalog reusable code: existing functions, utilities, patterns
5. Integrate research findings with codebase analysis
6. Note remaining gaps — things you cannot determine from code or research
(these become assumptions in the plan, marked explicitly)
7. For each finding, track whether it came from **codebase analysis** or
**external research** — the plan must distinguish these sources
Do NOT write this synthesis to disk. It is internal working context only.
## Phase 8 — Deep planning
> **Schema-drift defense (sealed inline so this contract survives even when
> `agents/planning-orchestrator.md` is not implicitly loaded by Opus 4.7).**
>
> The plan you write MUST satisfy these regexes. The executor parses with
> strict regex matching; any deviation breaks parsing and forces a re-plan.
>
> ```
> STEP_HEADING_REGEX = /^### Step (\d+):\s+(.+?)\s*$/m
> FORBIDDEN_HEADING_REGEX = /^(?:##|###) (?:Fase|Phase|Stage|Steg) \d+/m
> ```
>
> **FORBIDDEN headings** (parser rejects these — do not emit them under
> Implementation Plan):
> - `## Fase 1`, `### Fase 1` — Norwegian narrative format
> - `## Phase 1`, `### Phase 1` — narrative phase format
> - `## Stage 1`, `### Stage 1` — narrative stage format
> - `## Steg 1`, `### Steg 1` — Norwegian step word
> - `### 1.` or `### 1)` — numbered without "Step"
> - `### Step 1 —` (em-dash instead of colon)
> - Any heading that doesn't match `STEP_HEADING_REGEX`
>
> **REQUIRED step shape** — copy this canonical example verbatim, substituting
> file paths, descriptions, and patterns. Preserve the exact heading format,
> bullet field names, and Manifest YAML structure. Do not invent new field
> names. Do not skip fields. Do not nest steps under sub-headings.
>
> ````markdown
> ### Step 1: Add JWT verification middleware
>
> - **Files:** `src/middleware/jwt.ts`
> - **Changes:** Create new middleware function `verifyJWT(req, res, next)` that reads `Authorization: Bearer <token>` header, verifies signature with `process.env.JWT_SECRET`, attaches decoded payload to `req.user`, and returns 401 on invalid/missing token. (new file)
> - **Reuses:** `jsonwebtoken.verify()` (already in package.json), pattern from `src/middleware/cors.ts`
> - **Test first:**
> - File: `src/middleware/jwt.test.ts` (new)
> - Verifies: valid token attaches user; invalid token returns 401; missing header returns 401
> - Pattern: `src/middleware/cors.test.ts` (follow this style)
> - **Verify:** `npm test -- jwt.test.ts` → expected: `3 passing`
> - **On failure:** revert — `git checkout -- src/middleware/jwt.ts src/middleware/jwt.test.ts`
> - **Checkpoint:** `git commit -m "feat(auth): add JWT verification middleware"`
> - **Manifest:**
> ```yaml
> manifest:
> expected_paths:
> - src/middleware/jwt.ts
> - src/middleware/jwt.test.ts
> min_file_count: 2
> commit_message_pattern: "^feat\\(auth\\): add JWT verification middleware$"
> bash_syntax_check: []
> forbidden_paths:
> - src/middleware/cors.ts
> must_contain:
> - path: src/middleware/jwt.ts
> pattern: "verifyJWT"
> ```
> ````
>
> **Validator self-check (mandatory after writing `plan.md`):** run
> `node ${CLAUDE_PLUGIN_ROOT}/lib/validators/plan-validator.mjs --strict --json {plan_path}`
> and re-revise the plan if it fails. The validator is the source of truth for
> heading shape, manifest presence, and required-field coverage. If
> `${CLAUDE_PLUGIN_ROOT}` is unset (rare in practice), fall back to the
> equivalent path under your validators cache or the repo's `lib/validators/`.
Read the brief file (from `--brief` or `--project`).
Read the plan template: @${CLAUDE_PLUGIN_ROOT}/templates/plan-template.md
Write the plan following the template structure. The plan MUST include:
### Required sections
1. **Context** — Why this change is needed. Use the brief's **Intent** verbatim
or tightly paraphrased. The plan's motivation must trace directly to the brief.
2. **Codebase Analysis** — Tech stack, patterns, relevant files, reusable code,
external tech researched. Every file path must be real (verified during exploration).
3. **Research Sources** — If any research briefs or research-scout was used: table
of technologies, sources, findings, and confidence levels. Omit if none.
4. **Implementation Plan** — Ordered steps. Each step specifies:
- Exact files to modify or create (with paths)
- What changes to make and why
- Which existing code to reuse
- Dependencies on other steps
- Whether the step is based on codebase analysis or external research
- **On failure:** — recovery action (revert/retry/skip/escalate)
- **Checkpoint:** — git commit command after success
10. **Execution Strategy** — For plans with > 5 steps: group steps into sessions
(35 steps each), organize sessions into waves (parallel where independent),
specify scope fences per session. Omit for plans with ≤ 5 steps.
5. **Alternatives Considered** — At least one alternative approach with
pros/cons and reason for rejection.
6. **Risks and Mitigations** — From the risk-assessor findings and the brief's
open questions. What could go wrong and how to handle it.
7. **Test Strategy** — From the test-strategist findings (if available).
What tests to write and which patterns to follow.
8. **Verification** — Reuse the brief's **Success Criteria** as the baseline.
Each criterion must be an executable command or observable condition.
9. **Estimated Scope** — File counts and complexity rating.
### Quality standards
- Every file path in the plan must exist in the codebase (or be explicitly
marked as "new file to create")
- Every "reuses" reference must point to a real function/pattern found during
exploration
- Steps must be ordered by dependency (not by file path or importance)
- Verification criteria must be concrete and executable
- The plan must be implementable by someone who has not seen the exploration
results — it must stand on its own
- Research-based decisions must cite their source
- Every implementation decision must be traceable to a brief section (Intent,
Goal, Constraint, Preference, NFR, or Success Criterion)
### Write the plan
Use the plan destination computed in Phase 3:
- `--project` mode: `{project_dir}/plan.md`
- `--brief` mode: `.claude/plans/trekplan-{YYYY-MM-DD}-{slug}.md`
Create the parent directory if it does not exist.
## Phase 9 — Adversarial review
Launch two review agents **in parallel — emit both Agent tool calls in a
single assistant message turn** (same pattern as Phase 5 exploration). They
have zero data dependencies; serializing them wastes 3060 seconds per run.
**plan-critic** — adversarial review of the plan.
Prompt: "Review this implementation plan for the task: {task}.
Plan file: {plan path}. Read it and find every problem — missing steps,
wrong ordering, fragile assumptions, missing error handling, scope creep,
underspecified steps. Rate each finding as blocker, major, or minor.
Write the structured JSON output to `/tmp/plan-critic-out.json` so the
dedup helper can merge with scope-guardian's findings."
**scope-guardian** — scope alignment check.
Prompt: "Check this implementation plan against the brief.
Task: {task}. Brief file: {brief_path}. Plan file: {plan path}.
Find scope creep (plan does more than the brief requires) and scope gaps
(plan misses brief requirements). Check that referenced files and functions
exist. Verify that every Success Criterion in the brief is covered by the
plan's Verification section. Write structured JSON output to
`/tmp/scope-guardian-out.json`."
After both complete, run an inline dedup pass:
```bash
node ${CLAUDE_PLUGIN_ROOT}/lib/review/plan-review-dedup.mjs \
--plan-critic /tmp/plan-critic-out.json \
--scope-guardian /tmp/scope-guardian-out.json \
> /tmp/plan-review-merged.json
```
The merged array attributes each finding to `[plan-critic, scope-guardian]`
when both reviewers raised the same issue (exact match on
`file:line:rule_key`, or Jaccard ≥ 0.7 on text tokens). Revise the plan
once for the merged set, not twice for the duplicates. Source: research/05
R1 + R2.
After both complete:
- If **blockers** are found: revise the plan to address them. Add a "Revisions"
note at the bottom of the plan listing what changed and why.
- If only **major** issues: revise to address them. Add revisions note.
- If only **minor** issues or clean: proceed without changes. Note the
review result in the plan.
## Phase 10 — Present and refine
Present a summary to the user:
```
## Voyage Complete
**Task:** {task description}
**Mode:** {foreground | quick}
**Brief:** {brief_path}
**Project:** {project_dir or "-"}
**Plan:** {plan_path}
**Exploration:** {N} agents deployed ({N} specialized + {N} deep-dives + {research status})
**Scope:** {N} files to modify, {N} to create — {complexity}
### Key decisions
- {Decision 1 and rationale}
- {Decision 2 and rationale}
### Implementation steps ({N} total)
1. {Step 1 summary}
2. {Step 2 summary}
...
### Research findings
{Summary of external research + attached research briefs, or "No external research used."}
### Adversarial review
**Plan critic:** {Summary — blockers/majors/minors found, how addressed}
**Scope guardian:** {Summary — creep/gaps found, how addressed}
You can:
- Ask questions or request changes to refine the plan
- Say **"execute"** to start implementing
- Say **"execute with team"** to implement with parallel Agent Team (if eligible)
- Say **"save"** to keep the plan for later
```
If the user asks questions or requests changes:
- Update the plan file in-place
- Show what changed
- Re-present the summary
## Phase 11 — Handoff
### "save" / "later" / "done"
Confirm the plan and brief file locations and exit.
### "execute" / "go" / "start"
Begin implementing the plan step by step in this session. Follow the plan exactly.
Mark each step complete as you go.
### "execute with team" / "team"
Before creating a team, verify eligibility:
1. Count implementation steps that are **independent** (no dependency on each other)
AND touch **different files/modules**
2. If fewer than 3 independent steps: inform the user and fall back to sequential
execution. "The plan has fewer than 3 independent steps — sequential execution
is more efficient."
If eligible:
1. Present the proposed team split: which steps go to which team member
2. Ask for confirmation: "Create Agent Team with {N} members? (yes/no)"
3. If confirmed: create the team with `TeamCreate`, assign step clusters to
each member. Use `isolation: "worktree"` on each team member agent so they
work in isolated git worktrees — this prevents file conflicts during parallel
implementation. Coordinate execution and clean up with `TeamDelete` when done.
4. If `TeamCreate` fails (tool not available): fall back to sequential execution
and notify the user
## Phase 12 — Session tracking
After the plan is presented (Phase 10) or after handoff (Phase 11), write a
session record to `${CLAUDE_PLUGIN_DATA}/trekplan-stats.jsonl` (create the file
if it does not exist).
Record format (one JSON line):
```json
{
"ts": "{ISO-8601 timestamp}",
"task": "{task description (first 100 chars)}",
"mode": "{default|fg|quick}",
"slug": "{plan slug}",
"brief_path": "{brief_path}",
"project_dir": "{project_dir or null}",
"codebase_size": "{small|medium|large}",
"codebase_files": {N},
"agents_deployed": {N},
"deep_dives": {N},
"research_briefs_used": {N},
"research_scout_used": {true|false},
"critic_verdict": "{BLOCK|REVISE|PASS}",
"guardian_verdict": "{ALIGNED|CREEP|GAP|MIXED}",
"outcome": "{execute|execute_team|save|refine}"
}
```
If `${CLAUDE_PLUGIN_DATA}` is not set or not writable, skip tracking silently.
Never let tracking failures block the main workflow.
## Hard rules
- **Brief-driven**: Every plan decision must trace back to a section of the
brief (Intent, Goal, Constraint, Preference, NFR, Success Criterion). If a
step has no brief basis, it is scope creep — flag it or remove it.
- **No interview**: Never ask the user requirements questions. If the brief is
inadequate, stop and ask the user to run `/trekbrief` again.
- **Scope**: Only explore the current working directory and its subdirectories.
Never read files outside the repo (no ~/.env, no credentials, no other repos).
- **Cost**: Sonnet for all agents (exploration, deep-dives, research, critics).
Opus only runs in the main thread for synthesis and planning.
- **Privacy**: Never log, store, or repeat file contents that look like
secrets, tokens, or credentials. Never log prompt text.
- **No premature execution**: Do not modify any project files until the user
explicitly approves the plan.
- **Plan stands alone**: The plan file must be understandable without access
to the exploration results. Include all necessary context.
- **Honesty**: If exploration reveals the task is trivial (single file, obvious
change), say so. Do not inflate the plan to justify the process. Suggest
the user just implements it directly.
- **Adaptive**: Never spawn more agents than the codebase warrants. A 10-file
project does not need 7 exploration agents. Scale down.
- **Research transparency**: Always distinguish codebase-derived decisions from
research-derived decisions in the plan.

View file

@ -0,0 +1,427 @@
---
name: trekresearch
description: Deep research combining local codebase analysis with external knowledge, producing structured research briefs with triangulation and confidence ratings
argument-hint: "[--project <dir>] [--quick | --local | --external | --fg] <research question>"
model: opus
allowed-tools: Agent, Read, Glob, Grep, Write, Edit, Bash, AskUserQuestion, WebSearch, WebFetch, mcp__tavily__tavily_search, mcp__tavily__tavily_research
---
# Ultraresearch Local v1.0
Deep, multi-phase research that combines local codebase analysis with external
knowledge. Uses specialized agent swarms to investigate multiple dimensions in
parallel, then triangulates findings to produce insights that neither local nor
external research could provide alone.
**Design principle: Context Engineering** — build the right context by orchestrating
specialized agents, each seeing only what they need. The value is in triangulation
(cross-checking local vs. external) and synthesis (insights from combining both).
**Pipeline integration:** Research briefs feed into trekplan via `--research`:
```
/trekresearch <question> → brief → /trekplan --research <brief> <task>
```
## Phase 1 — Parse mode and validate input
Parse `$ARGUMENTS` for mode flags. Flags can appear in any order before the
research question. Collect all flags first, then treat the remainder as the
research question.
Supported flags:
1. `--quick` — lightweight research, no agent swarm. The command itself does
3-5 targeted searches inline. Set **mode = quick**.
2. `--local` — only codebase research. Skip external agents and gemini bridge.
Set **scope = local**.
3. `--external` — only external research. Skip codebase analysis agents.
Set **scope = external**.
4. `--fg` — accepted as a no-op alias for backwards compatibility. Execution
is always foreground as of v2.4.0. Set **execution = foreground** (the
only mode).
5. `--project <dir>` — attach this research to an trekbrief project folder.
The brief will be written to `{dir}/research/{NN}-{slug}.md` (auto-incremented
index) instead of the default `.claude/research/` path. Set **project_dir = {dir}**.
If `{dir}` does not exist:
```
Error: project directory not found: {dir}
Run /trekbrief first to create it.
```
Create `{dir}/research/` if it does not already exist.
6. `--gates` — autonomy control. When present, set `gates_mode = true`. The
research command will pause after each topic completes ("Topic N
complete. Proceed to topic N+1? (yes/no)"). Default `gates_mode = false`
means topics run continuously. The flag is consumed by the autonomy-gate
state machine via the CLI shim:
`node ${CLAUDE_PLUGIN_ROOT}/lib/util/autonomy-gate.mjs --state X --event Y --gates {true|false}`.
Flags can be combined:
- `--local` — local-only research
- `--external --quick` — external-only, lightweight
- `--project <dir> --external` — attach external research to a project
- `--quick` alone implies both local and external (lightweight)
Defaults: **scope = both**, **execution = foreground** (only mode as of
v2.4.0), **project_dir = none**.
After stripping flags, the remaining text is the **research question**.
If no research question is provided, output usage and stop:
```
Usage: /trekresearch <research question>
/trekresearch --quick <research question>
/trekresearch --local <research question>
/trekresearch --external <research question>
/trekresearch --fg <research question>
/trekresearch --project <dir> [--external|--local|--quick|--fg] <research question>
Modes:
default Interview → foreground research (local + external) → brief
--quick Interview (short) → inline research (no agent swarm)
--local Only codebase analysis agents (skip external + Gemini)
--external Only external research agents (skip codebase analysis)
--fg No-op alias (foreground is the only mode as of v2.4.0)
--project Write brief into an trekbrief project folder (auto-indexed)
Flags can be combined: --local, --external --quick, --project <dir> --external
Examples:
/trekresearch Should we migrate from Express to Fastify?
/trekresearch --quick What auth libraries are popular for Node.js?
/trekresearch --local How is error handling structured in this codebase?
/trekresearch --external What are the security implications of using Redis for sessions?
/trekresearch --fg --local What patterns does this codebase use for database access?
/trekresearch --project .claude/projects/2026-04-18-jwt-auth --external What JWT library is best for Node.js?
```
Do not continue past this step if no question was provided.
Report the detected mode:
```
Mode: {default | quick}, Scope: {both | local | external}, Execution: foreground
Project: {project_dir or "-"}
Question: {research question}
```
### Compute brief destination
If **project_dir is set**:
- Scan `{project_dir}/research/` for existing files matching `NN-*.md`.
- Find the highest existing index; set `N = highest + 1`. If no files exist, `N = 1`.
- Zero-pad to 2 digits: `01`, `02`, ...
- Brief destination: `{project_dir}/research/{NN}-{slug}.md`
If **project_dir is not set**:
- Brief destination: `.claude/research/trekresearch-{YYYY-MM-DD}-{slug}.md`
Store as `brief_destination` for use in later phases.
## Phase 2 — Research interview
Use `AskUserQuestion` to clarify the research question. Ask **one question at a time**.
The interview is shorter than trekplan's (2-4 questions, not 3-8) because research
is more focused than planning.
### Interview flow
**Start with the research question itself.** If the user provided a clear, specific
question, you may skip directly to follow-ups.
**Core questions (pick 2-4 based on clarity of initial question):**
1. **Decision context:** "What decision does this research feed? Are you evaluating
options, investigating feasibility, or building understanding?"
*Skip if the question itself makes this obvious.*
2. **Dimensions:** "Are there specific aspects you care about most? (e.g., performance,
security, migration cost, team learning curve)"
*Skip if the question is narrow enough that dimensions are obvious.*
3. **Prior knowledge:** "What do you already know about this topic? What have you
tried or ruled out?"
*Always useful — prevents redundant research.*
4. **Constraints:** "Are there constraints that should guide the research?
(e.g., must be open-source, must support X, budget limitations)"
*Skip if no constraints are apparent.*
**Rules:**
- If the user says "just research it", "skip", or similar — stop interviewing.
Use the research question as-is.
- For `--quick` mode: ask 1-2 questions maximum.
- Never ask about things you can discover from the codebase.
### Determine research dimensions
Based on the interview, identify 3-8 research dimensions. These are the facets
of the question that will be investigated in parallel. Examples:
- "Should we use Redis?" → dimensions: performance, reliability, operational
complexity, security, cost, team familiarity
- "How should we handle auth?" → dimensions: standards compliance, implementation
complexity, library ecosystem, security posture, scalability
Report dimensions:
```
Research dimensions identified:
1. {Dimension 1}
2. {Dimension 2}
...
```
## Phase 3 — Slug and destination (foreground)
Generate a slug from the research question (first 3-4 meaningful words,
lowercase, hyphens). Confirm the `brief_destination` computed in Phase 1.
Report to the user:
```
Research pipeline running in foreground.
Question: {research question}
Dimensions: {N} identified
Scope: {both | local | external}
Project: {project_dir or "-"}
Brief: {brief_destination}
```
Then continue to the next phase inline.
> **Why foreground?** As of v2.4.0 the research-orchestrator is no longer
> spawned as a background agent. The Claude Code harness does not expose the
> Agent tool to sub-agents, so an orchestrator launched with
> `run_in_background: true` cannot spawn the documented research swarm
> (`docs-researcher`, `community-researcher`, etc.) and silently degrades to
> single-context reasoning without WebSearch / Tavily / WebFetch / Gemini.
> Running the phases inline in main context keeps the swarm intact. Use
> `claude -p` in a separate terminal window for long-running headless work.
---
**All remaining phases run inline in the main command context.**
---
## Phase 3.5 — Quick mode (inline research)
**Skip this phase entirely unless mode = quick.**
For quick mode, do NOT launch an agent swarm. Instead, do lightweight research
directly using available tools.
### Quick local research (if scope includes local)
- `Glob` for files matching key terms from the research question (up to 3 patterns)
- `Grep` for relevant definitions, patterns, or usage (up to 5 patterns)
- Read the 2-3 most relevant files found
### Quick external research (if scope includes external)
Use available search tools directly (in this priority order):
1. `mcp__tavily__tavily_search` — if available, use for 2-3 targeted queries
2. `WebSearch` — fallback for 2-3 targeted queries
3. `WebFetch` — fetch 1-2 specific pages if URLs were found
### Quick synthesis
Synthesize findings inline. Write a lightweight research brief to the destination
path, following the research-brief-template but with shorter sections and fewer
dimensions.
Skip to Phase 8 (stats tracking) after writing the brief.
## Phase 4 — Parallel research (agent swarm)
**Determine which agents to launch based on scope:**
### Local agents (scope = both or local)
Reuse existing plugin agents with research-focused prompts. These agents are
designed for planning, but work equally well for research when prompted differently.
| Agent | Purpose in research context |
|-------|----------------------------|
| `architecture-mapper` | How the architecture relates to the research question |
| `dependency-tracer` | Dependencies and integrations relevant to the topic |
| `task-finder` | Existing code that relates to the research question |
| `git-historian` | Recent changes and ownership relevant to the topic |
| `convention-scanner` | Coding patterns relevant to evaluating options |
For each local agent, prompt with the research question, NOT a task description:
- architecture-mapper: "Analyze the architecture relevant to this research question:
{question}. Focus on how {topic} relates to current patterns and constraints."
- dependency-tracer: "Trace dependencies relevant to this research question: {question}.
Identify which modules would be affected by {topic}."
- task-finder: "Find existing code relevant to this research question: {question}.
Look for prior implementations, patterns, or utilities related to {topic}."
- git-historian: "Analyze git history relevant to this research question: {question}.
Who owns the relevant code? What has changed recently in related areas?"
- convention-scanner: "Discover coding conventions relevant to evaluating {question}.
What patterns would a solution need to follow?"
### External agents (scope = both or external)
Launch the new research-specialized agents:
| Agent | Purpose |
|-------|---------|
| `docs-researcher` | Official documentation, RFCs, vendor docs |
| `community-researcher` | Real-world experience, issues, blog posts |
| `security-researcher` | CVEs, audit history, supply chain risks |
| `contrarian-researcher` | Counter-evidence, overlooked alternatives |
For each external agent, pass: the research question, specific dimensions to
investigate, and any context from the interview.
### Bridge agent (scope = both or external, if enabled)
Launch `gemini-bridge` with the research question. Do NOT include findings from
other agents — the value of Gemini is independence.
### Launch rules
- Launch ALL selected agents **in parallel** in a single message
- Use model: "sonnet" for all sub-agents (the orchestrator runs on Opus)
- Scale maxTurns by codebase size for local agents (same as trekplan):
small = halved, medium/large = default
- convention-scanner: medium+ codebases only (50+ files)
## Phase 5 — Targeted follow-ups
Review all agent results. Identify knowledge gaps — areas where findings are
thin, contradictory, or missing.
For each significant gap, launch a targeted follow-up agent (model: "sonnet")
with a narrow, specific brief. Maximum 2 follow-ups.
If no gaps exist, skip: "Initial research sufficient — no follow-ups needed."
## Phase 6 — Triangulation
This is the KEY phase that makes trekresearch more than aggregation.
For each research dimension:
1. **Collect** — gather relevant findings from local AND external agents
2. **Compare** — do local findings agree with external findings?
3. **Flag contradictions** — where they disagree, present both sides with evidence
4. **Cross-validate** — use codebase facts to validate external claims:
- External says "library X is fast" → local shows the codebase already uses
a similar pattern that could benchmark against
- External says "pattern Y is best practice" → local shows the codebase uses
pattern Z which conflicts
5. **Rate confidence** per dimension:
- **high** — multiple authoritative sources agree, local evidence confirms
- **medium** — good sources but limited cross-validation
- **low** — single source, limited evidence
- **contradictory** — credible sources actively disagree
Compute overall confidence as a weighted average (0.0-1.0) based on dimension
confidence levels and their relative importance.
## Phase 7 — Synthesis and brief writing
Read the research brief template:
@${CLAUDE_PLUGIN_ROOT}/templates/research-brief-template.md
Write the research brief following the template. Key rules:
1. **Executive Summary** — 3 sentences. Answer, confidence, key caveat.
2. **Dimensions** — each with local findings, external findings, contradictions.
3. **Synthesis** — NOT a summary. NEW insights from triangulation.
4. **Open Questions** — what remains unresolved and why.
5. **Recommendation** — only if decision-relevant. Omit for exploratory research.
6. **Sources** — every claim traced to URL or codebase path.
Generate the slug from the research question (first 3-4 meaningful words).
Write the brief to the `brief_destination` computed in Phase 1:
- With `--project`: `{project_dir}/research/{NN}-{slug}.md`
- Without `--project`: `.claude/research/trekresearch-{YYYY-MM-DD}-{slug}.md`
Create the parent directory if it does not exist.
## Phase 8 — Present and track
Present a summary to the user:
```
## Ultraresearch Complete
**Question:** {research question}
**Mode:** {default | quick}, Scope: {both | local | external}
**Brief:** {brief_destination}
**Project:** {project_dir or "-"}
**Confidence:** {overall confidence 0.0-1.0}
**Dimensions:** {N} researched
**Agents:** {N} local + {N} external + {gemini: used | unavailable | skipped}
### Key Findings
- {Finding 1}
- {Finding 2}
- {Finding 3}
### Contradictions Found
- {Contradiction 1, or "None — findings are consistent across sources."}
### Open Questions
- {Question 1, or "None — all dimensions adequately covered."}
You can:
- Read the full brief at {brief_destination}
- If `--project` was used: run `/trekplan --project {project_dir}` when all research topics are complete
- Otherwise: `/trekplan --research {brief_destination} --brief <your-brief.md>`
- Ask follow-up questions about specific findings
```
### Stats tracking
Write a session record to `${CLAUDE_PLUGIN_DATA}/trekresearch-stats.jsonl`
(create the file if it does not exist).
Record format (one JSON line):
```json
{
"ts": "{ISO-8601 timestamp}",
"question": "{research question (first 100 chars)}",
"mode": "{default|quick}",
"scope": "{both|local|external}",
"slug": "{brief slug}",
"project_dir": "{project_dir or null}",
"brief_path": "{brief_destination}",
"dimensions": {N},
"agents_local": {N},
"agents_external": {N},
"gemini_used": {true|false},
"confidence": {0.0-1.0},
"contradictions": {N},
"open_questions": {N}
}
```
If `${CLAUDE_PLUGIN_DATA}` is not set or not writable, skip tracking silently.
## Hard rules
- **No planning:** This command produces research briefs, not implementation plans.
If the user asks to plan, direct them to `/trekplan --research <brief>`.
- **Sources required:** Every claim must cite a source. No unsourced findings.
- **Independence:** Do not pre-bias external agents with local findings or vice versa.
Triangulate AFTER independent research.
- **Graceful degradation:** If MCP tools are unavailable (Tavily, Gemini, MS Learn),
proceed with available tools and note limitations in brief metadata.
- **Cost:** Sonnet for all sub-agents. Opus only in the main command/orchestrator.
- **Privacy:** Never log secrets, tokens, or credentials.
- **Honesty:** If the question is trivially answerable, say so. Don't inflate research.
- **Scope of codebase:** Only analyze the current working directory for local research.
- **Research transparency:** Clearly distinguish local findings from external findings.
Never blend them without attribution.

View file

@ -0,0 +1,340 @@
---
name: trekreview
description: |
Independent post-hoc review of delivered code against the brief. Produces
review.md with severity-tagged findings (BLOCKER/MAJOR/MINOR/SUGGESTION)
per Handover 6 (review → plan).
argument-hint: "--project <dir> [--since <ref>] [--quick] [--validate] [--dry-run]"
model: opus
allowed-tools: Agent, Read, Glob, Grep, Write, Edit, Bash, AskUserQuestion
---
# Ultrareview Local v1.0
Independent post-hoc review of code delivered by `/trekexecute`
against the contract in `brief.md`. Produces `review.md` — a structured
artifact with severity-tagged findings that `/trekplan --brief
review.md` can consume as plan input (Handover 6).
Pipeline position:
```
/trekbrief → brief.md
/trekresearch → research/*.md
/trekplan → plan.md
/trekexecute → progress.json (+ commits)
/trekreview → review.md (this command)
```
The review is **independent**: each reviewer runs without cross-feeding,
and the coordinator applies BOUNDED operations only. Synthesis-level
inference across files is forbidden in v1.0 (Judge Agent pattern).
See `agents/review-orchestrator.md` for the canonical workflow this
command executes inline.
## Phase 1 — Parse mode and validate input
Parse `$ARGUMENTS` via the shared arg-parser:
```bash
node ${CLAUDE_PLUGIN_ROOT}/lib/parsers/arg-parser.mjs --command trekreview "$@"
```
The parser recognizes these flags (see `lib/parsers/arg-parser.mjs`
FLAG_SCHEMA `trekreview` entry):
| Flag | Type | Purpose |
|------|------|---------|
| `--project <dir>` | valued | Required. Path to trekplan project folder containing `brief.md`. |
| `--since <ref>` | valued | Optional. Override "before" SHA for the diff. Validated via `git rev-parse --verify`. |
| `--quick` | boolean | Skip the brief-conformance pass; run only the code-correctness reviewer; skip the coordinator's reasonableness filter. |
| `--validate` | boolean | Schema-only check on existing `{project_dir}/review.md`. No LLM calls. |
| `--dry-run` | boolean | Print the discovered scope and triage map. Skip writes. |
| `--fg` | boolean | No-op alias (foreground is default). |
Resolution:
1. If `--project` is missing, print usage and stop:
```
Error: --project <dir> is required.
Usage: /trekreview --project <dir> [--since <ref>] [--quick] [--validate] [--dry-run]
```
2. Trim trailing slash from `{dir}`. Set:
- `project_dir = {dir}`
- `brief_path = {dir}/brief.md`
- `review_path = {dir}/review.md`
3. If `{dir}` does not exist or `{dir}/brief.md` is missing:
```
Error: project directory not initialized. Run /trekbrief first.
Missing: {dir}/brief.md
```
Set `mode`:
- `validate` if `--validate` is set (overrides everything else; skip to Phase 8.5).
- `dry-run` if `--dry-run` is set.
- `quick` if `--quick` is set.
- `default` otherwise.
## Phase 2 — Validate brief
Run the brief validator in soft mode — the brief is upstream context, not
something this command produces, so partial grades are acceptable as long
as the file is parseable:
```bash
node ${CLAUDE_PLUGIN_ROOT}/lib/validators/brief-validator.mjs --soft --json "{brief_path}"
```
Read the JSON output. If `valid: false` AND any error has code
`BRIEF_MISSING_REQUIRED_FIELD` or `FRONTMATTER_PARSE_ERROR`: stop and
ask the user to re-run `/trekbrief`. Other soft errors become
warnings in the review's Executive Summary.
Read the brief frontmatter. Capture for review.md:
- `task` → review frontmatter `task`
- `slug` → review frontmatter `slug`
- `project_dir` → review frontmatter `project_dir` (defaults to the
CLI `--project` value when missing)
## Phase 3 — Discover scope SHA range
Determine the "before" SHA that bounds the review:
1. **`--since <ref>` override** — if set, validate via:
```bash
git rev-parse --verify "$since_ref"
```
On failure: print `Error: --since ref is not a valid git revision: {ref}` and stop.
Set `before_sha = $(git rev-parse --verify "$since_ref")`.
2. **Preferred path** — read `{project_dir}/progress.json` if it exists.
Extract `session_start_sha`. Validate it via `git rev-parse --verify`.
Set `before_sha = session_start_sha`.
3. **Fallback** — no `progress.json`. Use the brief's mtime to find the
most recent commit at or before the brief was written:
```bash
brief_mtime=$(stat -f %m "{brief_path}") # macOS; on Linux use stat -c %Y
before_sha=$(git log --until="@$brief_mtime" -n 1 --format=%H)
```
Emit a clear warning that gets surfaced in the review's Executive
Summary: "scope_sha_start unavailable — falling back to brief mtime
({timestamp}). Coverage may include unrelated commits."
Compute the "after" SHA: `after_sha=$(git rev-parse HEAD)`.
Capture working-tree changes (uncommitted at review time):
```bash
git diff --name-only "$before_sha".."$after_sha"
git diff --name-only HEAD # uncommitted (annotated [uncommitted])
```
The combined file list is the review scope. Note that the
`[uncommitted]` annotation is a **brief-level contract** — the brief's
Assumptions section declares this is allowed; the review surfaces it
explicitly in the Coverage table.
If the file count is `0`, write a one-line review.md noting "No diff
between {before_sha} and {after_sha}; nothing to review." Verdict: ALLOW.
Skip Phases 47. Continue to Phase 8 (validate + stats).
## Phase 4 — Triage gate (deterministic path-pattern classifier)
The triage gate is **deterministic** — no LLM judgment. It classifies
every file from Phase 3 into a treatment bucket:
| Treatment | When |
|-----------|------|
| `skip` | Matches `*.lock`, `*.svg`, `dist/**`, `build/**`, `node_modules/**`, OR the file's first 3 lines contain a generated-file marker (`@generated`, `Code generated by`, `DO NOT EDIT`). |
| `deep-review` | Matches `auth/**`, `crypto/**`, `**/security/**`, `hooks/**`. |
| `summary-only` | Default treatment for everything else. |
Hard refuse-with-suggestion gates — use `AskUserQuestion`:
```
if (reviewed_files_count > 100) → ask user
if (estimated_diff_tokens > 100000) → ask user
```
Token estimation: `wc -c "$diff_file" / 4` (rough proxy). Use
`AskUserQuestion` with the prompt:
> The diff under review is large (`{N}` files / `~{T}` tokens). Continue
> with the full scope, narrow with `--since <closer-ref>`, or stop?
Options:
1. **Continue** — proceed at this scope.
2. **Narrow** — print suggested `git log --oneline {before}..HEAD` so the
user can pick a closer ref, then stop.
3. **Stop** — cancel.
Record the treatment for every file. Files marked `skip` MUST appear in
the Coverage section of `review.md` — never silently drop them. Silent
drops are `COVERAGE_SILENT_SKIP` (MAJOR) per the rule catalogue.
If `mode == dry-run`: print the triage map and exit.
## Phase 5 — Launch parallel reviewers
Launch two reviewer agents **in parallel** via the Agent tool — one
message, multiple tool calls.
Reviewers run independently. Do NOT pre-feed findings between them.
| Agent | Mode-gated | Purpose |
|-------|------------|---------|
| `brief-conformance-reviewer` | Skipped in `quick` | Trace each Success Criterion + Non-Goal to delivered code. Emits findings tagged with rule_keys from the conformance/scope categories. |
| `code-correctness-reviewer` | Always runs | 7-dimension code review. Emits findings tagged with rule_keys from the correctness/security/maintenance/tests categories. |
Each reviewer prompt includes:
- **Diff context** — the unified diff from Phase 3, truncated per file
for files marked `summary-only`.
- **Triage map** — full file list with treatments. Reviewers must
respect `skip` decisions.
- **Brief path**`{brief_path}` (read on demand; do not inline).
- **Rule catalogue** — reference to `lib/review/rule-catalogue.mjs`.
Collect each reviewer's trailing JSON block (last fenced `json` block in
their output). Parse with `JSON.parse()`. On parse error, ask the agent
to re-emit the JSON only.
In `quick` mode, launch only `code-correctness-reviewer`. The Executive
Summary will note the brief-conformance pass was skipped.
## Phase 6 — Coordinator dedup + verdict
Launch `review-coordinator` (Agent tool) with the merged findings array
from Phase 5 plus the triage map, brief metadata, and SHA range.
The coordinator runs the 4-pass process documented in
`agents/review-coordinator.md`:
1. **Dedup** by `(file, line, rule_key)` triplet.
2. **HubSpot Judge filters** — Succinctness, Accuracy, Actionability.
3. **Cloudflare reasonableness** — drop speculative or catalogue-violating
findings (skipped in `quick` mode).
4. **Verdict** — BLOCK / WARN / ALLOW per the threshold table.
The coordinator's output is the full review.md content — frontmatter +
body sections + trailing JSON block. Do NOT re-run the reviewers based
on the coordinator's output.
## Phase 7 — Write review.md
Write the coordinator's output verbatim to:
```
{project_dir}/review.md
```
Create parent directories if they do not exist. Atomic write pattern:
write to a temp file, then rename. The frontmatter `findings:` field
must use **block-style YAML** (one ID per line, ` - ` prefix). The
parser at `lib/util/frontmatter.mjs` does not support flow-style arrays.
If `mode == dry-run`: skip the write; print the would-be path and the
first 60 lines of the rendered output.
## Phase 8 — Validate output + stats
Run the strict validator:
```bash
node ${CLAUDE_PLUGIN_ROOT}/lib/validators/review-validator.mjs --json "{review_path}"
```
If validation fails:
- For repairable errors (missing required body section, malformed
finding-ID, REVIEW_VERSION_FORMAT warning): repair in place — re-emit
the missing section, recompute the finding-ID, fix the version
string. Re-validate.
- For unrepairable errors (REVIEW_WRONG_TYPE, malformed frontmatter):
stop and ask the user to re-run; do not silently produce an invalid
review.md.
Append a stats line to `${CLAUDE_PLUGIN_DATA}/trekreview-stats.jsonl`
(create the file if it does not exist):
```json
{"ts":"{ISO-8601}","slug":"{slug}","verdict":"BLOCK|WARN|ALLOW","counts":{"BLOCKER":N,"MAJOR":N,"MINOR":N,"SUGGESTION":N},"reviewed_files_count":N,"mode":"default|quick|validate|dry-run","duration_ms":N}
```
If `${CLAUDE_PLUGIN_DATA}` is unset or not writable, skip stats silently.
Never let stats failures block the main workflow.
## Phase 8.5 — Validate-only mode (`--validate`)
When `mode == validate`:
1. Skip Phases 37 entirely.
2. Run the strict validator on `{project_dir}/review.md`.
3. Print a one-line PASS/FAIL summary plus the JSON output on FAIL.
4. Exit 0 on PASS, 1 on FAIL. Never write to disk. Never call any agent.
## Phase 9 — Present summary
After the write succeeds, print:
```
## Ultrareview Complete
**Task:** {task}
**Mode:** {default | quick | dry-run}
**Brief:** {brief_path}
**Project:** {project_dir}
**Review:** {review_path}
**Scope:** {before_sha}..{after_sha} ({reviewed_files_count} files)
**Verdict:** {BLOCK | WARN | ALLOW}
### Counts
- BLOCKER: {N}
- MAJOR: {N}
- MINOR: {N}
- SUGGESTION: {N}
### Top findings
- [{severity}] {title} ({file}:{line})
...
{up to 5 highest-severity findings}
You can:
- Read the full review at {review_path}
- Feed BLOCKER + MAJOR findings into a follow-up plan:
/trekplan --brief {review_path}
- Re-run with `--quick` for a faster correctness-only pass
- Re-run with `--since <ref>` to narrow scope
```
Per **Handover 6**, BLOCKER and MAJOR findings are consumed by
`/trekplan --brief review.md` to produce a remediation plan. The
review's frontmatter `findings:` list and the trailing JSON block are
the contract for that handover (see `docs/HANDOVER-CONTRACTS.md`).
## Hard rules
- **Brief is the contract.** Every finding in the review traces to a
brief section via `brief_ref`, except `SCOPE_CREEP_BUILT` (which
traces to "no anchor"). Conformance is the conformance reviewer's
job — code-correctness findings carry generic anchors like
`"NFR — code correctness"`.
- **Independent reviewers.** Do NOT cross-feed findings between
brief-conformance-reviewer and code-correctness-reviewer. The
coordinator is the only place where outputs combine.
- **Bounded coordination.** Synthesis-level inference across files is
forbidden in v1.0. The coordinator dedups, filters, and computes the
verdict — nothing more.
- **Triage map respected.** Files marked `skip` MUST appear in the
Coverage section. Silent drops are `COVERAGE_SILENT_SKIP` (MAJOR).
- **Block-style YAML for findings list.** The frontmatter parser does
not support flow-style arrays. `findings: [a, b]` is broken; use
`findings:\n - a\n - b`.
- **Refuse-with-suggestion above 100 files / 100K tokens.** Never run
blind on a giant diff. Use AskUserQuestion to surface the gate.
- **Cost.** Sonnet for all sub-agents (reviewers + coordinator). Opus
only runs in the main /trekreview command thread.
- **Privacy.** Never log secrets, tokens, or credentials in review.md.
Findings citing files with secret-like content must redact the secret
in the `detail` field.
- **Honesty.** If the diff is trivially small or all-skip, say so. Do
not pad findings to make the review look thorough.
- **No production code.** This command never runs production code, never
writes to anything outside `{project_dir}` and `${CLAUDE_PLUGIN_DATA}`.