Wave 6 / Step 10 — autonomy-gated. Operator confirmed: gemini-bridge substitution for plan-critic doubling AND SC1 amendment to resolver-invariant encoding (decisions.local.json recorded). - commands/trekplan.md: gemini-bridge plan-review Pass 2 on post-revision plan in high-effort mode (replaces fragile plan-critic doubling per risk-assessor). - commands/trekresearch.md: full swarm + contrarian-researcher + gemini-bridge always-on. - commands/trekreview.md: skip Pass 3 reasonableness + invoke coordinator normalization rule. - commands/trekexecute.md: gates_mode = closed (strict manifest-audit, main-merge pauses); flag override still wins. - agents/review-coordinator.md: Pass 3 high-effort normalization — substitute unknown rule_key with PLAN_EXECUTE_DRIFT, preserve original in original_rule_key. - .claude/projects/2026-05-13-trekflow-solo-lane/brief.md (gitignored, not committed): Non-Goal amendment locks low/high tiers; SC1 amendment authorizes resolver-invariant interpretation. - tests/lib/doc-consistency.test.mjs: +4 pins for the "### High-effort behavior (v5.1.1)" heading per command. Tests: 578 pass, 0 fail, 2 skipped (+4 from 574). Closes #7 (operator-gated decisions captured + coordinator normalization landed). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
450 lines
18 KiB
Markdown
450 lines
18 KiB
Markdown
---
|
||
name: trekreview
|
||
description: |
|
||
Independent post-hoc review of delivered code against the brief. Produces
|
||
review.md with severity-tagged findings (BLOCKER/MAJOR/MINOR/SUGGESTION)
|
||
per Handover 6 (review → plan).
|
||
argument-hint: "--project <dir> [--since <ref>] [--quick] [--validate] [--dry-run]"
|
||
model: opus
|
||
allowed-tools: Agent, Read, Glob, Grep, Write, Edit, Bash, AskUserQuestion
|
||
---
|
||
|
||
# Ultrareview Local v1.0
|
||
|
||
Independent post-hoc review of code delivered by `/trekexecute`
|
||
against the contract in `brief.md`. Produces `review.md` — a structured
|
||
artifact with severity-tagged findings that `/trekplan --brief
|
||
review.md` can consume as plan input (Handover 6).
|
||
|
||
Pipeline position:
|
||
|
||
```
|
||
/trekbrief → brief.md
|
||
/trekresearch → research/*.md
|
||
/trekplan → plan.md
|
||
/trekexecute → progress.json (+ commits)
|
||
/trekreview → review.md (this command)
|
||
```
|
||
|
||
The review is **independent**: each reviewer runs without cross-feeding,
|
||
and the coordinator applies BOUNDED operations only. Synthesis-level
|
||
inference across files is forbidden in v1.0 (Judge Agent pattern).
|
||
|
||
See `agents/review-orchestrator.md` for the canonical workflow this
|
||
command executes inline.
|
||
|
||
## Phase 1 — Parse mode and validate input
|
||
|
||
Parse `$ARGUMENTS` via the shared arg-parser:
|
||
|
||
```bash
|
||
node ${CLAUDE_PLUGIN_ROOT}/lib/parsers/arg-parser.mjs --command trekreview "$@"
|
||
```
|
||
|
||
The parser recognizes these flags (see `lib/parsers/arg-parser.mjs`
|
||
FLAG_SCHEMA `trekreview` entry):
|
||
|
||
| Flag | Type | Purpose |
|
||
|------|------|---------|
|
||
| `--project <dir>` | valued | Required. Path to trekplan project folder containing `brief.md`. |
|
||
| `--since <ref>` | valued | Optional. Override "before" SHA for the diff. Validated via `git rev-parse --verify`. |
|
||
| `--quick` | boolean | Skip the brief-conformance pass; run only the code-correctness reviewer; skip the coordinator's reasonableness filter. |
|
||
| `--validate` | boolean | Schema-only check on existing `{project_dir}/review.md`. No LLM calls. |
|
||
| `--dry-run` | boolean | Print the discovered scope and triage map. Skip writes. |
|
||
| `--fg` | boolean | No-op alias (foreground is default). |
|
||
|
||
Resolution:
|
||
1. If `--project` is missing, print usage and stop:
|
||
```
|
||
Error: --project <dir> is required.
|
||
Usage: /trekreview --project <dir> [--since <ref>] [--quick] [--validate] [--dry-run]
|
||
```
|
||
2. Trim trailing slash from `{dir}`. Set:
|
||
- `project_dir = {dir}`
|
||
- `brief_path = {dir}/brief.md`
|
||
- `review_path = {dir}/review.md`
|
||
3. If `{dir}` does not exist or `{dir}/brief.md` is missing:
|
||
```
|
||
Error: project directory not initialized. Run /trekbrief first.
|
||
Missing: {dir}/brief.md
|
||
```
|
||
|
||
Set `mode`:
|
||
- `validate` if `--validate` is set (overrides everything else; skip to Phase 8.5).
|
||
- `dry-run` if `--dry-run` is set.
|
||
- `quick` if `--quick` is set.
|
||
- `default` otherwise.
|
||
|
||
## Phase 2 — Validate brief
|
||
|
||
Run the brief validator in soft mode — the brief is upstream context, not
|
||
something this command produces, so partial grades are acceptable as long
|
||
as the file is parseable:
|
||
|
||
```bash
|
||
node ${CLAUDE_PLUGIN_ROOT}/lib/validators/brief-validator.mjs --soft --json "{brief_path}"
|
||
|
||
# v5.1.1 — resolve the review-phase brief signal. The JSON is captured as
|
||
# phase_signal_result and used in Phase 7 at the reviewer-launch site to
|
||
# inject the brief-resolved model.
|
||
node ${CLAUDE_PLUGIN_ROOT}/lib/profiles/phase-signal-resolver.mjs --brief "{brief_path}" --phase review --json
|
||
```
|
||
|
||
Read the JSON output. If `valid: false` AND any error has code
|
||
`BRIEF_MISSING_REQUIRED_FIELD` or `FRONTMATTER_PARSE_ERROR`: stop and
|
||
ask the user to re-run `/trekbrief`. Other soft errors become
|
||
warnings in the review's Executive Summary.
|
||
|
||
Read the brief frontmatter. Capture for review.md:
|
||
- `task` → review frontmatter `task`
|
||
- `slug` → review frontmatter `slug`
|
||
- `project_dir` → review frontmatter `project_dir` (defaults to the
|
||
CLI `--project` value when missing)
|
||
|
||
## Phase 3 — Discover scope SHA range
|
||
|
||
Determine the "before" SHA that bounds the review:
|
||
|
||
1. **`--since <ref>` override** — if set, validate via:
|
||
```bash
|
||
git rev-parse --verify "$since_ref"
|
||
```
|
||
On failure: print `Error: --since ref is not a valid git revision: {ref}` and stop.
|
||
Set `before_sha = $(git rev-parse --verify "$since_ref")`.
|
||
|
||
2. **Preferred path** — read `{project_dir}/progress.json` if it exists.
|
||
Extract `session_start_sha`. Validate it via `git rev-parse --verify`.
|
||
Set `before_sha = session_start_sha`.
|
||
|
||
3. **Fallback** — no `progress.json`. Use the brief's mtime to find the
|
||
most recent commit at or before the brief was written:
|
||
```bash
|
||
brief_mtime=$(stat -f %m "{brief_path}") # macOS; on Linux use stat -c %Y
|
||
before_sha=$(git log --until="@$brief_mtime" -n 1 --format=%H)
|
||
```
|
||
Emit a clear warning that gets surfaced in the review's Executive
|
||
Summary: "scope_sha_start unavailable — falling back to brief mtime
|
||
({timestamp}). Coverage may include unrelated commits."
|
||
|
||
Compute the "after" SHA: `after_sha=$(git rev-parse HEAD)`.
|
||
|
||
Capture working-tree changes (uncommitted at review time):
|
||
```bash
|
||
git diff --name-only "$before_sha".."$after_sha"
|
||
git diff --name-only HEAD # uncommitted (annotated [uncommitted])
|
||
```
|
||
|
||
The combined file list is the review scope. Note that the
|
||
`[uncommitted]` annotation is a **brief-level contract** — the brief's
|
||
Assumptions section declares this is allowed; the review surfaces it
|
||
explicitly in the Coverage table.
|
||
|
||
If the file count is `0`, write a one-line review.md noting "No diff
|
||
between {before_sha} and {after_sha}; nothing to review." Verdict: ALLOW.
|
||
Skip Phases 4–7. Continue to Phase 8 (validate + stats).
|
||
|
||
## Phase 4 — Triage gate (deterministic path-pattern classifier)
|
||
|
||
The triage gate is **deterministic** — no LLM judgment. It classifies
|
||
every file from Phase 3 into a treatment bucket:
|
||
|
||
| Treatment | When |
|
||
|-----------|------|
|
||
| `skip` | Matches `*.lock`, `*.svg`, `dist/**`, `build/**`, `node_modules/**`, OR the file's first 3 lines contain a generated-file marker (`@generated`, `Code generated by`, `DO NOT EDIT`). |
|
||
| `deep-review` | Matches `auth/**`, `crypto/**`, `**/security/**`, `hooks/**`. |
|
||
| `summary-only` | Default treatment for everything else. |
|
||
|
||
Hard refuse-with-suggestion gates — use `AskUserQuestion`:
|
||
|
||
```
|
||
if (reviewed_files_count > 100) → ask user
|
||
if (estimated_diff_tokens > 100000) → ask user
|
||
```
|
||
|
||
Token estimation: `wc -c "$diff_file" / 4` (rough proxy). Use
|
||
`AskUserQuestion` with the prompt:
|
||
|
||
> The diff under review is large (`{N}` files / `~{T}` tokens). Continue
|
||
> with the full scope, narrow with `--since <closer-ref>`, or stop?
|
||
|
||
Options:
|
||
1. **Continue** — proceed at this scope.
|
||
2. **Narrow** — print suggested `git log --oneline {before}..HEAD` so the
|
||
user can pick a closer ref, then stop.
|
||
3. **Stop** — cancel.
|
||
|
||
Record the treatment for every file. Files marked `skip` MUST appear in
|
||
the Coverage section of `review.md` — never silently drop them. Silent
|
||
drops are `COVERAGE_SILENT_SKIP` (MAJOR) per the rule catalogue.
|
||
|
||
If `mode == dry-run`: print the triage map and exit.
|
||
|
||
## Phase 5 — Launch parallel reviewers
|
||
|
||
Launch two reviewer agents **in parallel** via the Agent tool — one
|
||
message, multiple tool calls.
|
||
|
||
Reviewers run independently. Do NOT pre-feed findings between them.
|
||
|
||
| Agent | Mode-gated | Purpose |
|
||
|-------|------------|---------|
|
||
| `brief-conformance-reviewer` | Skipped in `quick` | Trace each Success Criterion + Non-Goal to delivered code. Emits findings tagged with rule_keys from the conformance/scope categories. |
|
||
| `code-correctness-reviewer` | Always runs | 7-dimension code review. Emits findings tagged with rule_keys from the correctness/security/maintenance/tests categories. |
|
||
|
||
Each reviewer prompt includes:
|
||
- **Diff context** — the unified diff from Phase 3, truncated per file
|
||
for files marked `summary-only`.
|
||
- **Triage map** — full file list with treatments. Reviewers must
|
||
respect `skip` decisions.
|
||
- **Brief path** — `{brief_path}` (read on demand; do not inline).
|
||
- **Rule catalogue** — reference to `lib/review/rule-catalogue.mjs`.
|
||
|
||
Collect each reviewer's trailing JSON block (last fenced `json` block in
|
||
their output). Parse with `JSON.parse()`. On parse error, ask the agent
|
||
to re-emit the JSON only.
|
||
|
||
In `quick` mode, launch only `code-correctness-reviewer`. The Executive
|
||
Summary will note the brief-conformance pass was skipped.
|
||
|
||
## Phase 6 — Coordinator dedup + verdict
|
||
|
||
Launch `review-coordinator` (Agent tool) with the merged findings array
|
||
from Phase 5 plus the triage map, brief metadata, and SHA range.
|
||
|
||
The coordinator runs the 4-pass process documented in
|
||
`agents/review-coordinator.md`:
|
||
|
||
1. **Dedup** by `(file, line, rule_key)` triplet.
|
||
2. **HubSpot Judge filters** — Succinctness, Accuracy, Actionability.
|
||
3. **Cloudflare reasonableness** — drop speculative or catalogue-violating
|
||
findings (skipped in `quick` mode).
|
||
4. **Verdict** — BLOCK / WARN / ALLOW per the threshold table.
|
||
|
||
The coordinator's output is the full review.md content — frontmatter +
|
||
body sections + trailing JSON block. Do NOT re-run the reviewers based
|
||
on the coordinator's output.
|
||
|
||
## Phase 7 — Write review.md
|
||
|
||
Write the coordinator's output verbatim to:
|
||
|
||
```
|
||
{project_dir}/review.md
|
||
```
|
||
|
||
Create parent directories if they do not exist. Atomic write pattern:
|
||
write to a temp file, then rename. The frontmatter `findings:` field
|
||
must use **block-style YAML** (one ID per line, ` - ` prefix). The
|
||
parser at `lib/util/frontmatter.mjs` does not support flow-style arrays.
|
||
|
||
If `mode == dry-run`: skip the write; print the would-be path and the
|
||
first 60 lines of the rendered output.
|
||
|
||
## Phase 8 — Validate output + stats
|
||
|
||
Run the strict validator:
|
||
|
||
```bash
|
||
node ${CLAUDE_PLUGIN_ROOT}/lib/validators/review-validator.mjs --json "{review_path}"
|
||
```
|
||
|
||
If validation fails:
|
||
- For repairable errors (missing required body section, malformed
|
||
finding-ID, REVIEW_VERSION_FORMAT warning): repair in place — re-emit
|
||
the missing section, recompute the finding-ID, fix the version
|
||
string. Re-validate.
|
||
- For unrepairable errors (REVIEW_WRONG_TYPE, malformed frontmatter):
|
||
stop and ask the user to re-run; do not silently produce an invalid
|
||
review.md.
|
||
|
||
Append a stats line to `${CLAUDE_PLUGIN_DATA}/trekreview-stats.jsonl`
|
||
(create the file if it does not exist):
|
||
|
||
```json
|
||
{"ts":"{ISO-8601}","slug":"{slug}","verdict":"BLOCK|WARN|ALLOW","counts":{"BLOCKER":N,"MAJOR":N,"MINOR":N,"SUGGESTION":N},"reviewed_files_count":N,"mode":"default|quick|validate|dry-run","duration_ms":N}
|
||
```
|
||
|
||
If `${CLAUDE_PLUGIN_DATA}` is unset or not writable, skip stats silently.
|
||
Never let stats failures block the main workflow.
|
||
|
||
**Build the operator-annotation HTML.** After stats land, run:
|
||
|
||
```bash
|
||
ANNOT_HTML=$(node ${CLAUDE_PLUGIN_ROOT}/scripts/annotate.mjs "{review_path}" 2>&1)
|
||
```
|
||
|
||
`stdout` is the absolute path to the `.html` on success. The HTML renders
|
||
`review.md` with line numbers, lets the operator click any line to attach
|
||
their own note (not Claude-generated suggestions — the operator drives
|
||
every annotation), keeps a sidebar of all notes, persists state in
|
||
localStorage, and exposes a "Copy Prompt" button. If `annotate.mjs`
|
||
exits non-zero, surface a one-line warning and continue — the annotation
|
||
HTML is a convenience, not a gate.
|
||
|
||
## Phase 8.5 — Validate-only mode (`--validate`)
|
||
|
||
When `mode == validate`:
|
||
1. Skip Phases 3–7 entirely.
|
||
2. Run the strict validator on `{project_dir}/review.md`.
|
||
3. Print a one-line PASS/FAIL summary plus the JSON output on FAIL.
|
||
4. Exit 0 on PASS, 1 on FAIL. Never write to disk. Never call any agent.
|
||
|
||
## Phase 9 — Present summary
|
||
|
||
After the write succeeds, print:
|
||
|
||
```
|
||
## Ultrareview Complete
|
||
|
||
**Task:** {task}
|
||
**Mode:** {default | quick | dry-run}
|
||
**Brief:** {brief_path}
|
||
**Project:** {project_dir}
|
||
**Review:** {review_path}
|
||
**Annotation HTML:** file://{$ANNOT_HTML}
|
||
**Scope:** {before_sha}..{after_sha} ({reviewed_files_count} files)
|
||
**Verdict:** {BLOCK | WARN | ALLOW}
|
||
|
||
### Counts
|
||
- BLOCKER: {N}
|
||
- MAJOR: {N}
|
||
- MINOR: {N}
|
||
- SUGGESTION: {N}
|
||
|
||
### Top findings
|
||
- [{severity}] {title} ({file}:{line})
|
||
...
|
||
{up to 5 highest-severity findings}
|
||
|
||
────────────────────────────────────────────────────────────────────
|
||
To review and annotate the review, open it in a browser:
|
||
|
||
open file://{$ANNOT_HTML}
|
||
|
||
Click any line to add YOUR OWN note. The sidebar collects every note,
|
||
the "Copy Prompt" button gathers them into one structured prompt.
|
||
Paste that prompt back into this chat and Claude revises review.md
|
||
from your notes. Annotations persist in your browser if you close
|
||
the tab and reopen the same file.
|
||
────────────────────────────────────────────────────────────────────
|
||
|
||
You can also:
|
||
- Feed BLOCKER + MAJOR findings into a follow-up plan:
|
||
/trekplan --brief {review_path}
|
||
- Re-run with `--quick` for a faster correctness-only pass
|
||
- Re-run with `--since <ref>` to narrow scope
|
||
```
|
||
|
||
Per **Handover 6**, BLOCKER and MAJOR findings are consumed by
|
||
`/trekplan --brief review.md` to produce a remediation plan. The
|
||
review's frontmatter `findings:` list and the trailing JSON block are
|
||
the contract for that handover (see `docs/HANDOVER-CONTRACTS.md`).
|
||
|
||
## Profile (v4.1)
|
||
|
||
Accepts `--profile <name>` where `<name>` is `economy`, `balanced`, `premium`,
|
||
or a custom profile under `voyage-profiles/`. Default: `premium`.
|
||
|
||
Resolution order (per `lib/profiles/resolver.mjs`):
|
||
1. `--profile` flag (source: `flag`)
|
||
2. `VOYAGE_PROFILE` env-var (source: `env`)
|
||
3. `premium` default (source: `default`)
|
||
|
||
The selected profile drives `phase_models.review` — `economy` uses sonnet
|
||
for the brief-conformance + code-correctness reviewers; `balanced` and
|
||
`premium` use opus (review benefits from deeper reasoning).
|
||
|
||
Examples:
|
||
```
|
||
/trekreview --profile balanced --project .claude/projects/2026-05-09-add-auth
|
||
VOYAGE_PROFILE=premium /trekreview --project ...
|
||
```
|
||
|
||
Stats records emit `profile` and `profile_source`.
|
||
|
||
## Composition rule (v5.1)
|
||
|
||
Independent of the profile system. When `brief.md` carries
|
||
`phase_signals` (brief_version ≥ 2.1), each downstream phase resolves
|
||
effort + model as:
|
||
|
||
```
|
||
effort_for_phase = brief.phase_signals[<phase>]?.effort ?? 'standard'
|
||
model_for_phase = brief.phase_signals[<phase>]?.model ?? profile.phase_models[<phase>]
|
||
```
|
||
|
||
The brief signal wins per-phase when present; the profile fills any
|
||
gaps. Composition is mechanically resolved via
|
||
`node ${CLAUDE_PLUGIN_ROOT}/lib/profiles/phase-signal-resolver.mjs`
|
||
invoked in Phase 2; the resolved JSON is captured as `phase_signal_result`
|
||
and passed to `Agent` tool calls explicitly. The resolver controls only
|
||
the orchestrator and the model parameter at Agent-spawn sites — sub-agents
|
||
otherwise read `model:` from their own `agents/*.md` frontmatter (still
|
||
pinned to `opus`).
|
||
|
||
For `/trekreview` specifically: `effort == 'low'` activates the existing
|
||
`--quick`-equivalent code-path (skip the brief-conformance reviewer; run
|
||
correctness-only). `effort == 'standard'` (or absent) → no change.
|
||
`effort == 'high'` activates the high-effort behavior documented under
|
||
`### High-effort behavior (v5.1.1)` below.
|
||
|
||
### Sequencing gate surface
|
||
|
||
Phase 1 already calls `brief-validator.mjs --soft` against `{brief_path}`.
|
||
If the validator returns `BRIEF_V51_MISSING_SIGNALS` in `errors`
|
||
(brief_version ≥ 2.1 without `phase_signals` or `phase_signals_partial:
|
||
true`), halt with: `Brief is brief_version 2.1 but does not carry
|
||
phase_signals — re-run /trekbrief to commit them (Phase 3.5).`
|
||
Enforcement is validator-only; this surface just makes the friendly hint
|
||
readable.
|
||
|
||
### High-effort behavior (v5.1.1)
|
||
|
||
When `phase_signal_result.effort == 'high'` for the `review` phase,
|
||
skip Pass 3 (Cloudflare reasonableness filter) in
|
||
`agents/review-coordinator.md`. Passes 1, 2, and 4 still run.
|
||
Rationale: high-effort review trusts the operator to weigh borderline
|
||
findings rather than have the coordinator drop them. To prevent
|
||
unknown `rule_key` values from polluting downstream remediation plans
|
||
(Handover 6), the coordinator applies its v5.1.1 high-effort
|
||
normalization rule — substituting unknown `rule_key` values with the
|
||
literal string `PLAN_EXECUTE_DRIFT` (the most general drift category
|
||
in the 12-entry catalogue) and preserving the original in
|
||
`original_rule_key`. See `agents/review-coordinator.md` § Pass 3
|
||
"High-effort normalization (v5.1.1)" for the full normalization spec.
|
||
|
||
Standard effort (or absent): run all 4 passes as usual.
|
||
Low effort: skip the brief-conformance reviewer entirely (existing
|
||
`--quick`-equivalent code-path).
|
||
|
||
## Hard rules
|
||
|
||
- **Brief is the contract.** Every finding in the review traces to a
|
||
brief section via `brief_ref`, except `SCOPE_CREEP_BUILT` (which
|
||
traces to "no anchor"). Conformance is the conformance reviewer's
|
||
job — code-correctness findings carry generic anchors like
|
||
`"NFR — code correctness"`.
|
||
- **Independent reviewers.** Do NOT cross-feed findings between
|
||
brief-conformance-reviewer and code-correctness-reviewer. The
|
||
coordinator is the only place where outputs combine.
|
||
- **Bounded coordination.** Synthesis-level inference across files is
|
||
forbidden in v1.0. The coordinator dedups, filters, and computes the
|
||
verdict — nothing more.
|
||
- **Triage map respected.** Files marked `skip` MUST appear in the
|
||
Coverage section. Silent drops are `COVERAGE_SILENT_SKIP` (MAJOR).
|
||
- **Block-style YAML for findings list.** The frontmatter parser does
|
||
not support flow-style arrays. `findings: [a, b]` is broken; use
|
||
`findings:\n - a\n - b`.
|
||
- **Refuse-with-suggestion above 100 files / 100K tokens.** Never run
|
||
blind on a giant diff. Use AskUserQuestion to surface the gate.
|
||
- **Cost.** Sub-agents use their pinned `model:` frontmatter (currently `opus`).
|
||
When `phase_signals[<phase>].model` is set, the orchestrator AND Agent-spawn
|
||
sites use the resolved model (`phase_signal_result.model`) for that phase.
|
||
Frontmatter is the default; brief signal is the per-phase override.
|
||
- **Privacy.** Never log secrets, tokens, or credentials in review.md.
|
||
Findings citing files with secret-like content must redact the secret
|
||
in the `detail` field.
|
||
- **Honesty.** If the diff is trivially small or all-skip, say so. Do
|
||
not pad findings to make the review look thorough.
|
||
- **No production code.** This command never runs production code, never
|
||
writes to anything outside `{project_dir}` and `${CLAUDE_PLUGIN_DATA}`.
|