v5.0.0 added a read-only HTML render. v5.0.1 deleted that and pointed at /playground document-critique, which pre-generates Claude's suggestions and asks the operator to approve/reject them. The operator asked for the opposite — a surface where THEY drive every annotation. v5.0.2 lands it. scripts/annotate.mjs (~430 lines, zero deps) takes any artifact .md and writes a self-contained HTML next to it. The HTML renders the document with line numbers, lets the operator click any line to add their own note (inline textarea, save with Cmd+Enter or button), keeps a sidebar of all notes (editable + deletable + persisted in localStorage per artifact path), and exposes Copy Prompt to gather every note into one structured prompt. Operator copies, pastes back, Claude revises the .md. The three producing commands now run annotate.mjs at their last step and print the file:// link with explicit "Click any line to add YOUR OWN note" instructions. The v5.0.1 /playground document-critique line is gone. npm test green: 516 tests, 514 pass, 0 fail, 2 skipped. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
388 lines
15 KiB
Markdown
388 lines
15 KiB
Markdown
---
|
||
name: trekreview
|
||
description: |
|
||
Independent post-hoc review of delivered code against the brief. Produces
|
||
review.md with severity-tagged findings (BLOCKER/MAJOR/MINOR/SUGGESTION)
|
||
per Handover 6 (review → plan).
|
||
argument-hint: "--project <dir> [--since <ref>] [--quick] [--validate] [--dry-run]"
|
||
model: opus
|
||
allowed-tools: Agent, Read, Glob, Grep, Write, Edit, Bash, AskUserQuestion
|
||
---
|
||
|
||
# Ultrareview Local v1.0
|
||
|
||
Independent post-hoc review of code delivered by `/trekexecute`
|
||
against the contract in `brief.md`. Produces `review.md` — a structured
|
||
artifact with severity-tagged findings that `/trekplan --brief
|
||
review.md` can consume as plan input (Handover 6).
|
||
|
||
Pipeline position:
|
||
|
||
```
|
||
/trekbrief → brief.md
|
||
/trekresearch → research/*.md
|
||
/trekplan → plan.md
|
||
/trekexecute → progress.json (+ commits)
|
||
/trekreview → review.md (this command)
|
||
```
|
||
|
||
The review is **independent**: each reviewer runs without cross-feeding,
|
||
and the coordinator applies BOUNDED operations only. Synthesis-level
|
||
inference across files is forbidden in v1.0 (Judge Agent pattern).
|
||
|
||
See `agents/review-orchestrator.md` for the canonical workflow this
|
||
command executes inline.
|
||
|
||
## Phase 1 — Parse mode and validate input
|
||
|
||
Parse `$ARGUMENTS` via the shared arg-parser:
|
||
|
||
```bash
|
||
node ${CLAUDE_PLUGIN_ROOT}/lib/parsers/arg-parser.mjs --command trekreview "$@"
|
||
```
|
||
|
||
The parser recognizes these flags (see `lib/parsers/arg-parser.mjs`
|
||
FLAG_SCHEMA `trekreview` entry):
|
||
|
||
| Flag | Type | Purpose |
|
||
|------|------|---------|
|
||
| `--project <dir>` | valued | Required. Path to trekplan project folder containing `brief.md`. |
|
||
| `--since <ref>` | valued | Optional. Override "before" SHA for the diff. Validated via `git rev-parse --verify`. |
|
||
| `--quick` | boolean | Skip the brief-conformance pass; run only the code-correctness reviewer; skip the coordinator's reasonableness filter. |
|
||
| `--validate` | boolean | Schema-only check on existing `{project_dir}/review.md`. No LLM calls. |
|
||
| `--dry-run` | boolean | Print the discovered scope and triage map. Skip writes. |
|
||
| `--fg` | boolean | No-op alias (foreground is default). |
|
||
|
||
Resolution:
|
||
1. If `--project` is missing, print usage and stop:
|
||
```
|
||
Error: --project <dir> is required.
|
||
Usage: /trekreview --project <dir> [--since <ref>] [--quick] [--validate] [--dry-run]
|
||
```
|
||
2. Trim trailing slash from `{dir}`. Set:
|
||
- `project_dir = {dir}`
|
||
- `brief_path = {dir}/brief.md`
|
||
- `review_path = {dir}/review.md`
|
||
3. If `{dir}` does not exist or `{dir}/brief.md` is missing:
|
||
```
|
||
Error: project directory not initialized. Run /trekbrief first.
|
||
Missing: {dir}/brief.md
|
||
```
|
||
|
||
Set `mode`:
|
||
- `validate` if `--validate` is set (overrides everything else; skip to Phase 8.5).
|
||
- `dry-run` if `--dry-run` is set.
|
||
- `quick` if `--quick` is set.
|
||
- `default` otherwise.
|
||
|
||
## Phase 2 — Validate brief
|
||
|
||
Run the brief validator in soft mode — the brief is upstream context, not
|
||
something this command produces, so partial grades are acceptable as long
|
||
as the file is parseable:
|
||
|
||
```bash
|
||
node ${CLAUDE_PLUGIN_ROOT}/lib/validators/brief-validator.mjs --soft --json "{brief_path}"
|
||
```
|
||
|
||
Read the JSON output. If `valid: false` AND any error has code
|
||
`BRIEF_MISSING_REQUIRED_FIELD` or `FRONTMATTER_PARSE_ERROR`: stop and
|
||
ask the user to re-run `/trekbrief`. Other soft errors become
|
||
warnings in the review's Executive Summary.
|
||
|
||
Read the brief frontmatter. Capture for review.md:
|
||
- `task` → review frontmatter `task`
|
||
- `slug` → review frontmatter `slug`
|
||
- `project_dir` → review frontmatter `project_dir` (defaults to the
|
||
CLI `--project` value when missing)
|
||
|
||
## Phase 3 — Discover scope SHA range
|
||
|
||
Determine the "before" SHA that bounds the review:
|
||
|
||
1. **`--since <ref>` override** — if set, validate via:
|
||
```bash
|
||
git rev-parse --verify "$since_ref"
|
||
```
|
||
On failure: print `Error: --since ref is not a valid git revision: {ref}` and stop.
|
||
Set `before_sha = $(git rev-parse --verify "$since_ref")`.
|
||
|
||
2. **Preferred path** — read `{project_dir}/progress.json` if it exists.
|
||
Extract `session_start_sha`. Validate it via `git rev-parse --verify`.
|
||
Set `before_sha = session_start_sha`.
|
||
|
||
3. **Fallback** — no `progress.json`. Use the brief's mtime to find the
|
||
most recent commit at or before the brief was written:
|
||
```bash
|
||
brief_mtime=$(stat -f %m "{brief_path}") # macOS; on Linux use stat -c %Y
|
||
before_sha=$(git log --until="@$brief_mtime" -n 1 --format=%H)
|
||
```
|
||
Emit a clear warning that gets surfaced in the review's Executive
|
||
Summary: "scope_sha_start unavailable — falling back to brief mtime
|
||
({timestamp}). Coverage may include unrelated commits."
|
||
|
||
Compute the "after" SHA: `after_sha=$(git rev-parse HEAD)`.
|
||
|
||
Capture working-tree changes (uncommitted at review time):
|
||
```bash
|
||
git diff --name-only "$before_sha".."$after_sha"
|
||
git diff --name-only HEAD # uncommitted (annotated [uncommitted])
|
||
```
|
||
|
||
The combined file list is the review scope. Note that the
|
||
`[uncommitted]` annotation is a **brief-level contract** — the brief's
|
||
Assumptions section declares this is allowed; the review surfaces it
|
||
explicitly in the Coverage table.
|
||
|
||
If the file count is `0`, write a one-line review.md noting "No diff
|
||
between {before_sha} and {after_sha}; nothing to review." Verdict: ALLOW.
|
||
Skip Phases 4–7. Continue to Phase 8 (validate + stats).
|
||
|
||
## Phase 4 — Triage gate (deterministic path-pattern classifier)
|
||
|
||
The triage gate is **deterministic** — no LLM judgment. It classifies
|
||
every file from Phase 3 into a treatment bucket:
|
||
|
||
| Treatment | When |
|
||
|-----------|------|
|
||
| `skip` | Matches `*.lock`, `*.svg`, `dist/**`, `build/**`, `node_modules/**`, OR the file's first 3 lines contain a generated-file marker (`@generated`, `Code generated by`, `DO NOT EDIT`). |
|
||
| `deep-review` | Matches `auth/**`, `crypto/**`, `**/security/**`, `hooks/**`. |
|
||
| `summary-only` | Default treatment for everything else. |
|
||
|
||
Hard refuse-with-suggestion gates — use `AskUserQuestion`:
|
||
|
||
```
|
||
if (reviewed_files_count > 100) → ask user
|
||
if (estimated_diff_tokens > 100000) → ask user
|
||
```
|
||
|
||
Token estimation: `wc -c "$diff_file" / 4` (rough proxy). Use
|
||
`AskUserQuestion` with the prompt:
|
||
|
||
> The diff under review is large (`{N}` files / `~{T}` tokens). Continue
|
||
> with the full scope, narrow with `--since <closer-ref>`, or stop?
|
||
|
||
Options:
|
||
1. **Continue** — proceed at this scope.
|
||
2. **Narrow** — print suggested `git log --oneline {before}..HEAD` so the
|
||
user can pick a closer ref, then stop.
|
||
3. **Stop** — cancel.
|
||
|
||
Record the treatment for every file. Files marked `skip` MUST appear in
|
||
the Coverage section of `review.md` — never silently drop them. Silent
|
||
drops are `COVERAGE_SILENT_SKIP` (MAJOR) per the rule catalogue.
|
||
|
||
If `mode == dry-run`: print the triage map and exit.
|
||
|
||
## Phase 5 — Launch parallel reviewers
|
||
|
||
Launch two reviewer agents **in parallel** via the Agent tool — one
|
||
message, multiple tool calls.
|
||
|
||
Reviewers run independently. Do NOT pre-feed findings between them.
|
||
|
||
| Agent | Mode-gated | Purpose |
|
||
|-------|------------|---------|
|
||
| `brief-conformance-reviewer` | Skipped in `quick` | Trace each Success Criterion + Non-Goal to delivered code. Emits findings tagged with rule_keys from the conformance/scope categories. |
|
||
| `code-correctness-reviewer` | Always runs | 7-dimension code review. Emits findings tagged with rule_keys from the correctness/security/maintenance/tests categories. |
|
||
|
||
Each reviewer prompt includes:
|
||
- **Diff context** — the unified diff from Phase 3, truncated per file
|
||
for files marked `summary-only`.
|
||
- **Triage map** — full file list with treatments. Reviewers must
|
||
respect `skip` decisions.
|
||
- **Brief path** — `{brief_path}` (read on demand; do not inline).
|
||
- **Rule catalogue** — reference to `lib/review/rule-catalogue.mjs`.
|
||
|
||
Collect each reviewer's trailing JSON block (last fenced `json` block in
|
||
their output). Parse with `JSON.parse()`. On parse error, ask the agent
|
||
to re-emit the JSON only.
|
||
|
||
In `quick` mode, launch only `code-correctness-reviewer`. The Executive
|
||
Summary will note the brief-conformance pass was skipped.
|
||
|
||
## Phase 6 — Coordinator dedup + verdict
|
||
|
||
Launch `review-coordinator` (Agent tool) with the merged findings array
|
||
from Phase 5 plus the triage map, brief metadata, and SHA range.
|
||
|
||
The coordinator runs the 4-pass process documented in
|
||
`agents/review-coordinator.md`:
|
||
|
||
1. **Dedup** by `(file, line, rule_key)` triplet.
|
||
2. **HubSpot Judge filters** — Succinctness, Accuracy, Actionability.
|
||
3. **Cloudflare reasonableness** — drop speculative or catalogue-violating
|
||
findings (skipped in `quick` mode).
|
||
4. **Verdict** — BLOCK / WARN / ALLOW per the threshold table.
|
||
|
||
The coordinator's output is the full review.md content — frontmatter +
|
||
body sections + trailing JSON block. Do NOT re-run the reviewers based
|
||
on the coordinator's output.
|
||
|
||
## Phase 7 — Write review.md
|
||
|
||
Write the coordinator's output verbatim to:
|
||
|
||
```
|
||
{project_dir}/review.md
|
||
```
|
||
|
||
Create parent directories if they do not exist. Atomic write pattern:
|
||
write to a temp file, then rename. The frontmatter `findings:` field
|
||
must use **block-style YAML** (one ID per line, ` - ` prefix). The
|
||
parser at `lib/util/frontmatter.mjs` does not support flow-style arrays.
|
||
|
||
If `mode == dry-run`: skip the write; print the would-be path and the
|
||
first 60 lines of the rendered output.
|
||
|
||
## Phase 8 — Validate output + stats
|
||
|
||
Run the strict validator:
|
||
|
||
```bash
|
||
node ${CLAUDE_PLUGIN_ROOT}/lib/validators/review-validator.mjs --json "{review_path}"
|
||
```
|
||
|
||
If validation fails:
|
||
- For repairable errors (missing required body section, malformed
|
||
finding-ID, REVIEW_VERSION_FORMAT warning): repair in place — re-emit
|
||
the missing section, recompute the finding-ID, fix the version
|
||
string. Re-validate.
|
||
- For unrepairable errors (REVIEW_WRONG_TYPE, malformed frontmatter):
|
||
stop and ask the user to re-run; do not silently produce an invalid
|
||
review.md.
|
||
|
||
Append a stats line to `${CLAUDE_PLUGIN_DATA}/trekreview-stats.jsonl`
|
||
(create the file if it does not exist):
|
||
|
||
```json
|
||
{"ts":"{ISO-8601}","slug":"{slug}","verdict":"BLOCK|WARN|ALLOW","counts":{"BLOCKER":N,"MAJOR":N,"MINOR":N,"SUGGESTION":N},"reviewed_files_count":N,"mode":"default|quick|validate|dry-run","duration_ms":N}
|
||
```
|
||
|
||
If `${CLAUDE_PLUGIN_DATA}` is unset or not writable, skip stats silently.
|
||
Never let stats failures block the main workflow.
|
||
|
||
**Build the operator-annotation HTML.** After stats land, run:
|
||
|
||
```bash
|
||
ANNOT_HTML=$(node ${CLAUDE_PLUGIN_ROOT}/scripts/annotate.mjs "{review_path}" 2>&1)
|
||
```
|
||
|
||
`stdout` is the absolute path to the `.html` on success. The HTML renders
|
||
`review.md` with line numbers, lets the operator click any line to attach
|
||
their own note (not Claude-generated suggestions — the operator drives
|
||
every annotation), keeps a sidebar of all notes, persists state in
|
||
localStorage, and exposes a "Copy Prompt" button. If `annotate.mjs`
|
||
exits non-zero, surface a one-line warning and continue — the annotation
|
||
HTML is a convenience, not a gate.
|
||
|
||
## Phase 8.5 — Validate-only mode (`--validate`)
|
||
|
||
When `mode == validate`:
|
||
1. Skip Phases 3–7 entirely.
|
||
2. Run the strict validator on `{project_dir}/review.md`.
|
||
3. Print a one-line PASS/FAIL summary plus the JSON output on FAIL.
|
||
4. Exit 0 on PASS, 1 on FAIL. Never write to disk. Never call any agent.
|
||
|
||
## Phase 9 — Present summary
|
||
|
||
After the write succeeds, print:
|
||
|
||
```
|
||
## Ultrareview Complete
|
||
|
||
**Task:** {task}
|
||
**Mode:** {default | quick | dry-run}
|
||
**Brief:** {brief_path}
|
||
**Project:** {project_dir}
|
||
**Review:** {review_path}
|
||
**Annotation HTML:** file://{$ANNOT_HTML}
|
||
**Scope:** {before_sha}..{after_sha} ({reviewed_files_count} files)
|
||
**Verdict:** {BLOCK | WARN | ALLOW}
|
||
|
||
### Counts
|
||
- BLOCKER: {N}
|
||
- MAJOR: {N}
|
||
- MINOR: {N}
|
||
- SUGGESTION: {N}
|
||
|
||
### Top findings
|
||
- [{severity}] {title} ({file}:{line})
|
||
...
|
||
{up to 5 highest-severity findings}
|
||
|
||
────────────────────────────────────────────────────────────────────
|
||
To review and annotate the review, open it in a browser:
|
||
|
||
open file://{$ANNOT_HTML}
|
||
|
||
Click any line to add YOUR OWN note. The sidebar collects every note,
|
||
the "Copy Prompt" button gathers them into one structured prompt.
|
||
Paste that prompt back into this chat and Claude revises review.md
|
||
from your notes. Annotations persist in your browser if you close
|
||
the tab and reopen the same file.
|
||
────────────────────────────────────────────────────────────────────
|
||
|
||
You can also:
|
||
- Feed BLOCKER + MAJOR findings into a follow-up plan:
|
||
/trekplan --brief {review_path}
|
||
- Re-run with `--quick` for a faster correctness-only pass
|
||
- Re-run with `--since <ref>` to narrow scope
|
||
```
|
||
|
||
Per **Handover 6**, BLOCKER and MAJOR findings are consumed by
|
||
`/trekplan --brief review.md` to produce a remediation plan. The
|
||
review's frontmatter `findings:` list and the trailing JSON block are
|
||
the contract for that handover (see `docs/HANDOVER-CONTRACTS.md`).
|
||
|
||
## Profile (v4.1)
|
||
|
||
Accepts `--profile <name>` where `<name>` is `economy`, `balanced`, `premium`,
|
||
or a custom profile under `voyage-profiles/`. Default: `premium`.
|
||
|
||
Resolution order (per `lib/profiles/resolver.mjs`):
|
||
1. `--profile` flag (source: `flag`)
|
||
2. `VOYAGE_PROFILE` env-var (source: `env`)
|
||
3. `premium` default (source: `default`)
|
||
|
||
The selected profile drives `phase_models.review` — `economy` uses sonnet
|
||
for the brief-conformance + code-correctness reviewers; `balanced` and
|
||
`premium` use opus (review benefits from deeper reasoning).
|
||
|
||
Examples:
|
||
```
|
||
/trekreview --profile balanced --project .claude/projects/2026-05-09-add-auth
|
||
VOYAGE_PROFILE=premium /trekreview --project ...
|
||
```
|
||
|
||
Stats records emit `profile` and `profile_source`.
|
||
|
||
## Hard rules
|
||
|
||
- **Brief is the contract.** Every finding in the review traces to a
|
||
brief section via `brief_ref`, except `SCOPE_CREEP_BUILT` (which
|
||
traces to "no anchor"). Conformance is the conformance reviewer's
|
||
job — code-correctness findings carry generic anchors like
|
||
`"NFR — code correctness"`.
|
||
- **Independent reviewers.** Do NOT cross-feed findings between
|
||
brief-conformance-reviewer and code-correctness-reviewer. The
|
||
coordinator is the only place where outputs combine.
|
||
- **Bounded coordination.** Synthesis-level inference across files is
|
||
forbidden in v1.0. The coordinator dedups, filters, and computes the
|
||
verdict — nothing more.
|
||
- **Triage map respected.** Files marked `skip` MUST appear in the
|
||
Coverage section. Silent drops are `COVERAGE_SILENT_SKIP` (MAJOR).
|
||
- **Block-style YAML for findings list.** The frontmatter parser does
|
||
not support flow-style arrays. `findings: [a, b]` is broken; use
|
||
`findings:\n - a\n - b`.
|
||
- **Refuse-with-suggestion above 100 files / 100K tokens.** Never run
|
||
blind on a giant diff. Use AskUserQuestion to surface the gate.
|
||
- **Cost.** Sonnet for all sub-agents (reviewers + coordinator). Opus
|
||
only runs in the main /trekreview command thread.
|
||
- **Privacy.** Never log secrets, tokens, or credentials in review.md.
|
||
Findings citing files with secret-like content must redact the secret
|
||
in the `detail` field.
|
||
- **Honesty.** If the diff is trivially small or all-skip, say so. Do
|
||
not pad findings to make the review look thorough.
|
||
- **No production code.** This command never runs production code, never
|
||
writes to anything outside `{project_dir}` and `${CLAUDE_PLUGIN_DATA}`.
|