v5.0.0 added a read-only HTML render. v5.0.1 deleted that and pointed at /playground document-critique, which pre-generates Claude's suggestions and asks the operator to approve/reject them. The operator asked for the opposite — a surface where THEY drive every annotation. v5.0.2 lands it. scripts/annotate.mjs (~430 lines, zero deps) takes any artifact .md and writes a self-contained HTML next to it. The HTML renders the document with line numbers, lets the operator click any line to add their own note (inline textarea, save with Cmd+Enter or button), keeps a sidebar of all notes (editable + deletable + persisted in localStorage per artifact path), and exposes Copy Prompt to gather every note into one structured prompt. Operator copies, pastes back, Claude revises the .md. The three producing commands now run annotate.mjs at their last step and print the file:// link with explicit "Click any line to add YOUR OWN note" instructions. The v5.0.1 /playground document-critique line is gone. npm test green: 516 tests, 514 pass, 0 fail, 2 skipped. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
15 KiB
| name | description | argument-hint | model | allowed-tools |
|---|---|---|---|---|
| trekreview | Independent post-hoc review of delivered code against the brief. Produces review.md with severity-tagged findings (BLOCKER/MAJOR/MINOR/SUGGESTION) per Handover 6 (review → plan). | --project <dir> [--since <ref>] [--quick] [--validate] [--dry-run] | opus | Agent, Read, Glob, Grep, Write, Edit, Bash, AskUserQuestion |
Ultrareview Local v1.0
Independent post-hoc review of code delivered by /trekexecute
against the contract in brief.md. Produces review.md — a structured
artifact with severity-tagged findings that /trekplan --brief review.md can consume as plan input (Handover 6).
Pipeline position:
/trekbrief → brief.md
/trekresearch → research/*.md
/trekplan → plan.md
/trekexecute → progress.json (+ commits)
/trekreview → review.md (this command)
The review is independent: each reviewer runs without cross-feeding, and the coordinator applies BOUNDED operations only. Synthesis-level inference across files is forbidden in v1.0 (Judge Agent pattern).
See agents/review-orchestrator.md for the canonical workflow this
command executes inline.
Phase 1 — Parse mode and validate input
Parse $ARGUMENTS via the shared arg-parser:
node ${CLAUDE_PLUGIN_ROOT}/lib/parsers/arg-parser.mjs --command trekreview "$@"
The parser recognizes these flags (see lib/parsers/arg-parser.mjs
FLAG_SCHEMA trekreview entry):
| Flag | Type | Purpose |
|---|---|---|
--project <dir> |
valued | Required. Path to trekplan project folder containing brief.md. |
--since <ref> |
valued | Optional. Override "before" SHA for the diff. Validated via git rev-parse --verify. |
--quick |
boolean | Skip the brief-conformance pass; run only the code-correctness reviewer; skip the coordinator's reasonableness filter. |
--validate |
boolean | Schema-only check on existing {project_dir}/review.md. No LLM calls. |
--dry-run |
boolean | Print the discovered scope and triage map. Skip writes. |
--fg |
boolean | No-op alias (foreground is default). |
Resolution:
- If
--projectis missing, print usage and stop:Error: --project <dir> is required. Usage: /trekreview --project <dir> [--since <ref>] [--quick] [--validate] [--dry-run] - Trim trailing slash from
{dir}. Set:project_dir = {dir}brief_path = {dir}/brief.mdreview_path = {dir}/review.md
- If
{dir}does not exist or{dir}/brief.mdis missing:Error: project directory not initialized. Run /trekbrief first. Missing: {dir}/brief.md
Set mode:
validateif--validateis set (overrides everything else; skip to Phase 8.5).dry-runif--dry-runis set.quickif--quickis set.defaultotherwise.
Phase 2 — Validate brief
Run the brief validator in soft mode — the brief is upstream context, not something this command produces, so partial grades are acceptable as long as the file is parseable:
node ${CLAUDE_PLUGIN_ROOT}/lib/validators/brief-validator.mjs --soft --json "{brief_path}"
Read the JSON output. If valid: false AND any error has code
BRIEF_MISSING_REQUIRED_FIELD or FRONTMATTER_PARSE_ERROR: stop and
ask the user to re-run /trekbrief. Other soft errors become
warnings in the review's Executive Summary.
Read the brief frontmatter. Capture for review.md:
task→ review frontmattertaskslug→ review frontmatterslugproject_dir→ review frontmatterproject_dir(defaults to the CLI--projectvalue when missing)
Phase 3 — Discover scope SHA range
Determine the "before" SHA that bounds the review:
-
--since <ref>override — if set, validate via:git rev-parse --verify "$since_ref"On failure: print
Error: --since ref is not a valid git revision: {ref}and stop. Setbefore_sha = $(git rev-parse --verify "$since_ref"). -
Preferred path — read
{project_dir}/progress.jsonif it exists. Extractsession_start_sha. Validate it viagit rev-parse --verify. Setbefore_sha = session_start_sha. -
Fallback — no
progress.json. Use the brief's mtime to find the most recent commit at or before the brief was written:brief_mtime=$(stat -f %m "{brief_path}") # macOS; on Linux use stat -c %Y before_sha=$(git log --until="@$brief_mtime" -n 1 --format=%H)Emit a clear warning that gets surfaced in the review's Executive Summary: "scope_sha_start unavailable — falling back to brief mtime ({timestamp}). Coverage may include unrelated commits."
Compute the "after" SHA: after_sha=$(git rev-parse HEAD).
Capture working-tree changes (uncommitted at review time):
git diff --name-only "$before_sha".."$after_sha"
git diff --name-only HEAD # uncommitted (annotated [uncommitted])
The combined file list is the review scope. Note that the
[uncommitted] annotation is a brief-level contract — the brief's
Assumptions section declares this is allowed; the review surfaces it
explicitly in the Coverage table.
If the file count is 0, write a one-line review.md noting "No diff
between {before_sha} and {after_sha}; nothing to review." Verdict: ALLOW.
Skip Phases 4–7. Continue to Phase 8 (validate + stats).
Phase 4 — Triage gate (deterministic path-pattern classifier)
The triage gate is deterministic — no LLM judgment. It classifies every file from Phase 3 into a treatment bucket:
| Treatment | When |
|---|---|
skip |
Matches *.lock, *.svg, dist/**, build/**, node_modules/**, OR the file's first 3 lines contain a generated-file marker (@generated, Code generated by, DO NOT EDIT). |
deep-review |
Matches auth/**, crypto/**, **/security/**, hooks/**. |
summary-only |
Default treatment for everything else. |
Hard refuse-with-suggestion gates — use AskUserQuestion:
if (reviewed_files_count > 100) → ask user
if (estimated_diff_tokens > 100000) → ask user
Token estimation: wc -c "$diff_file" / 4 (rough proxy). Use
AskUserQuestion with the prompt:
The diff under review is large (
{N}files /~{T}tokens). Continue with the full scope, narrow with--since <closer-ref>, or stop?
Options:
- Continue — proceed at this scope.
- Narrow — print suggested
git log --oneline {before}..HEADso the user can pick a closer ref, then stop. - Stop — cancel.
Record the treatment for every file. Files marked skip MUST appear in
the Coverage section of review.md — never silently drop them. Silent
drops are COVERAGE_SILENT_SKIP (MAJOR) per the rule catalogue.
If mode == dry-run: print the triage map and exit.
Phase 5 — Launch parallel reviewers
Launch two reviewer agents in parallel via the Agent tool — one message, multiple tool calls.
Reviewers run independently. Do NOT pre-feed findings between them.
| Agent | Mode-gated | Purpose |
|---|---|---|
brief-conformance-reviewer |
Skipped in quick |
Trace each Success Criterion + Non-Goal to delivered code. Emits findings tagged with rule_keys from the conformance/scope categories. |
code-correctness-reviewer |
Always runs | 7-dimension code review. Emits findings tagged with rule_keys from the correctness/security/maintenance/tests categories. |
Each reviewer prompt includes:
- Diff context — the unified diff from Phase 3, truncated per file
for files marked
summary-only. - Triage map — full file list with treatments. Reviewers must
respect
skipdecisions. - Brief path —
{brief_path}(read on demand; do not inline). - Rule catalogue — reference to
lib/review/rule-catalogue.mjs.
Collect each reviewer's trailing JSON block (last fenced json block in
their output). Parse with JSON.parse(). On parse error, ask the agent
to re-emit the JSON only.
In quick mode, launch only code-correctness-reviewer. The Executive
Summary will note the brief-conformance pass was skipped.
Phase 6 — Coordinator dedup + verdict
Launch review-coordinator (Agent tool) with the merged findings array
from Phase 5 plus the triage map, brief metadata, and SHA range.
The coordinator runs the 4-pass process documented in
agents/review-coordinator.md:
- Dedup by
(file, line, rule_key)triplet. - HubSpot Judge filters — Succinctness, Accuracy, Actionability.
- Cloudflare reasonableness — drop speculative or catalogue-violating
findings (skipped in
quickmode). - Verdict — BLOCK / WARN / ALLOW per the threshold table.
The coordinator's output is the full review.md content — frontmatter + body sections + trailing JSON block. Do NOT re-run the reviewers based on the coordinator's output.
Phase 7 — Write review.md
Write the coordinator's output verbatim to:
{project_dir}/review.md
Create parent directories if they do not exist. Atomic write pattern:
write to a temp file, then rename. The frontmatter findings: field
must use block-style YAML (one ID per line, - prefix). The
parser at lib/util/frontmatter.mjs does not support flow-style arrays.
If mode == dry-run: skip the write; print the would-be path and the
first 60 lines of the rendered output.
Phase 8 — Validate output + stats
Run the strict validator:
node ${CLAUDE_PLUGIN_ROOT}/lib/validators/review-validator.mjs --json "{review_path}"
If validation fails:
- For repairable errors (missing required body section, malformed finding-ID, REVIEW_VERSION_FORMAT warning): repair in place — re-emit the missing section, recompute the finding-ID, fix the version string. Re-validate.
- For unrepairable errors (REVIEW_WRONG_TYPE, malformed frontmatter): stop and ask the user to re-run; do not silently produce an invalid review.md.
Append a stats line to ${CLAUDE_PLUGIN_DATA}/trekreview-stats.jsonl
(create the file if it does not exist):
{"ts":"{ISO-8601}","slug":"{slug}","verdict":"BLOCK|WARN|ALLOW","counts":{"BLOCKER":N,"MAJOR":N,"MINOR":N,"SUGGESTION":N},"reviewed_files_count":N,"mode":"default|quick|validate|dry-run","duration_ms":N}
If ${CLAUDE_PLUGIN_DATA} is unset or not writable, skip stats silently.
Never let stats failures block the main workflow.
Build the operator-annotation HTML. After stats land, run:
ANNOT_HTML=$(node ${CLAUDE_PLUGIN_ROOT}/scripts/annotate.mjs "{review_path}" 2>&1)
stdout is the absolute path to the .html on success. The HTML renders
review.md with line numbers, lets the operator click any line to attach
their own note (not Claude-generated suggestions — the operator drives
every annotation), keeps a sidebar of all notes, persists state in
localStorage, and exposes a "Copy Prompt" button. If annotate.mjs
exits non-zero, surface a one-line warning and continue — the annotation
HTML is a convenience, not a gate.
Phase 8.5 — Validate-only mode (--validate)
When mode == validate:
- Skip Phases 3–7 entirely.
- Run the strict validator on
{project_dir}/review.md. - Print a one-line PASS/FAIL summary plus the JSON output on FAIL.
- Exit 0 on PASS, 1 on FAIL. Never write to disk. Never call any agent.
Phase 9 — Present summary
After the write succeeds, print:
## Ultrareview Complete
**Task:** {task}
**Mode:** {default | quick | dry-run}
**Brief:** {brief_path}
**Project:** {project_dir}
**Review:** {review_path}
**Annotation HTML:** file://{$ANNOT_HTML}
**Scope:** {before_sha}..{after_sha} ({reviewed_files_count} files)
**Verdict:** {BLOCK | WARN | ALLOW}
### Counts
- BLOCKER: {N}
- MAJOR: {N}
- MINOR: {N}
- SUGGESTION: {N}
### Top findings
- [{severity}] {title} ({file}:{line})
...
{up to 5 highest-severity findings}
────────────────────────────────────────────────────────────────────
To review and annotate the review, open it in a browser:
open file://{$ANNOT_HTML}
Click any line to add YOUR OWN note. The sidebar collects every note,
the "Copy Prompt" button gathers them into one structured prompt.
Paste that prompt back into this chat and Claude revises review.md
from your notes. Annotations persist in your browser if you close
the tab and reopen the same file.
────────────────────────────────────────────────────────────────────
You can also:
- Feed BLOCKER + MAJOR findings into a follow-up plan:
/trekplan --brief {review_path}
- Re-run with `--quick` for a faster correctness-only pass
- Re-run with `--since <ref>` to narrow scope
Per Handover 6, BLOCKER and MAJOR findings are consumed by
/trekplan --brief review.md to produce a remediation plan. The
review's frontmatter findings: list and the trailing JSON block are
the contract for that handover (see docs/HANDOVER-CONTRACTS.md).
Profile (v4.1)
Accepts --profile <name> where <name> is economy, balanced, premium,
or a custom profile under voyage-profiles/. Default: premium.
Resolution order (per lib/profiles/resolver.mjs):
--profileflag (source:flag)VOYAGE_PROFILEenv-var (source:env)premiumdefault (source:default)
The selected profile drives phase_models.review — economy uses sonnet
for the brief-conformance + code-correctness reviewers; balanced and
premium use opus (review benefits from deeper reasoning).
Examples:
/trekreview --profile balanced --project .claude/projects/2026-05-09-add-auth
VOYAGE_PROFILE=premium /trekreview --project ...
Stats records emit profile and profile_source.
Hard rules
- Brief is the contract. Every finding in the review traces to a
brief section via
brief_ref, exceptSCOPE_CREEP_BUILT(which traces to "no anchor"). Conformance is the conformance reviewer's job — code-correctness findings carry generic anchors like"NFR — code correctness". - Independent reviewers. Do NOT cross-feed findings between brief-conformance-reviewer and code-correctness-reviewer. The coordinator is the only place where outputs combine.
- Bounded coordination. Synthesis-level inference across files is forbidden in v1.0. The coordinator dedups, filters, and computes the verdict — nothing more.
- Triage map respected. Files marked
skipMUST appear in the Coverage section. Silent drops areCOVERAGE_SILENT_SKIP(MAJOR). - Block-style YAML for findings list. The frontmatter parser does
not support flow-style arrays.
findings: [a, b]is broken; usefindings:\n - a\n - b. - Refuse-with-suggestion above 100 files / 100K tokens. Never run blind on a giant diff. Use AskUserQuestion to surface the gate.
- Cost. Sonnet for all sub-agents (reviewers + coordinator). Opus only runs in the main /trekreview command thread.
- Privacy. Never log secrets, tokens, or credentials in review.md.
Findings citing files with secret-like content must redact the secret
in the
detailfield. - Honesty. If the diff is trivially small or all-skip, say so. Do not pad findings to make the review look thorough.
- No production code. This command never runs production code, never
writes to anything outside
{project_dir}and${CLAUDE_PLUGIN_DATA}.