From b4e58e3fc299d96db33e70ae66a2771ba82d2cda Mon Sep 17 00:00:00 2001 From: Kjell Tore Guttormsen Date: Fri, 1 May 2026 16:56:47 +0200 Subject: [PATCH] feat(ultraplan-local): add commands/ultrareview-local.md --- .../commands/ultrareview-local.md | 340 ++++++++++++++++++ 1 file changed, 340 insertions(+) create mode 100644 plugins/ultraplan-local/commands/ultrareview-local.md diff --git a/plugins/ultraplan-local/commands/ultrareview-local.md b/plugins/ultraplan-local/commands/ultrareview-local.md new file mode 100644 index 0000000..c4d63fa --- /dev/null +++ b/plugins/ultraplan-local/commands/ultrareview-local.md @@ -0,0 +1,340 @@ +--- +name: ultrareview-local +description: | + Independent post-hoc review of delivered code against the brief. Produces + review.md with severity-tagged findings (BLOCKER/MAJOR/MINOR/SUGGESTION) + per Handover 6 (review → plan). +argument-hint: "--project [--since ] [--quick] [--validate] [--dry-run]" +model: opus +allowed-tools: Agent, Read, Glob, Grep, Write, Edit, Bash, AskUserQuestion +--- + +# Ultrareview Local v1.0 + +Independent post-hoc review of code delivered by `/ultraexecute-local` +against the contract in `brief.md`. Produces `review.md` — a structured +artifact with severity-tagged findings that `/ultraplan-local --brief +review.md` can consume as plan input (Handover 6). + +Pipeline position: + +``` +/ultrabrief-local → brief.md +/ultraresearch-local → research/*.md +/ultraplan-local → plan.md +/ultraexecute-local → progress.json (+ commits) +/ultrareview-local → review.md (this command) +``` + +The review is **independent**: each reviewer runs without cross-feeding, +and the coordinator applies BOUNDED operations only. Synthesis-level +inference across files is forbidden in v1.0 (Judge Agent pattern). + +See `agents/review-orchestrator.md` for the canonical workflow this +command executes inline. + +## Phase 1 — Parse mode and validate input + +Parse `$ARGUMENTS` via the shared arg-parser: + +```bash +node ${CLAUDE_PLUGIN_ROOT}/lib/parsers/arg-parser.mjs --command ultrareview "$@" +``` + +The parser recognizes these flags (see `lib/parsers/arg-parser.mjs` +FLAG_SCHEMA `ultrareview` entry): + +| Flag | Type | Purpose | +|------|------|---------| +| `--project ` | valued | Required. Path to ultraplan-local project folder containing `brief.md`. | +| `--since ` | valued | Optional. Override "before" SHA for the diff. Validated via `git rev-parse --verify`. | +| `--quick` | boolean | Skip the brief-conformance pass; run only the code-correctness reviewer; skip the coordinator's reasonableness filter. | +| `--validate` | boolean | Schema-only check on existing `{project_dir}/review.md`. No LLM calls. | +| `--dry-run` | boolean | Print the discovered scope and triage map. Skip writes. | +| `--fg` | boolean | No-op alias (foreground is default). | + +Resolution: +1. If `--project` is missing, print usage and stop: + ``` + Error: --project is required. + Usage: /ultrareview-local --project [--since ] [--quick] [--validate] [--dry-run] + ``` +2. Trim trailing slash from `{dir}`. Set: + - `project_dir = {dir}` + - `brief_path = {dir}/brief.md` + - `review_path = {dir}/review.md` +3. If `{dir}` does not exist or `{dir}/brief.md` is missing: + ``` + Error: project directory not initialized. Run /ultrabrief-local first. + Missing: {dir}/brief.md + ``` + +Set `mode`: +- `validate` if `--validate` is set (overrides everything else; skip to Phase 8.5). +- `dry-run` if `--dry-run` is set. +- `quick` if `--quick` is set. +- `default` otherwise. + +## Phase 2 — Validate brief + +Run the brief validator in soft mode — the brief is upstream context, not +something this command produces, so partial grades are acceptable as long +as the file is parseable: + +```bash +node ${CLAUDE_PLUGIN_ROOT}/lib/validators/brief-validator.mjs --soft --json "{brief_path}" +``` + +Read the JSON output. If `valid: false` AND any error has code +`BRIEF_MISSING_REQUIRED_FIELD` or `FRONTMATTER_PARSE_ERROR`: stop and +ask the user to re-run `/ultrabrief-local`. Other soft errors become +warnings in the review's Executive Summary. + +Read the brief frontmatter. Capture for review.md: +- `task` → review frontmatter `task` +- `slug` → review frontmatter `slug` +- `project_dir` → review frontmatter `project_dir` (defaults to the + CLI `--project` value when missing) + +## Phase 3 — Discover scope SHA range + +Determine the "before" SHA that bounds the review: + +1. **`--since ` override** — if set, validate via: + ```bash + git rev-parse --verify "$since_ref" + ``` + On failure: print `Error: --since ref is not a valid git revision: {ref}` and stop. + Set `before_sha = $(git rev-parse --verify "$since_ref")`. + +2. **Preferred path** — read `{project_dir}/progress.json` if it exists. + Extract `session_start_sha`. Validate it via `git rev-parse --verify`. + Set `before_sha = session_start_sha`. + +3. **Fallback** — no `progress.json`. Use the brief's mtime to find the + most recent commit at or before the brief was written: + ```bash + brief_mtime=$(stat -f %m "{brief_path}") # macOS; on Linux use stat -c %Y + before_sha=$(git log --until="@$brief_mtime" -n 1 --format=%H) + ``` + Emit a clear warning that gets surfaced in the review's Executive + Summary: "scope_sha_start unavailable — falling back to brief mtime + ({timestamp}). Coverage may include unrelated commits." + +Compute the "after" SHA: `after_sha=$(git rev-parse HEAD)`. + +Capture working-tree changes (uncommitted at review time): +```bash +git diff --name-only "$before_sha".."$after_sha" +git diff --name-only HEAD # uncommitted (annotated [uncommitted]) +``` + +The combined file list is the review scope. Note that the +`[uncommitted]` annotation is a **brief-level contract** — the brief's +Assumptions section declares this is allowed; the review surfaces it +explicitly in the Coverage table. + +If the file count is `0`, write a one-line review.md noting "No diff +between {before_sha} and {after_sha}; nothing to review." Verdict: ALLOW. +Skip Phases 4–7. Continue to Phase 8 (validate + stats). + +## Phase 4 — Triage gate (deterministic path-pattern classifier) + +The triage gate is **deterministic** — no LLM judgment. It classifies +every file from Phase 3 into a treatment bucket: + +| Treatment | When | +|-----------|------| +| `skip` | Matches `*.lock`, `*.svg`, `dist/**`, `build/**`, `node_modules/**`, OR the file's first 3 lines contain a generated-file marker (`@generated`, `Code generated by`, `DO NOT EDIT`). | +| `deep-review` | Matches `auth/**`, `crypto/**`, `**/security/**`, `hooks/**`. | +| `summary-only` | Default treatment for everything else. | + +Hard refuse-with-suggestion gates — use `AskUserQuestion`: + +``` +if (reviewed_files_count > 100) → ask user +if (estimated_diff_tokens > 100000) → ask user +``` + +Token estimation: `wc -c "$diff_file" / 4` (rough proxy). Use +`AskUserQuestion` with the prompt: + +> The diff under review is large (`{N}` files / `~{T}` tokens). Continue +> with the full scope, narrow with `--since `, or stop? + +Options: +1. **Continue** — proceed at this scope. +2. **Narrow** — print suggested `git log --oneline {before}..HEAD` so the + user can pick a closer ref, then stop. +3. **Stop** — cancel. + +Record the treatment for every file. Files marked `skip` MUST appear in +the Coverage section of `review.md` — never silently drop them. Silent +drops are `COVERAGE_SILENT_SKIP` (MAJOR) per the rule catalogue. + +If `mode == dry-run`: print the triage map and exit. + +## Phase 5 — Launch parallel reviewers + +Launch two reviewer agents **in parallel** via the Agent tool — one +message, multiple tool calls. + +Reviewers run independently. Do NOT pre-feed findings between them. + +| Agent | Mode-gated | Purpose | +|-------|------------|---------| +| `brief-conformance-reviewer` | Skipped in `quick` | Trace each Success Criterion + Non-Goal to delivered code. Emits findings tagged with rule_keys from the conformance/scope categories. | +| `code-correctness-reviewer` | Always runs | 7-dimension code review. Emits findings tagged with rule_keys from the correctness/security/maintenance/tests categories. | + +Each reviewer prompt includes: +- **Diff context** — the unified diff from Phase 3, truncated per file + for files marked `summary-only`. +- **Triage map** — full file list with treatments. Reviewers must + respect `skip` decisions. +- **Brief path** — `{brief_path}` (read on demand; do not inline). +- **Rule catalogue** — reference to `lib/review/rule-catalogue.mjs`. + +Collect each reviewer's trailing JSON block (last fenced `json` block in +their output). Parse with `JSON.parse()`. On parse error, ask the agent +to re-emit the JSON only. + +In `quick` mode, launch only `code-correctness-reviewer`. The Executive +Summary will note the brief-conformance pass was skipped. + +## Phase 6 — Coordinator dedup + verdict + +Launch `review-coordinator` (Agent tool) with the merged findings array +from Phase 5 plus the triage map, brief metadata, and SHA range. + +The coordinator runs the 4-pass process documented in +`agents/review-coordinator.md`: + +1. **Dedup** by `(file, line, rule_key)` triplet. +2. **HubSpot Judge filters** — Succinctness, Accuracy, Actionability. +3. **Cloudflare reasonableness** — drop speculative or catalogue-violating + findings (skipped in `quick` mode). +4. **Verdict** — BLOCK / WARN / ALLOW per the threshold table. + +The coordinator's output is the full review.md content — frontmatter + +body sections + trailing JSON block. Do NOT re-run the reviewers based +on the coordinator's output. + +## Phase 7 — Write review.md + +Write the coordinator's output verbatim to: + +``` +{project_dir}/review.md +``` + +Create parent directories if they do not exist. Atomic write pattern: +write to a temp file, then rename. The frontmatter `findings:` field +must use **block-style YAML** (one ID per line, ` - ` prefix). The +parser at `lib/util/frontmatter.mjs` does not support flow-style arrays. + +If `mode == dry-run`: skip the write; print the would-be path and the +first 60 lines of the rendered output. + +## Phase 8 — Validate output + stats + +Run the strict validator: + +```bash +node ${CLAUDE_PLUGIN_ROOT}/lib/validators/review-validator.mjs --json "{review_path}" +``` + +If validation fails: +- For repairable errors (missing required body section, malformed + finding-ID, REVIEW_VERSION_FORMAT warning): repair in place — re-emit + the missing section, recompute the finding-ID, fix the version + string. Re-validate. +- For unrepairable errors (REVIEW_WRONG_TYPE, malformed frontmatter): + stop and ask the user to re-run; do not silently produce an invalid + review.md. + +Append a stats line to `${CLAUDE_PLUGIN_DATA}/ultrareview-stats.jsonl` +(create the file if it does not exist): + +```json +{"ts":"{ISO-8601}","slug":"{slug}","verdict":"BLOCK|WARN|ALLOW","counts":{"BLOCKER":N,"MAJOR":N,"MINOR":N,"SUGGESTION":N},"reviewed_files_count":N,"mode":"default|quick|validate|dry-run","duration_ms":N} +``` + +If `${CLAUDE_PLUGIN_DATA}` is unset or not writable, skip stats silently. +Never let stats failures block the main workflow. + +## Phase 8.5 — Validate-only mode (`--validate`) + +When `mode == validate`: +1. Skip Phases 3–7 entirely. +2. Run the strict validator on `{project_dir}/review.md`. +3. Print a one-line PASS/FAIL summary plus the JSON output on FAIL. +4. Exit 0 on PASS, 1 on FAIL. Never write to disk. Never call any agent. + +## Phase 9 — Present summary + +After the write succeeds, print: + +``` +## Ultrareview Complete + +**Task:** {task} +**Mode:** {default | quick | dry-run} +**Brief:** {brief_path} +**Project:** {project_dir} +**Review:** {review_path} +**Scope:** {before_sha}..{after_sha} ({reviewed_files_count} files) +**Verdict:** {BLOCK | WARN | ALLOW} + +### Counts +- BLOCKER: {N} +- MAJOR: {N} +- MINOR: {N} +- SUGGESTION: {N} + +### Top findings +- [{severity}] {title} ({file}:{line}) + ... +{up to 5 highest-severity findings} + +You can: +- Read the full review at {review_path} +- Feed BLOCKER + MAJOR findings into a follow-up plan: + /ultraplan-local --brief {review_path} +- Re-run with `--quick` for a faster correctness-only pass +- Re-run with `--since ` to narrow scope +``` + +Per **Handover 6**, BLOCKER and MAJOR findings are consumed by +`/ultraplan-local --brief review.md` to produce a remediation plan. The +review's frontmatter `findings:` list and the trailing JSON block are +the contract for that handover (see `docs/HANDOVER-CONTRACTS.md`). + +## Hard rules + +- **Brief is the contract.** Every finding in the review traces to a + brief section via `brief_ref`, except `SCOPE_CREEP_BUILT` (which + traces to "no anchor"). Conformance is the conformance reviewer's + job — code-correctness findings carry generic anchors like + `"NFR — code correctness"`. +- **Independent reviewers.** Do NOT cross-feed findings between + brief-conformance-reviewer and code-correctness-reviewer. The + coordinator is the only place where outputs combine. +- **Bounded coordination.** Synthesis-level inference across files is + forbidden in v1.0. The coordinator dedups, filters, and computes the + verdict — nothing more. +- **Triage map respected.** Files marked `skip` MUST appear in the + Coverage section. Silent drops are `COVERAGE_SILENT_SKIP` (MAJOR). +- **Block-style YAML for findings list.** The frontmatter parser does + not support flow-style arrays. `findings: [a, b]` is broken; use + `findings:\n - a\n - b`. +- **Refuse-with-suggestion above 100 files / 100K tokens.** Never run + blind on a giant diff. Use AskUserQuestion to surface the gate. +- **Cost.** Sonnet for all sub-agents (reviewers + coordinator). Opus + only runs in the main /ultrareview-local command thread. +- **Privacy.** Never log secrets, tokens, or credentials in review.md. + Findings citing files with secret-like content must redact the secret + in the `detail` field. +- **Honesty.** If the diff is trivially small or all-skip, say so. Do + not pad findings to make the review look thorough. +- **No production code.** This command never runs production code, never + writes to anything outside `{project_dir}` and `${CLAUDE_PLUGIN_DATA}`.