From 29ee34113f9ef002b590ded28f5cc8ad69c41b7c Mon Sep 17 00:00:00 2001 From: Kjell Tore Guttormsen Date: Fri, 1 May 2026 16:50:51 +0200 Subject: [PATCH] feat(ultraplan-local): add agents/review-orchestrator.md --- .../agents/review-orchestrator.md | 248 ++++++++++++++++++ 1 file changed, 248 insertions(+) create mode 100644 plugins/ultraplan-local/agents/review-orchestrator.md diff --git a/plugins/ultraplan-local/agents/review-orchestrator.md b/plugins/ultraplan-local/agents/review-orchestrator.md new file mode 100644 index 0000000..6071a5b --- /dev/null +++ b/plugins/ultraplan-local/agents/review-orchestrator.md @@ -0,0 +1,248 @@ +--- +name: review-orchestrator +description: | + Inline reference (v3.2.0) — documents the review workflow that + /ultrareview-local executes in main context. This file is NOT spawned + as a sub-agent. The Claude Code harness does not expose the Agent tool + to sub-agents, so a background orchestrator launched with + run_in_background: true cannot spawn the reviewer swarm + (brief-conformance-reviewer, code-correctness-reviewer, review-coordinator) + and would degrade silently to single-context reasoning. The + /ultrareview-local command now orchestrates the phases below directly in + the main session. +model: opus +color: red +tools: ["Agent", "Read", "Glob", "Grep", "Write", "Edit", "Bash", "TaskCreate", "TaskUpdate"] +--- + + + +This document is the canonical workflow description for the ultrareview +pipeline as of v3.2.0. The `/ultrareview-local` command reads it as +reference and executes the phases below **inline in the main command +context**. It is not spawned as a background sub-agent — that mode would +silently lose the Agent tool and degrade the reviewer swarm to +single-context reasoning. + +The role of the "orchestrator" now belongs to the command markdown itself: +the main Opus session launches reviewer agents via the Agent tool, runs the +coordinator, validates the output, and writes review.md to disk. + +## Design principle: independent, then bounded + +Each reviewer runs independently — no cross-feeding of findings between +brief-conformance-reviewer and code-correctness-reviewer. The coordinator +then applies BOUNDED operations only: deduplication, severity ranking, +reasonableness filter. Synthesis-level inference across files is +explicitly forbidden in v1.0 (Judge Agent pattern). + +## Input + +You will receive a prompt containing: +- **Project dir** — path to the ultraplan-local project folder (the brief and + optional `progress.json` live here; the review will be written to + `{project_dir}/review.md`). +- **Brief path** — `{project_dir}/brief.md`. Read it; the brief is the + contract that bounds review scope. +- **Mode** — `default`, `quick`, `validate`, or `dry-run`. + - `default` — run the full pipeline. + - `quick` — skip the coordinator's reasonableness filter; use single + reviewer (code-correctness only) for faster turnaround. + - `validate` — schema-only check on existing review.md, no LLM calls. + - `dry-run` — print the discovered scope and triage map; skip writes. +- **Since-ref** (optional) — explicit `--since ` override for the SHA + range. Validated via `git rev-parse --verify `. +- **Plugin root** — for template access. + +Read the brief file first. It is the contract. Parse its frontmatter and +every section (Intent, Goal, Non-Goals, Constraints, Success Criteria, +Open Questions, Prior Attempts). + +## Your workflow + +Execute these phases in order. Do not skip phases. + +### Phase 1 — Parse mode and validate input + +Run the arg-parser via Bash: +``` +node ${CLAUDE_PLUGIN_ROOT}/lib/parsers/arg-parser.mjs --command ultrareview "$@" +``` + +Pull the structured flags from its JSON output. Reject unknown flags. If +`--project` is missing and a brief argument was not supplied directly, +print usage and stop. + +### Phase 2 — Validate brief + +Run the brief validator in soft mode (the brief was produced earlier in +the pipeline — we accept partial grades, we just want a parseable +contract): +``` +node ${CLAUDE_PLUGIN_ROOT}/lib/validators/brief-validator.mjs --soft --json {brief_path} +``` + +If `valid: false` with REQUIRED-field errors: stop, ask the user to +re-run `/ultrabrief-local` first. + +### Phase 3 — Discover scope SHA range + +Determine the range of commits this review covers. + +1. **Preferred path** — read `{project_dir}/progress.json` if it exists. + Extract `session_start_sha`. This is the "before" SHA. +2. **Fallback** — if no `progress.json`, use the brief's mtime to find the + most recent commit AT OR BEFORE the brief was written. Emit a clear + warning in the review's Executive Summary noting the fallback. +3. **Override** — `--since ` overrides the discovered "before" SHA. + Validate the ref with `git rev-parse --verify `. Reject if invalid. +4. The "after" SHA is `git rev-parse HEAD`. + +Compute the diff: +``` +git diff --name-only {before_sha}..{after_sha} +``` + +Add working-tree changes (uncommitted) with the `[uncommitted]` annotation +the brief contract specifies. The Coverage table marks them explicitly. + +### Phase 4 — Triage gate (path-pattern classifier) + +The triage gate is **deterministic** — no LLM judgment. It runs a +hardcoded path-pattern classifier over the file list from Phase 3 and +produces a treatment map: + +| Treatment | When | +|-----------|------| +| `skip` | Matches `*.lock`, `*.svg`, `dist/**`, `build/**`, `node_modules/**`, generated-file marker present in first 3 lines | +| `deep-review` | Matches `auth/**`, `crypto/**`, `**/security/**`, `hooks/**` | +| `summary-only` | Default treatment for everything else | + +Hard refuse-with-suggestion gates (use AskUserQuestion): +- > 100 files in the diff +- > 100,000 tokens of estimated diff content (`git diff` output size / 4) + +If gated, suggest narrowing the scope with `--since ` or +splitting the review across multiple commits. + +Record the treatment for every file. Files marked `skip` MUST appear in +the Coverage section of review.md — never silently drop them. A silent +drop is a `COVERAGE_SILENT_SKIP` finding emitted by the coordinator. + +### Phase 5 — Launch parallel reviewers + +Launch **two reviewer agents in parallel** using the Agent tool — one +message, multiple tool calls. + +Reviewers run independently. Do NOT pre-feed findings between them. The +coordinator handles cross-cutting decisions later. + +| Agent | Purpose | +|-------|---------| +| `brief-conformance-reviewer` | Trace each Success Criterion + Non-Goal to delivered code. Flag UNIMPLEMENTED_CRITERION, NON_GOAL_VIOLATED, BROKEN_SUCCESS_CRITERION, MISSING_BRIEF_REF, SCOPE_CREEP_BUILT, PLAN_EXECUTE_DRIFT. | +| `code-correctness-reviewer` | 7-dimension code review. Flag MISSING_ERROR_HANDLING, PLAN_EXECUTE_DRIFT, MISSING_TEST, PLACEHOLDER_IN_CODE, SECURITY_INJECTION, UNDECLARED_DEPENDENCY. | + +Each reviewer receives: +- **Diff context** — the unified diff from Phase 3 (truncated per file + for files marked `summary-only`). +- **Triage map** — full file list with treatments. Reviewers must respect + `skip` decisions — if they want to flag a skipped file they emit a + COVERAGE_SILENT_SKIP finding instead. +- **Brief path** — for re-reading; do not inline the full brief into the + prompt to keep token budgets honest. + +In `quick` mode, launch only `code-correctness-reviewer`. Skip the +brief-conformance pass; the coverage matrix will still appear in +review.md but it is structural, not behavioral. + +### Phase 6 — Coordinator dedup + verdict + +Launch `review-coordinator` with the merged findings array from Phase 5. +The coordinator runs a 4-pass process: + +1. **Dedup** by `(file, line, rule_key)` triplet — keep highest severity. +2. **HubSpot Judge filters** — drop findings failing Succinctness, + Accuracy, or Actionability. +3. **Cloudflare reasonableness** — drop speculative findings without a + `file:line` citation; drop findings whose `rule_key` is not in + `RULE_CATALOGUE`. +4. **Compute verdict** — `BLOCK` if `BLOCKER ≥ 1`, `WARN` if `MAJOR ≥ 1`, + else `ALLOW`. + +The coordinator's output is the full review.md content — frontmatter + +body sections + trailing JSON block — ready to write. + +In `quick` mode, skip pass 3 (reasonableness filter). Passes 1, 2, 4 +still run. + +### Phase 7 — Write review.md + +Use the destination from Phase 1: +- **With `--project`:** write to `{project_dir}/review.md`. + +Create parent directories if needed. The frontmatter `findings:` field +must use **block-style YAML** (one ID per line with ` - ` prefix). The +parser at `lib/util/frontmatter.mjs` does not accept flow-style arrays. + +The trailing JSON block in the body must be a valid `json` fenced code +block, last fenced block in the file, parseable by `JSON.parse()`. + +### Phase 8 — Validate + stats + +Run the review validator in strict mode: +``` +node ${CLAUDE_PLUGIN_ROOT}/lib/validators/review-validator.mjs --json {project_dir}/review.md +``` + +If validation fails, repair the file (most failures are fixable in place +— missing required frontmatter field, missing body section, malformed +finding-ID). Do NOT proceed if any REVIEW_REQUIRED_FRONTMATTER field is +missing. + +Append a stats line to `${CLAUDE_PLUGIN_DATA}/ultrareview-stats.jsonl`: +```json +{"ts":"...","slug":"...","verdict":"BLOCK|WARN|ALLOW","counts":{"BLOCKER":N,"MAJOR":N,"MINOR":N,"SUGGESTION":N},"reviewed_files_count":N,"mode":"default|quick|validate|dry-run","duration_ms":N} +``` + +## Hard rules + +- **Never spawn in background.** This orchestrator file is reference, not + a runnable sub-agent. Background mode silently degrades — the harness + does not expose the Agent tool to sub-agents, so the reviewer swarm + collapses to single-context reasoning. Always run review agents from + the main /ultrareview-local command context. +- **Reviewers run independently.** No cross-feeding of findings. The + coordinator is the only place where reviewer outputs are combined. +- **Coordinator scope is bounded.** Dedup, severity ranking, reasonableness + filter only. No cross-file inference. No synthesis-level hallucination. + Synthesis is a v1.1 candidate — for v1.0 it is forbidden. +- **Brief is the contract.** Every finding must have a `brief_ref` tracing + back to a brief section (SC, Non-Goal, Constraint, NFR). Findings without + `brief_ref` are MISSING_BRIEF_REF (MAJOR). +- **No silent drops.** Every file in the discovered diff must appear in + the Coverage section, even if its treatment is `skip`. Hidden truncation + is COVERAGE_SILENT_SKIP (MAJOR). +- **Cost:** Use Sonnet for all sub-agents. The orchestrator (the + /ultrareview-local command itself) runs on Opus. +- **Privacy:** Never log secrets, tokens, or credentials. Findings citing + files with secret-like content must redact the secret in the `detail`. +- **Honesty:** If the diff is trivially small or all-skip, say so. Do + not pad findings to make the review look thorough. +- **Block-style YAML for findings list.** The frontmatter parser does not + support flow-style arrays. `findings: [a, b]` is broken; use: + ```yaml + findings: + - + - + ```