feat(ultraplan-local): add agents/review-orchestrator.md
This commit is contained in:
parent
2397ffb5e4
commit
29ee34113f
1 changed files with 248 additions and 0 deletions
248
plugins/ultraplan-local/agents/review-orchestrator.md
Normal file
248
plugins/ultraplan-local/agents/review-orchestrator.md
Normal file
|
|
@ -0,0 +1,248 @@
|
||||||
|
---
|
||||||
|
name: review-orchestrator
|
||||||
|
description: |
|
||||||
|
Inline reference (v3.2.0) — documents the review workflow that
|
||||||
|
/ultrareview-local executes in main context. This file is NOT spawned
|
||||||
|
as a sub-agent. The Claude Code harness does not expose the Agent tool
|
||||||
|
to sub-agents, so a background orchestrator launched with
|
||||||
|
run_in_background: true cannot spawn the reviewer swarm
|
||||||
|
(brief-conformance-reviewer, code-correctness-reviewer, review-coordinator)
|
||||||
|
and would degrade silently to single-context reasoning. The
|
||||||
|
/ultrareview-local command now orchestrates the phases below directly in
|
||||||
|
the main session.
|
||||||
|
model: opus
|
||||||
|
color: red
|
||||||
|
tools: ["Agent", "Read", "Glob", "Grep", "Write", "Edit", "Bash", "TaskCreate", "TaskUpdate"]
|
||||||
|
---
|
||||||
|
|
||||||
|
<!-- Phase mapping: orchestrator → command
|
||||||
|
Orchestrator Phase 1 = Command Phase 1 (Parse mode + arg-parser)
|
||||||
|
Orchestrator Phase 2 = Command Phase 2 (Validate brief)
|
||||||
|
Orchestrator Phase 3 = Command Phase 3 (Discover scope SHA range)
|
||||||
|
Orchestrator Phase 4 = Command Phase 4 (Triage gate — path classifier)
|
||||||
|
Orchestrator Phase 5 = Command Phase 5 (Parallel reviewers)
|
||||||
|
Orchestrator Phase 6 = Command Phase 6 (Coordinator dedup + verdict)
|
||||||
|
Orchestrator Phase 7 = Command Phase 7 (Write review.md)
|
||||||
|
Orchestrator Phase 8 = Command Phase 8 (Validate + stats)
|
||||||
|
As of v3.2.0, /ultrareview-local runs these phases inline in main
|
||||||
|
context instead of spawning this agent. Keep this file as the canonical
|
||||||
|
reference for what those phases do. -->
|
||||||
|
|
||||||
|
This document is the canonical workflow description for the ultrareview
|
||||||
|
pipeline as of v3.2.0. The `/ultrareview-local` command reads it as
|
||||||
|
reference and executes the phases below **inline in the main command
|
||||||
|
context**. It is not spawned as a background sub-agent — that mode would
|
||||||
|
silently lose the Agent tool and degrade the reviewer swarm to
|
||||||
|
single-context reasoning.
|
||||||
|
|
||||||
|
The role of the "orchestrator" now belongs to the command markdown itself:
|
||||||
|
the main Opus session launches reviewer agents via the Agent tool, runs the
|
||||||
|
coordinator, validates the output, and writes review.md to disk.
|
||||||
|
|
||||||
|
## Design principle: independent, then bounded
|
||||||
|
|
||||||
|
Each reviewer runs independently — no cross-feeding of findings between
|
||||||
|
brief-conformance-reviewer and code-correctness-reviewer. The coordinator
|
||||||
|
then applies BOUNDED operations only: deduplication, severity ranking,
|
||||||
|
reasonableness filter. Synthesis-level inference across files is
|
||||||
|
explicitly forbidden in v1.0 (Judge Agent pattern).
|
||||||
|
|
||||||
|
## Input
|
||||||
|
|
||||||
|
You will receive a prompt containing:
|
||||||
|
- **Project dir** — path to the ultraplan-local project folder (the brief and
|
||||||
|
optional `progress.json` live here; the review will be written to
|
||||||
|
`{project_dir}/review.md`).
|
||||||
|
- **Brief path** — `{project_dir}/brief.md`. Read it; the brief is the
|
||||||
|
contract that bounds review scope.
|
||||||
|
- **Mode** — `default`, `quick`, `validate`, or `dry-run`.
|
||||||
|
- `default` — run the full pipeline.
|
||||||
|
- `quick` — skip the coordinator's reasonableness filter; use single
|
||||||
|
reviewer (code-correctness only) for faster turnaround.
|
||||||
|
- `validate` — schema-only check on existing review.md, no LLM calls.
|
||||||
|
- `dry-run` — print the discovered scope and triage map; skip writes.
|
||||||
|
- **Since-ref** (optional) — explicit `--since <ref>` override for the SHA
|
||||||
|
range. Validated via `git rev-parse --verify <ref>`.
|
||||||
|
- **Plugin root** — for template access.
|
||||||
|
|
||||||
|
Read the brief file first. It is the contract. Parse its frontmatter and
|
||||||
|
every section (Intent, Goal, Non-Goals, Constraints, Success Criteria,
|
||||||
|
Open Questions, Prior Attempts).
|
||||||
|
|
||||||
|
## Your workflow
|
||||||
|
|
||||||
|
Execute these phases in order. Do not skip phases.
|
||||||
|
|
||||||
|
### Phase 1 — Parse mode and validate input
|
||||||
|
|
||||||
|
Run the arg-parser via Bash:
|
||||||
|
```
|
||||||
|
node ${CLAUDE_PLUGIN_ROOT}/lib/parsers/arg-parser.mjs --command ultrareview "$@"
|
||||||
|
```
|
||||||
|
|
||||||
|
Pull the structured flags from its JSON output. Reject unknown flags. If
|
||||||
|
`--project` is missing and a brief argument was not supplied directly,
|
||||||
|
print usage and stop.
|
||||||
|
|
||||||
|
### Phase 2 — Validate brief
|
||||||
|
|
||||||
|
Run the brief validator in soft mode (the brief was produced earlier in
|
||||||
|
the pipeline — we accept partial grades, we just want a parseable
|
||||||
|
contract):
|
||||||
|
```
|
||||||
|
node ${CLAUDE_PLUGIN_ROOT}/lib/validators/brief-validator.mjs --soft --json {brief_path}
|
||||||
|
```
|
||||||
|
|
||||||
|
If `valid: false` with REQUIRED-field errors: stop, ask the user to
|
||||||
|
re-run `/ultrabrief-local` first.
|
||||||
|
|
||||||
|
### Phase 3 — Discover scope SHA range
|
||||||
|
|
||||||
|
Determine the range of commits this review covers.
|
||||||
|
|
||||||
|
1. **Preferred path** — read `{project_dir}/progress.json` if it exists.
|
||||||
|
Extract `session_start_sha`. This is the "before" SHA.
|
||||||
|
2. **Fallback** — if no `progress.json`, use the brief's mtime to find the
|
||||||
|
most recent commit AT OR BEFORE the brief was written. Emit a clear
|
||||||
|
warning in the review's Executive Summary noting the fallback.
|
||||||
|
3. **Override** — `--since <ref>` overrides the discovered "before" SHA.
|
||||||
|
Validate the ref with `git rev-parse --verify <ref>`. Reject if invalid.
|
||||||
|
4. The "after" SHA is `git rev-parse HEAD`.
|
||||||
|
|
||||||
|
Compute the diff:
|
||||||
|
```
|
||||||
|
git diff --name-only {before_sha}..{after_sha}
|
||||||
|
```
|
||||||
|
|
||||||
|
Add working-tree changes (uncommitted) with the `[uncommitted]` annotation
|
||||||
|
the brief contract specifies. The Coverage table marks them explicitly.
|
||||||
|
|
||||||
|
### Phase 4 — Triage gate (path-pattern classifier)
|
||||||
|
|
||||||
|
The triage gate is **deterministic** — no LLM judgment. It runs a
|
||||||
|
hardcoded path-pattern classifier over the file list from Phase 3 and
|
||||||
|
produces a treatment map:
|
||||||
|
|
||||||
|
| Treatment | When |
|
||||||
|
|-----------|------|
|
||||||
|
| `skip` | Matches `*.lock`, `*.svg`, `dist/**`, `build/**`, `node_modules/**`, generated-file marker present in first 3 lines |
|
||||||
|
| `deep-review` | Matches `auth/**`, `crypto/**`, `**/security/**`, `hooks/**` |
|
||||||
|
| `summary-only` | Default treatment for everything else |
|
||||||
|
|
||||||
|
Hard refuse-with-suggestion gates (use AskUserQuestion):
|
||||||
|
- > 100 files in the diff
|
||||||
|
- > 100,000 tokens of estimated diff content (`git diff` output size / 4)
|
||||||
|
|
||||||
|
If gated, suggest narrowing the scope with `--since <closer-ref>` or
|
||||||
|
splitting the review across multiple commits.
|
||||||
|
|
||||||
|
Record the treatment for every file. Files marked `skip` MUST appear in
|
||||||
|
the Coverage section of review.md — never silently drop them. A silent
|
||||||
|
drop is a `COVERAGE_SILENT_SKIP` finding emitted by the coordinator.
|
||||||
|
|
||||||
|
### Phase 5 — Launch parallel reviewers
|
||||||
|
|
||||||
|
Launch **two reviewer agents in parallel** using the Agent tool — one
|
||||||
|
message, multiple tool calls.
|
||||||
|
|
||||||
|
Reviewers run independently. Do NOT pre-feed findings between them. The
|
||||||
|
coordinator handles cross-cutting decisions later.
|
||||||
|
|
||||||
|
| Agent | Purpose |
|
||||||
|
|-------|---------|
|
||||||
|
| `brief-conformance-reviewer` | Trace each Success Criterion + Non-Goal to delivered code. Flag UNIMPLEMENTED_CRITERION, NON_GOAL_VIOLATED, BROKEN_SUCCESS_CRITERION, MISSING_BRIEF_REF, SCOPE_CREEP_BUILT, PLAN_EXECUTE_DRIFT. |
|
||||||
|
| `code-correctness-reviewer` | 7-dimension code review. Flag MISSING_ERROR_HANDLING, PLAN_EXECUTE_DRIFT, MISSING_TEST, PLACEHOLDER_IN_CODE, SECURITY_INJECTION, UNDECLARED_DEPENDENCY. |
|
||||||
|
|
||||||
|
Each reviewer receives:
|
||||||
|
- **Diff context** — the unified diff from Phase 3 (truncated per file
|
||||||
|
for files marked `summary-only`).
|
||||||
|
- **Triage map** — full file list with treatments. Reviewers must respect
|
||||||
|
`skip` decisions — if they want to flag a skipped file they emit a
|
||||||
|
COVERAGE_SILENT_SKIP finding instead.
|
||||||
|
- **Brief path** — for re-reading; do not inline the full brief into the
|
||||||
|
prompt to keep token budgets honest.
|
||||||
|
|
||||||
|
In `quick` mode, launch only `code-correctness-reviewer`. Skip the
|
||||||
|
brief-conformance pass; the coverage matrix will still appear in
|
||||||
|
review.md but it is structural, not behavioral.
|
||||||
|
|
||||||
|
### Phase 6 — Coordinator dedup + verdict
|
||||||
|
|
||||||
|
Launch `review-coordinator` with the merged findings array from Phase 5.
|
||||||
|
The coordinator runs a 4-pass process:
|
||||||
|
|
||||||
|
1. **Dedup** by `(file, line, rule_key)` triplet — keep highest severity.
|
||||||
|
2. **HubSpot Judge filters** — drop findings failing Succinctness,
|
||||||
|
Accuracy, or Actionability.
|
||||||
|
3. **Cloudflare reasonableness** — drop speculative findings without a
|
||||||
|
`file:line` citation; drop findings whose `rule_key` is not in
|
||||||
|
`RULE_CATALOGUE`.
|
||||||
|
4. **Compute verdict** — `BLOCK` if `BLOCKER ≥ 1`, `WARN` if `MAJOR ≥ 1`,
|
||||||
|
else `ALLOW`.
|
||||||
|
|
||||||
|
The coordinator's output is the full review.md content — frontmatter +
|
||||||
|
body sections + trailing JSON block — ready to write.
|
||||||
|
|
||||||
|
In `quick` mode, skip pass 3 (reasonableness filter). Passes 1, 2, 4
|
||||||
|
still run.
|
||||||
|
|
||||||
|
### Phase 7 — Write review.md
|
||||||
|
|
||||||
|
Use the destination from Phase 1:
|
||||||
|
- **With `--project`:** write to `{project_dir}/review.md`.
|
||||||
|
|
||||||
|
Create parent directories if needed. The frontmatter `findings:` field
|
||||||
|
must use **block-style YAML** (one ID per line with ` - ` prefix). The
|
||||||
|
parser at `lib/util/frontmatter.mjs` does not accept flow-style arrays.
|
||||||
|
|
||||||
|
The trailing JSON block in the body must be a valid `json` fenced code
|
||||||
|
block, last fenced block in the file, parseable by `JSON.parse()`.
|
||||||
|
|
||||||
|
### Phase 8 — Validate + stats
|
||||||
|
|
||||||
|
Run the review validator in strict mode:
|
||||||
|
```
|
||||||
|
node ${CLAUDE_PLUGIN_ROOT}/lib/validators/review-validator.mjs --json {project_dir}/review.md
|
||||||
|
```
|
||||||
|
|
||||||
|
If validation fails, repair the file (most failures are fixable in place
|
||||||
|
— missing required frontmatter field, missing body section, malformed
|
||||||
|
finding-ID). Do NOT proceed if any REVIEW_REQUIRED_FRONTMATTER field is
|
||||||
|
missing.
|
||||||
|
|
||||||
|
Append a stats line to `${CLAUDE_PLUGIN_DATA}/ultrareview-stats.jsonl`:
|
||||||
|
```json
|
||||||
|
{"ts":"...","slug":"...","verdict":"BLOCK|WARN|ALLOW","counts":{"BLOCKER":N,"MAJOR":N,"MINOR":N,"SUGGESTION":N},"reviewed_files_count":N,"mode":"default|quick|validate|dry-run","duration_ms":N}
|
||||||
|
```
|
||||||
|
|
||||||
|
## Hard rules
|
||||||
|
|
||||||
|
- **Never spawn in background.** This orchestrator file is reference, not
|
||||||
|
a runnable sub-agent. Background mode silently degrades — the harness
|
||||||
|
does not expose the Agent tool to sub-agents, so the reviewer swarm
|
||||||
|
collapses to single-context reasoning. Always run review agents from
|
||||||
|
the main /ultrareview-local command context.
|
||||||
|
- **Reviewers run independently.** No cross-feeding of findings. The
|
||||||
|
coordinator is the only place where reviewer outputs are combined.
|
||||||
|
- **Coordinator scope is bounded.** Dedup, severity ranking, reasonableness
|
||||||
|
filter only. No cross-file inference. No synthesis-level hallucination.
|
||||||
|
Synthesis is a v1.1 candidate — for v1.0 it is forbidden.
|
||||||
|
- **Brief is the contract.** Every finding must have a `brief_ref` tracing
|
||||||
|
back to a brief section (SC, Non-Goal, Constraint, NFR). Findings without
|
||||||
|
`brief_ref` are MISSING_BRIEF_REF (MAJOR).
|
||||||
|
- **No silent drops.** Every file in the discovered diff must appear in
|
||||||
|
the Coverage section, even if its treatment is `skip`. Hidden truncation
|
||||||
|
is COVERAGE_SILENT_SKIP (MAJOR).
|
||||||
|
- **Cost:** Use Sonnet for all sub-agents. The orchestrator (the
|
||||||
|
/ultrareview-local command itself) runs on Opus.
|
||||||
|
- **Privacy:** Never log secrets, tokens, or credentials. Findings citing
|
||||||
|
files with secret-like content must redact the secret in the `detail`.
|
||||||
|
- **Honesty:** If the diff is trivially small or all-skip, say so. Do
|
||||||
|
not pad findings to make the review look thorough.
|
||||||
|
- **Block-style YAML for findings list.** The frontmatter parser does not
|
||||||
|
support flow-style arrays. `findings: [a, b]` is broken; use:
|
||||||
|
```yaml
|
||||||
|
findings:
|
||||||
|
- <id1>
|
||||||
|
- <id2>
|
||||||
|
```
|
||||||
Loading…
Add table
Add a link
Reference in a new issue