ultraplan-local
Voyage — a contract-driven Claude Code pipeline: brief, research, plan, execute, review, continue. Deep implementation planning and research with specialized agent swarms, external research, adversarial review, session decomposition, disciplined execution, and headless support.
Design principle: Context Engineering — build the right context by orchestrating specialized agents. Each step in the pipeline (brief → research → plan → execute) produces a structured artifact that the next step consumes.
v3.0.0: architect step extracted from this plugin. The plan command still auto-discovers `architecture/overview.md` if present, so any compatible producer plugs into the same slot (the architect plugin itself is no longer publicly distributed). See CHANGELOG.md for migration history.
Commands
| Command | Description | Model |
|---|---|---|
| `/trekbrief` | Brief: interactive interview produces a task brief with explicit research plan; optionally orchestrates the pipeline | opus |
| `/trekresearch` | Research: deep local + external research, produces structured research brief | opus |
| `/trekplan` | Plan: brief-reviewer, explore, plan, review. Requires `--brief` or `--project`. Auto-discovers `architecture/overview.md` if present | opus |
| `/trekexecute` | Execute: disciplined plan/session-spec executor with failure recovery | opus |
| `/trekreview` | Review: independent post-hoc review of delivered code against the brief. Produces `review.md` with severity-tagged findings (Handover 6) | opus |
| `/trekcontinue` | Continue: resumes the next session of a multi-session voyage project. Reads `.session-state.local.json` (Handover 7) and immediately begins executing | opus |
| `/trekendsession` | End-session: mark the current session complete and write session-state pointing at the next session. Helper for informal multi-session flows | sonnet |
/ultrabrief-local modes
| Flag | Behavior |
|---|---|
| (default) | Dynamic interview until quality gates pass → brief.md with research plan |
| `--quick` | Compact start; still escalates if required sections are weak or the brief-review gate fails → brief.md with research plan |
| `--gates {open\|closed\|adaptive}` | (v3.4.0) Autonomy-checkpoint policy. Default adaptive |
Always interactive. Phase 3 is a section-driven completeness loop (no hard cap on question count); Phase 4 runs a brief-reviewer stop-gate with max 3 review iterations. After writing the brief, asks the user to choose manual (print commands) or auto (Claude runs research + plan in foreground).
/ultraresearch-local modes
| Flag | Behavior |
|---|---|
| (default) | Interview + research (local + external) + synthesis + brief (foreground) |
| `--project <dir>` | Write brief to `{dir}/research/{NN}-{slug}.md` (auto-incremented) |
| `--quick` | Interview (short) + inline research (no agent swarm) |
| `--local` | Only codebase analysis agents (skip external + Gemini) |
| `--external` | Only external research agents (skip codebase analysis) |
| `--fg` | No-op alias (foreground is default since v2.4.0) |
| `--gates {open\|closed\|adaptive}` | (v3.4.0) Autonomy-checkpoint policy. Default adaptive |
Flags combine: --project <dir> --local, --external --quick.
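The auto-incremented `{NN}` in `{dir}/research/{NN}-{slug}.md` can be pictured with a small pure function. This is only a sketch: the function name `nextResearchPath`, the two-digit zero padding, and passing the directory listing in as an array are assumptions for illustration, not the plugin's actual code.

```javascript
// Hypothetical helper: derive the next research brief path from existing file names.
// Assumes NN is a zero-padded two-digit counter starting at 01 (not confirmed by this README).
function nextResearchPath(projectDir, existingFiles, slug) {
  let highest = 0;
  for (const name of existingFiles) {
    // Only files shaped like "NN-something.md" take part in the count.
    const match = /^(\d+)-.+\.md$/.exec(name);
    if (match) highest = Math.max(highest, Number(match[1]));
  }
  const nn = String(highest + 1).padStart(2, "0");
  return `${projectDir}/research/${nn}-${slug}.md`;
}

console.log(nextResearchPath(".claude/projects/2026-01-01-demo", ["01-auth.md", "02-cache.md"], "rate-limits"));
```

Keeping the function free of I/O mirrors the "pure parsers" approach the README describes for `lib/parsers/`; a caller would read the directory and pass the names in.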
/ultraplan-local modes
| Flag | Behavior |
|---|---|
| `--project <dir>` | Required path A: read `{dir}/brief.md`, auto-discover `{dir}/research/*.md`, write `{dir}/plan.md` |
| `--brief <path>` | Required path B: plan from a specific brief file; write to `.claude/plans/ultraplan-{date}-{slug}.md` |
| `--research <brief> [brief2]` | Enrich with extra research briefs beyond what is in `{project_dir}/research/` |
| `--fg` | No-op alias (foreground is default since v2.4.0) |
| `--quick` | Plan directly (no agent swarm) |
| `--export <pr\|issue\|markdown\|headless> <plan>` | Generate shareable output from existing plan |
| `--decompose <plan>` | Split plan into self-contained headless sessions |
| `--gates {open\|closed\|adaptive}` | (v3.4.0) Autonomy-checkpoint policy. Default adaptive |
Breaking change (v2.0): one of --brief or --project is required. There is no interview inside /ultraplan-local. The --spec flag has been removed — use /ultrabrief-local to produce a brief instead.
If {project_dir}/architecture/overview.md exists (typically produced by an opt-in upstream architect plugin, not bundled), the plan command auto-discovers it and treats cc_features_proposed as priors. Missing file is fine — discovery is additive, not required.
/ultraexecute-local modes
| Flag | Behavior |
|---|---|
| (default) | Execute plan — auto-detects Execution Strategy for multi-session |
| `--project <dir>` | Read `{dir}/plan.md`, write `{dir}/progress.json` |
| `--resume` | Resume from last progress checkpoint |
| `--dry-run` | Validate plan structure without executing |
| `--validate` | Schema-only check: parse steps + manifests, report READY \| FAIL, no execution |
| `--step N` | Execute only step N |
| `--fg` | Force foreground: run all steps sequentially, ignore Execution Strategy |
| `--session N` | Execute only session N from plan's Execution Strategy |
| `--gates {open\|closed\|adaptive}` | (v3.4.0) Autonomy-checkpoint policy. Default adaptive |
/ultrareview-local modes
| Flag | Behavior |
|---|---|
| (default) | Run brief-conformance + code-correctness reviewers in parallel, coordinator dedup + verdict, write {project_dir}/review.md |
| `--project <dir>` | Required. Path to ultraplan-local project folder containing `brief.md`. Review is written to `{dir}/review.md` |
| `--since <ref>` | Override "before" SHA for the diff range. Validated via `git rev-parse --verify` |
| `--quick` | Skip brief-conformance reviewer; skip coordinator's reasonableness filter. Fast correctness-only pass |
| `--validate` | Schema-only check on existing `{dir}/review.md`. No LLM calls |
| `--dry-run` | Print discovered scope + triage map; skip writes |
| `--fg` | No-op alias (foreground is default) |
The triage gate is deterministic: a path-pattern classifier maps each file to deep-review, summary-only, or skip. Above 100 files / 100K diff tokens the command refuses to run and prints a suggestion instead.
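A deterministic triage gate of this kind could look roughly like the sketch below. The two limits come from the text above; the path patterns, the `triage` function name, the result shape, and the wording of the suggestion are all assumptions for illustration, since the real classifier rules are not shown here.

```javascript
// Hypothetical path patterns; the plugin's real classifier rules are not shown in this README.
const TRIAGE_RULES = [
  { pattern: /(^|\/)(node_modules|dist|vendor)\//, verdict: "skip" },
  { pattern: /\.(lock|snap|min\.js)$/, verdict: "skip" },
  { pattern: /\.(md|txt|json|ya?ml)$/, verdict: "summary-only" },
];

const MAX_FILES = 100;          // from the README: refuse above 100 files
const MAX_DIFF_TOKENS = 100000; // from the README: refuse above 100K diff tokens

function triage(files, diffTokens) {
  // Refuse up front when the diff is too large to review meaningfully.
  if (files.length > MAX_FILES || diffTokens > MAX_DIFF_TOKENS) {
    return { refused: true, suggestion: "Narrow the diff range, for example with --since", map: {} };
  }
  const map = {};
  for (const file of files) {
    // First matching rule wins; anything unmatched gets a deep review.
    const rule = TRIAGE_RULES.find((r) => r.pattern.test(file));
    map[file] = rule ? rule.verdict : "deep-review";
  }
  return { refused: false, map };
}

console.log(triage(["src/index.mjs", "README.md", "dist/bundle.js", "yarn.lock"], 5000).map);
```

Because the verdict depends only on the path string and two counters, the same input always yields the same triage map, which is what makes the gate deterministic.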
Agents
| Agent | Model | Role |
|---|---|---|
| planning-orchestrator | opus | Inline reference documentation for the planning pipeline workflow (brief-driven) |
| research-orchestrator | opus | Inline reference documentation for the research pipeline workflow |
| review-orchestrator | opus | Inline reference documentation for the review pipeline workflow |
| architecture-mapper | sonnet | Codebase structure, tech stack, patterns |
| dependency-tracer | sonnet | Import chains, data flow, side effects |
| task-finder | sonnet | Task-relevant files, functions, reuse candidates |
| risk-assessor | sonnet | Risks, edge cases, failure modes |
| test-strategist | sonnet | Test patterns, coverage gaps, strategy |
| git-historian | sonnet | Recent changes, ownership, hot files |
| research-scout | sonnet | External docs for unfamiliar tech (conditional, planning only) |
| convention-scanner | sonnet | Coding conventions: naming, style, error handling, test patterns |
| brief-reviewer | sonnet | Task brief quality (5 dimensions: completeness, consistency, testability, scope clarity, research plan validity) |
| brief-conformance-reviewer | sonnet | Brief conformance review (SC + Non-Goal traceability) |
| code-correctness-reviewer | sonnet | Code correctness review (7 dimensions) |
| review-coordinator | sonnet | Judge Agent — dedup + reasonableness filter + verdict |
| plan-critic | sonnet | Adversarial plan review (9 dimensions) |
| scope-guardian | sonnet | Scope alignment (creep + gaps) |
| session-decomposer | sonnet | Splits plans into headless sessions with dependency graph |
| docs-researcher | sonnet | Official documentation, RFCs, vendor docs (Tavily, MS Learn) |
| community-researcher | sonnet | Community experience: issues, blogs, discussions |
| security-researcher | sonnet | CVEs, audit history, supply chain risks |
| contrarian-researcher | sonnet | Counter-evidence, overlooked alternatives |
| gemini-bridge | sonnet | Gemini Deep Research second opinion (conditional) |
Quality infrastructure (v3.4.0)
lib/ contains zero-dep validators, parsers, and autonomy primitives wired into the four commands:
- `lib/util/{frontmatter,result,atomic-write,autonomy-gate}.mjs`: shared YAML-frontmatter parser + Result helpers + `atomicWriteJson(path, obj)` for tmp+rename writes + autonomy-gate state machine (v3.4.0)
- `lib/parsers/{plan-schema,manifest-yaml,project-discovery,arg-parser,bash-normalize,jaccard,finding-id}.mjs`: pure parsers (no I/O), unit-tested. `manifest-yaml` extended in v3.4.0 with additive `skip_commit_check` + `memory_write` flags (forward-compat: unknown keys ignored)
- `lib/review/{rule-catalogue,plan-review-dedup}.mjs`: version-pinned rule catalogue (12 keys) + Phase 9 inline dedup helpers (v3.4.0)
- `lib/stats/event-emit.mjs`: single-source stats event emitter for autonomy-gate transitions and main-merge-gate (v3.4.0)
- `lib/validators/{brief,research,plan,progress,session-state}-validator.mjs`: schema validators with CLI shims (`node lib/validators/X.mjs --json <path>`)
- `lib/validators/architecture-discovery.mjs`: drift-WARN external-contract discovery for `architecture/overview.md`
Wiring points (replaces previous prose-grep instructions):
- `/ultrabrief-local` Phase 4g → `brief-validator` (post-write sanity check)
- `/ultraplan-local` Phase 1 → `brief-validator --soft`, `research-validator --dir`, `architecture-discovery`
- `planning-orchestrator` Phase 5.5 → `plan-validator --strict` (replaces 3 `grep -cE` calls)
- `/ultraexecute-local --validate` → `plan-validator --strict` + `progress-validator`
Tests under tests/**/*.test.mjs (~290 tests, 0 deps). npm test is the fork-readiness gate. v3.4.0 adds: synthetic determinism fixtures (tests/synthetic/plan-run-*.md + review-run-*.md + companion *-determinism.test.mjs enforcing Jaccard ≥ 0.833 SC7 floor) and hook baseline regression pins (tests/hooks/{path-guard,bash-guard}.test.mjs exercising pre-write-executor.mjs + pre-bash-executor.mjs denylist BLOCK paths).
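For readers unfamiliar with the Jaccard floor mentioned above, here is a minimal sketch of the comparison. The formula is the standard one (size of the intersection divided by size of the union); what the real tests compare, the finding IDs used below, and the empty-set convention are assumptions for illustration.

```javascript
// Jaccard similarity of two collections treated as sets: |A ∩ B| / |A ∪ B|.
function jaccard(a, b) {
  const setA = new Set(a);
  const setB = new Set(b);
  // Convention chosen here (an assumption): two empty sets count as identical.
  if (setA.size === 0 && setB.size === 0) return 1;
  let intersection = 0;
  for (const item of setA) if (setB.has(item)) intersection += 1;
  const union = setA.size + setB.size - intersection;
  return intersection / union;
}

const SC7_FLOOR = 0.833; // floor quoted in the README

// Hypothetical finding IDs from two runs over the same fixture.
const runA = ["F1", "F2", "F3", "F4", "F5"];
const runB = ["F1", "F2", "F3", "F4", "F5", "F6"];
console.log(jaccard(runA, runB) >= SC7_FLOOR); // 5 shared out of 6 total is about 0.8333
```

With these inputs, one extra item in a six-item union sits just above the floor, so a second disagreement would fail the check.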
Doc-consistency test at tests/lib/doc-consistency.test.mjs pins agent-table count, command-table coverage, plan_version invariant, settings.json scope cleanliness, Handover 7 presence, and session-state-validator CLI shim.
docs/HANDOVER-CONTRACTS.md is the single source of truth for the 7 pipeline handovers (brief→research, research→plan, architecture→plan EXTERNAL, plan→execute, progress.json resume, review→plan, .session-state.local.json). Read it before changing any artifact format.
hooks/scripts/pre-compact-flush.mjs (PreCompact event, CC v2.1.105+) fixes the documented P0 in docs/ultraexecute-v2-observations-from-config-audit-v4.md: keeps progress.json in sync with git history before context compaction so --resume works after long conversations. Atomic write, monotonic only, never blocks compaction.
hooks/scripts/session-title.mjs (UserPromptSubmit, CC v2.1.94+) sets sessionTitle to ultra:<command>:<slug> for ultra-command invocations. Helps multi-session headless runs identify themselves in process lists.
hooks/scripts/post-bash-stats.mjs (PostToolUse, CC v2.1.97+) appends duration_ms for each Bash call into ${CLAUDE_PLUGIN_DATA}/ultraexecute-stats.jsonl. Useful for finding long-running verify or checkpoint commands.
hooks/scripts/post-compact-flush.mjs (PostCompact event, v3.4.0) re-injects .session-state.local.json after context compaction so multi-session work survives a compaction boundary. Companion to pre-compact-flush.mjs (which writes the state file before compaction); together they form the rehydrate cycle that keeps /ultracontinue-local reliable across long-running multi-session work.
Autonomy mode (--gates, v3.4.0)
All four pipeline commands accept --gates {open|closed|adaptive}:
| Value | Behavior |
|---|---|
| `open` | Skip optional checkpoints; trust manifests + verify gates only |
| `closed` | Stop at every autonomy boundary; operator confirms each transition |
| `adaptive` (default) | Stop only at meaningful boundaries (manifest-audit FAIL, plan-critic BLOCKER, main-merge gate) |
Under the hood: lib/util/autonomy-gate.mjs runs the state machine idle → approved → executing → merge-pending → main-merged. lib/stats/event-emit.mjs records each transition to ${CLAUDE_PLUGIN_DATA}/ultra*-stats.jsonl. The main-merge gate is the final autonomy boundary before HEAD lands on main.
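The linear state machine named above can be sketched in a few lines. Only the state order is taken from this README; the `createGate` factory, the `advance` method, and the callback standing in for the stats emitter are hypothetical and may not match `lib/util/autonomy-gate.mjs`.

```javascript
// State order taken from the README; everything else here is a hypothetical sketch.
const STATES = ["idle", "approved", "executing", "merge-pending", "main-merged"];

function createGate(onTransition = () => {}) {
  let index = 0;
  return {
    get state() {
      return STATES[index];
    },
    // Advance to the given state; only the immediate successor is a legal target.
    advance(to) {
      const from = STATES[index];
      const next = STATES[index + 1];
      if (to !== next) {
        throw new Error(`Illegal transition ${from} -> ${to}`);
      }
      index += 1;
      onTransition({ from, to }); // stand-in for recording the transition as a stats event
    },
  };
}

const events = [];
const gate = createGate((event) => events.push(event));
gate.advance("approved");
gate.advance("executing");
console.log(gate.state, events.length);
```

Rejecting any target other than the direct successor means a run cannot jump from `executing` to `main-merged` without passing through `merge-pending`, which is where the main-merge gate sits.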
Path A/B/C decision (v3.4.0; Path C closed 2026-05-05)
Three architectural options were considered for the speedup work:
- Path A, cache-first (drop `--allowedTools` per child to recover cross-phase cache sharing): REJECTED. Inverts the security model; plugin hooks don't fire reliably in `claude -p` (research/06 GH #36071).
- Path B, sequential `--no-ff` parallel waves with manifest-driven failure recovery: CHOSEN. Ships in v3.4.0. Phase 2.6 of `/ultraexecute-local` runs the wave executor with hardenings for plugin-in-monorepo + gitignored-state topology.
- Path C, hybrid (cache-warm sentinel + identical-tool parallel): CLOSED 2026-05-05. Q3 experiment measured median `cache_creation_input_tokens` = 163,903 across 3 fork-children at 186K parent context (CC v2.1.128, Sonnet 4.6). Master-plan thresholds: ≤ 1,500 POSITIVE / ≥ 3,500 NEGATIVE. Result is solidly NEGATIVE: `CLAUDE_CODE_FORK_SUBAGENT` does not preserve cache prefix across identical-tool children at our context size. Path C migration is deferred indefinitely; reassessment is appropriate when CC v2.2.xxx ships fork-cache-relevant features. Harness: `scripts/q3-cache-prefix-experiment.mjs`. Companion analyser: `lib/stats/cache-analyzer.mjs`.
A revived Path C (post-v2.2.xxx) would require: (1) re-architecting tool-list to be identical across all wave children, (2) cache-telemetry analysis confirming the new fork-cache behaviour holds, (3) prompt-level deny re-enablement to compensate for tool scoping rollback.
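The Path C verdict follows from comparing a median against two thresholds. A rough sketch of that comparison follows; the thresholds and the reported median are from the text above, while the function names and the "INCONCLUSIVE" label for the band between the thresholds are assumptions (the README does not say how that band is treated, and this is not the code in `lib/stats/cache-analyzer.mjs`).

```javascript
// Median of a non-empty list of numbers.
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Thresholds from the README: <= 1,500 POSITIVE, >= 3,500 NEGATIVE.
// Calling the band in between "INCONCLUSIVE" is an assumption.
function classifyCacheResult(cacheCreationTokens) {
  const m = median(cacheCreationTokens);
  if (m <= 1500) return { median: m, verdict: "POSITIVE" };
  if (m >= 3500) return { median: m, verdict: "NEGATIVE" };
  return { median: m, verdict: "INCONCLUSIVE" };
}

// The reported median, passed as a single sample (per-child values are not given in this README).
console.log(classifyCacheResult([163903]));
```

At 163,903 tokens the measurement is roughly 47 times the NEGATIVE threshold, which is why the result is described as solidly negative rather than borderline.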
Architecture
Brief: 7-phase workflow: Parse mode → Create project dir → Phase 3 completeness loop (section-driven, no question cap) → Phase 4 draft/review/revise with brief-reviewer as stop-gate (max 3 iterations; gate = all dimensions ≥ 4 and research plan = 5) → Finalize (brief.md on pass, or brief_quality: partial on cap/force-stop) → Manual/auto opt-in → Stats. Always interactive. Auto mode runs research + plan inline in the main context (v2.4.0).
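The Phase 4 stop-gate can be read as a bounded loop around a pass/fail predicate. The sketch below assumes a score object keyed by the five brief-reviewer dimensions; the key names, the `runReviewLoop` function, and the `reviewFn`/`reviseFn` callbacks (stand-ins for the brief-reviewer agent and the revise step) are hypothetical.

```javascript
const MAX_REVIEW_ITERATIONS = 3; // from the README: max 3 review iterations

// Gate rule from the README: every dimension >= 4 and research plan = 5.
// The score object shape and key names are assumptions for illustration.
function passesGate(scores) {
  const allHighEnough = Object.values(scores).every((score) => score >= 4);
  return allHighEnough && scores.researchPlan === 5;
}

// Review, and revise between reviews, until the gate passes or the cap is hit.
function runReviewLoop(draft, reviewFn, reviseFn) {
  let current = draft;
  for (let i = 1; i <= MAX_REVIEW_ITERATIONS; i += 1) {
    const scores = reviewFn(current);
    if (passesGate(scores)) return { brief: current, passed: true, iterations: i };
    if (i < MAX_REVIEW_ITERATIONS) current = reviseFn(current, scores);
  }
  // Cap reached: the brief is still written, flagged as partial quality.
  return { brief: current, passed: false, briefQuality: "partial", iterations: MAX_REVIEW_ITERATIONS };
}

// Toy reviewer: the first draft is weak on completeness, any revision scores 5 everywhere.
const weak = { completeness: 3, consistency: 4, testability: 4, scopeClarity: 4, researchPlan: 4 };
const strong = { completeness: 5, consistency: 5, testability: 5, scopeClarity: 5, researchPlan: 5 };
const outcome = runReviewLoop(
  { revision: 0 },
  (draft) => (draft.revision === 0 ? weak : strong),
  (draft) => ({ revision: draft.revision + 1 })
);
console.log(outcome.passed, outcome.iterations);
```

Note that a brief scoring 4 on every dimension still fails, because the research plan dimension has the stricter requirement of exactly 5.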
Research: Foreground workflow (v2.4.0): Parse mode → Interview → Parallel research swarm (5 local + 4 external + 1 bridge, spawned from main context) → Follow-ups → Triangulation → Synthesis + brief → Stats. With --project, writes to {dir}/research/NN-slug.md.
Plan: Foreground workflow (v2.4.0): Parse mode (validate brief input) → Codebase sizing → Brief review (brief-reviewer) → Parallel exploration (6-8 agents, spawned from main context) → Deep-dives → Synthesis (with architecture-note cross-reference if present) → Planning → Adversarial review (plan-critic + scope-guardian) → Present/refine → Handoff. With --project, writes to {dir}/plan.md and auto-detects {dir}/architecture/overview.md (produced by an opt-in upstream architect plugin if installed; not bundled).
Decompose: Parse plan → Analyze step dependencies → Group into sessions → Identify parallel waves → Generate session specs + dependency graph + launch script.
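Identifying parallel waves from a dependency graph is essentially topological layering. The following is a sketch of that idea under assumed input (`{ id, dependsOn }` objects); it is not the session-decomposer agent's actual logic, and the `planWaves` name is hypothetical.

```javascript
// Group sessions into waves: a session joins the first wave in which all of its
// dependencies were scheduled in an earlier wave. Input shape is hypothetical.
function planWaves(sessions) {
  const remaining = new Map(sessions.map((s) => [s.id, new Set(s.dependsOn)]));
  const done = new Set();
  const waves = [];
  while (remaining.size > 0) {
    // Readiness is judged against earlier waves only, so members of one wave
    // never depend on each other and can run in parallel.
    const ready = [...remaining.keys()].filter((id) =>
      [...remaining.get(id)].every((dep) => done.has(dep))
    );
    if (ready.length === 0) throw new Error("Dependency cycle or unknown dependency");
    for (const id of ready) {
      remaining.delete(id);
      done.add(id);
    }
    waves.push(ready);
  }
  return waves;
}

const waves = planWaves([
  { id: "s1", dependsOn: [] },
  { id: "s2", dependsOn: [] },
  { id: "s3", dependsOn: ["s1", "s2"] },
  { id: "s4", dependsOn: ["s3"] },
]);
console.log(JSON.stringify(waves));
```

Each inner array is one wave: `s1` and `s2` have no dependencies and can start together, while `s3` and `s4` each wait for the previous wave to finish.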
Execute: Parse plan → Security scan (Phase 2.4) → Detect Execution Strategy → Single-session (step loop) or multi-session (parallel waves via claude -p with scoped --allowedTools) → Phase 7.5 manifest audit → Phase 7.6 bounded recovery (if partial) → Phase 8 atomically writes progress.json + .session-state.local.json (Handover 7) → Report. With --project, reads {dir}/plan.md. Phase 2.55 (pre-flight stop) and Phase 4 (entry-condition stop) also write .session-state.local.json so /ultracontinue can surface the stop and prompt for next steps.
Continue: /ultracontinue-local reads {dir}/.session-state.local.json (Handover 7), validates schema-v1 via session-state-validator, narrates a 3-line summary (project / next-session-label / brief-path), and immediately begins executing the next session. Auto-discovers active project state files under .claude/projects/*/.session-state.local.json if no explicit <project-dir> argument. Operator-invoked only — never auto-loaded via SessionStart. The /ultraplan-end-session-local helper is the informal-flow producer: writes the same state file for ad-hoc multi-session handovers that don't run through /ultraexecute-local.
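To make the validate-then-narrate step concrete, here is a sketch of a forward-compatible check followed by the 3-line summary. The key names (`schema_version`, `project`, `next_session_label`, `brief_path`) are guesses made for illustration; the authoritative schema is whatever `docs/HANDOVER-CONTRACTS.md` and `session-state-validator` define.

```javascript
// Key names below are hypothetical; consult docs/HANDOVER-CONTRACTS.md for the real schema.
const REQUIRED_KEYS = ["schema_version", "project", "next_session_label", "brief_path"];

function validateSessionState(state) {
  const errors = [];
  for (const key of REQUIRED_KEYS) {
    if (!(key in state)) errors.push(`missing key: ${key}`);
  }
  if ("schema_version" in state && state.schema_version !== 1) {
    errors.push(`unsupported schema_version: ${state.schema_version}`);
  }
  // Unknown top-level keys are deliberately not reported (forward compatibility).
  return errors.length === 0 ? { ok: true, value: state } : { ok: false, errors };
}

// Three-line summary: project / next-session-label / brief-path.
function summarize(state) {
  return [
    `project: ${state.project}`,
    `next session: ${state.next_session_label}`,
    `brief: ${state.brief_path}`,
  ].join("\n");
}

const sample = {
  schema_version: 1,
  project: ".claude/projects/2026-01-01-demo",
  next_session_label: "session-2",
  brief_path: ".claude/projects/2026-01-01-demo/brief.md",
  added_by_future_version: true, // unknown key, ignored by the check
};
const checked = validateSessionState(sample);
if (checked.ok) console.log(summarize(checked.value));
```

Ignoring unknown keys rather than rejecting them lets a newer producer add fields without breaking an older consumer.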
Security: 4-layer defense-in-depth: plugin hooks (pre-bash-executor, pre-write-executor), prompt-level denylist (works in headless sessions), pre-execution plan scan (Phase 2.4), scoped --allowedTools replacing --dangerously-skip-permissions. Hard Rules 14-16 enforce verify command security, repo-boundary writes, and sensitive path protection.
Pipeline: /ultrabrief-local produces the task brief. /ultraresearch-local --project <dir> fills in {dir}/research/. /ultraplan-local --project <dir> reads brief + research to produce {dir}/plan.md (and auto-discovers {dir}/architecture/overview.md if an opt-in upstream architect plugin produced one). /ultraexecute-local --project <dir> executes and writes {dir}/progress.json. All artifacts live in one project directory.
Project-directory contract (v3.0.0): ultraplan-local owns the directory layout below. The architecture/ subdirectory is opt-in and is not produced by this plugin: the upstream architect plugin is not bundled and is no longer publicly distributed, but the architecture/overview.md slot remains available for any compatible producer.
    .claude/projects/{YYYY-MM-DD}-{slug}/
      brief.md          ← ultrabrief-local writes; everyone reads
      research/*.md     ← ultraresearch-local writes; plan + architect read
      architecture/     ← OPT-IN, owned by an opt-in upstream architect plugin (not bundled)
        overview.md
        gaps.md
      plan.md           ← ultraplan-local writes; ultraexecute reads
      progress.json     ← ultraexecute-local writes
No code-level dependency between plugins — the contract is filesystem-level only.
State
All artifacts in one project directory (default):
- Project root: `.claude/projects/{YYYY-MM-DD}-{slug}/`
  - `brief.md` (task brief from `/ultrabrief-local`)
  - `research/{NN}-{slug}.md` (research briefs from `/ultraresearch-local --project`)
  - `architecture/overview.md` + `architecture/gaps.md` (opt-in, produced by an opt-in upstream architect plugin, not bundled)
  - `plan.md` (from `/ultraplan-local --project`)
  - `sessions/session-*.md` (from `--decompose`)
  - `progress.json` (from `/ultraexecute-local --project`)
  - `review.md` (from `/ultrareview-local --project`)
  - `.session-state.local.json` (Handover 7; gitignored via `*.local.json`; written by `/ultraexecute-local` Phase 8/2.55/4 or `/ultraplan-end-session-local`; read by `/ultracontinue`)
Legacy paths (still work without --project):
- Research briefs: `.claude/research/ultraresearch-{date}-{slug}.md`
- Plans: `.claude/plans/ultraplan-{date}-{slug}.md`
- Sessions: `.claude/ultraplan-sessions/{slug}/session-*.md`
- Launch scripts: `.claude/ultraplan-sessions/{slug}/launch.sh`
- Progress: `{plan-dir}/.ultraexecute-progress-{slug}.json`
Stats:
- Brief stats: `${CLAUDE_PLUGIN_DATA}/ultrabrief-stats.jsonl`
- Plan stats: `${CLAUDE_PLUGIN_DATA}/ultraplan-stats.jsonl`
- Exec stats: `${CLAUDE_PLUGIN_DATA}/ultraexecute-stats.jsonl`
- Research stats: `${CLAUDE_PLUGIN_DATA}/ultraresearch-stats.jsonl`
- Continue stats: `${CLAUDE_PLUGIN_DATA}/ultracontinue-stats.jsonl`
Terminology
- Task brief: produced by `/ultrabrief-local`. Declares intent, goal, and research plan. Drives planning.
- Research brief: produced by `/ultraresearch-local`. Answers a specific research question. Feeds planning.
- Architecture note: opt-in, produced by an opt-in upstream architect plugin (not bundled; the architect plugin is no longer publicly distributed, but the `architecture/overview.md` filesystem slot remains available for any compatible producer). Proposes which Claude Code features fit the task with brief-anchored rationale + explicit gaps. When present, enriches planning.
- Review: produced by `/ultrareview-local`. Independent post-hoc review of delivered code against the task brief. Handover 6 (review → plan) routes BLOCKER + MAJOR findings into `/ultraplan-local --brief review.md` for a remediation plan. The plan's optional `source_findings:` frontmatter list is the audit trail back to the consumed findings. MINOR + SUGGESTION are skipped for v1.0 plan-input.
- Session state: `.session-state.local.json` per project. Handover 7, produced by any session-end mechanism (`/ultraexecute-local` Phase 8/2.55/4, `/ultraplan-end-session-local` helper, future graceful-handoff v2.2). Consumed by `/ultracontinue-local` to resume the next session in a fresh chat. Schema v1 is forward-compat (unknown top-level keys ignored). Never committed (gitignored via `*.local.json`).
A project typically has 1 task brief, 0–N research briefs, 0 or 1 architecture note, 0–N reviews (one per review iteration), and 0 or 1 session-state file (overwritten on every session-end).
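The Handover 6 routing rule described under "Review" amounts to a severity filter plus an audit trail. A minimal sketch, assuming findings are objects with `id` and `severity` fields (that shape, the sample IDs, and the `selectPlanInput` name are hypothetical):

```javascript
// Severities routed into a remediation plan per the README: BLOCKER and MAJOR.
// MINOR and SUGGESTION are skipped for plan input.
const ROUTED_SEVERITIES = new Set(["BLOCKER", "MAJOR"]);

// The finding shape ({ id, severity, title }) is an assumption for illustration.
function selectPlanInput(findings) {
  const routed = findings.filter((finding) => ROUTED_SEVERITIES.has(finding.severity));
  return {
    findings: routed,
    // IDs that would populate the plan's source_findings: frontmatter list.
    sourceFindings: routed.map((finding) => finding.id),
  };
}

const planInput = selectPlanInput([
  { id: "R-001", severity: "BLOCKER", title: "Success criterion not met" },
  { id: "R-002", severity: "MINOR", title: "Naming nit" },
  { id: "R-003", severity: "MAJOR", title: "Unhandled error path" },
  { id: "R-004", severity: "SUGGESTION", title: "Consider extracting a helper" },
]);
console.log(planInput.sourceFindings);
```

Carrying the IDs forward is what lets a remediation plan be traced back to the exact findings it consumed.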