Kjell Tore Guttormsen 47a4ad47d8 feat(voyage)!: rename commands, templates, fixtures for v4.0.0 [skip-docs]

2026-05-05 14:13:44 +02:00

ultraplan-local

Voyage — a contract-driven Claude Code pipeline: brief, research, plan, execute, review, continue. Deep implementation planning and research with specialized agent swarms, external research, adversarial review, session decomposition, disciplined execution, and headless support.

Design principle: Context Engineering — build the right context by orchestrating specialized agents. Each step in the pipeline (brief → research → plan → execute) produces a structured artifact that the next step consumes.

v3.0.0: the architect step was extracted from this plugin. The plan command still auto-discovers architecture/overview.md if present, so any compatible producer can plug into that slot (the architect plugin itself is no longer publicly distributed). See CHANGELOG.md for migration history.

Commands

| Command | Description | Model |
|---|---|---|
| `/trekbrief` | Brief: interactive interview produces a task brief with an explicit research plan; optionally orchestrates the pipeline | opus |
| `/trekresearch` | Research: deep local + external research, produces a structured research brief | opus |
| `/trekplan` | Plan: brief-reviewer, explore, plan, review. Requires `--brief` or `--project`. Auto-discovers `architecture/overview.md` if present | opus |
| `/trekexecute` | Execute: disciplined plan/session-spec executor with failure recovery | opus |
| `/trekreview` | Review: independent post-hoc review of delivered code against the brief. Produces `review.md` with severity-tagged findings (Handover 6) | opus |
| `/trekcontinue` | Continue: resumes the next session of a multi-session voyage project. Reads `.session-state.local.json` (Handover 7) and immediately begins executing | opus |
| `/trekendsession` | End-session: marks the current session complete and writes session-state pointing at the next session. Helper for informal multi-session flows | sonnet |

/trekbrief modes

| Flag | Behavior |
|---|---|
| (default) | Dynamic interview until quality gates pass → brief.md with research plan |
| `--quick` | Compact start; still escalates if required sections are weak or the brief-review gate fails → brief.md with research plan |
| `--gates {open\|closed\|adaptive}` | (v3.4.0) Autonomy-checkpoint policy. Default `adaptive` |

Always interactive. Phase 3 is a section-driven completeness loop (no hard cap on question count); Phase 4 runs a brief-reviewer stop-gate with max 3 review iterations. After writing the brief, asks the user to choose manual (print commands) or auto (Claude runs research + plan in foreground).

/trekresearch modes

| Flag | Behavior |
|---|---|
| (default) | Interview + research (local + external) + synthesis + brief (foreground) |
| `--project <dir>` | Write brief to `{dir}/research/{NN}-{slug}.md` (auto-incremented) |
| `--quick` | Interview (short) + inline research (no agent swarm) |
| `--local` | Only codebase analysis agents (skip external + Gemini) |
| `--external` | Only external research agents (skip codebase analysis) |
| `--fg` | No-op alias (foreground is default since v2.4.0) |
| `--gates {open\|closed\|adaptive}` | (v3.4.0) Autonomy-checkpoint policy. Default `adaptive` |

Flags combine: --project <dir> --local, --external --quick.
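The auto-incremented `{NN}` in `{dir}/research/{NN}-{slug}.md` can be pictured with a small sketch. This is not the plugin's code: the function name `nextResearchPath`, the two-digit zero padding, and numbering from 01 are assumptions made for illustration.

```javascript
// Hypothetical helper, not part of the plugin: pick the next research brief path.
// Assumes NN is zero-padded to two digits and numbering starts at 01.
function nextResearchPath(dir, existingNames, slug) {
  const used = existingNames
    .map((name) => /^(\d+)-.+\.md$/.exec(name))
    .filter((match) => match !== null)
    .map((match) => Number(match[1]));
  const next = used.length > 0 ? Math.max(...used) + 1 : 1;
  return `${dir}/research/${String(next).padStart(2, "0")}-${slug}.md`;
}

// Prints ".claude/projects/demo/research/03-queue.md"
console.log(nextResearchPath(".claude/projects/demo", ["01-auth.md", "02-cache.md"], "queue"));
```

Taking the maximum existing number rather than the file count keeps numbering stable if an earlier brief is deleted.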

/trekplan modes

| Flag | Behavior |
|---|---|
| `--project <dir>` | Required path A: read `{dir}/brief.md`, auto-discover `{dir}/research/*.md`, write `{dir}/plan.md` |
| `--brief <path>` | Required path B: plan from a specific brief file; write to `.claude/plans/ultraplan-{date}-{slug}.md` |
| `--research <brief> [brief2]` | Enrich with extra research briefs beyond what is in `{project_dir}/research/` |
| `--fg` | No-op alias (foreground is default since v2.4.0) |
| `--quick` | Plan directly (no agent swarm) |
| `--export <pr\|issue\|markdown\|headless> <plan>` | Generate shareable output from existing plan |
| `--decompose <plan>` | Split plan into self-contained headless sessions |
| `--gates {open\|closed\|adaptive}` | (v3.4.0) Autonomy-checkpoint policy. Default `adaptive` |

Breaking change (v2.0): one of --brief or --project is required. There is no interview inside /trekplan. The --spec flag has been removed; use /trekbrief to produce a brief instead.

If {project_dir}/architecture/overview.md exists (typically produced by an opt-in upstream architect plugin, not bundled), the plan command auto-discovers it and treats cc_features_proposed as priors. Missing file is fine — discovery is additive, not required.

/trekexecute modes

| Flag | Behavior |
|---|---|
| (default) | Execute plan; auto-detects Execution Strategy for multi-session |
| `--project <dir>` | Read `{dir}/plan.md`, write `{dir}/progress.json` |
| `--resume` | Resume from last progress checkpoint |
| `--dry-run` | Validate plan structure without executing |
| `--validate` | Schema-only check: parse steps + manifests, report `READY \| FAIL`, no execution |
| `--step N` | Execute only step N |
| `--fg` | Force foreground: run all steps sequentially, ignore Execution Strategy |
| `--session N` | Execute only session N from plan's Execution Strategy |
| `--gates {open\|closed\|adaptive}` | (v3.4.0) Autonomy-checkpoint policy. Default `adaptive` |

/trekreview modes

| Flag | Behavior |
|---|---|
| (default) | Run brief-conformance + code-correctness reviewers in parallel, coordinator dedup + verdict, write `{project_dir}/review.md` |
| `--project <dir>` | Required. Path to ultraplan-local project folder containing `brief.md`. Review is written to `{dir}/review.md` |
| `--since <ref>` | Override "before" SHA for the diff range. Validated via `git rev-parse --verify` |
| `--quick` | Skip brief-conformance reviewer; skip coordinator's reasonableness filter; fast correctness-only pass |
| `--validate` | Schema-only check on existing `{dir}/review.md`. No LLM calls |
| `--dry-run` | Print discovered scope + triage map; skip writes |
| `--fg` | No-op alias (foreground is default) |

The triage gate is deterministic: a path-pattern classifier maps each file to one of deep-review, summary-only, or skip ({file → deep-review|summary-only|skip}). Above 100 files or 100K diff tokens the command hard-refuses and prints a suggestion instead of reviewing.
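A minimal sketch of such a deterministic triage gate follows. The rule list, the function names `classify` and `triage`, the suggestion text, and the reading of "100K" as 100,000 tokens are all assumptions for illustration; the plugin's actual patterns are not reproduced here.

```javascript
// Limits taken from the text; "100K" is read here as 100,000 tokens (assumption).
const MAX_FILES = 100;
const MAX_DIFF_TOKENS = 100000;

// Hypothetical path patterns; the plugin's real classifier rules are not shown.
const RULES = [
  { pattern: /(^|\/)(node_modules|dist)\//, verdict: "skip" },
  { pattern: /\.lock$/, verdict: "skip" },
  { pattern: /\.(md|txt)$/, verdict: "summary-only" },
];

// First matching rule wins; anything unmatched gets a full review.
function classify(path) {
  const rule = RULES.find((r) => r.pattern.test(path));
  return rule ? rule.verdict : "deep-review";
}

// Refuse oversized scopes up front, otherwise build the file -> verdict map.
function triage(files, diffTokens) {
  if (files.length > MAX_FILES || diffTokens > MAX_DIFF_TOKENS) {
    return { ok: false, suggestion: "narrow the diff range and retry" };
  }
  const map = {};
  for (const file of files) map[file] = classify(file);
  return { ok: true, map };
}

console.log(triage(["src/app.mjs", "README.md", "dist/bundle.js"], 1200));
```

Because classification depends only on the path string, the same diff always yields the same triage map, which is the property the text calls deterministic.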

Agents

| Agent | Model | Role |
|---|---|---|
| planning-orchestrator | opus | Inline reference documentation for the planning pipeline workflow (brief-driven) |
| research-orchestrator | opus | Inline reference documentation for the research pipeline workflow |
| review-orchestrator | opus | Inline reference documentation for the review pipeline workflow |
| architecture-mapper | sonnet | Codebase structure, tech stack, patterns |
| dependency-tracer | sonnet | Import chains, data flow, side effects |
| task-finder | sonnet | Task-relevant files, functions, reuse candidates |
| risk-assessor | sonnet | Risks, edge cases, failure modes |
| test-strategist | sonnet | Test patterns, coverage gaps, strategy |
| git-historian | sonnet | Recent changes, ownership, hot files |
| research-scout | sonnet | External docs for unfamiliar tech (conditional, planning only) |
| convention-scanner | sonnet | Coding conventions: naming, style, error handling, test patterns |
| brief-reviewer | sonnet | Task brief quality (5 dimensions: completeness, consistency, testability, scope clarity, research plan validity) |
| brief-conformance-reviewer | sonnet | Brief conformance review (SC + Non-Goal traceability) |
| code-correctness-reviewer | sonnet | Code correctness review (7 dimensions) |
| review-coordinator | sonnet | Judge agent: dedup + reasonableness filter + verdict |
| plan-critic | sonnet | Adversarial plan review (9 dimensions) |
| scope-guardian | sonnet | Scope alignment (creep + gaps) |
| session-decomposer | sonnet | Splits plans into headless sessions with dependency graph |
| docs-researcher | sonnet | Official documentation, RFCs, vendor docs (Tavily, MS Learn) |
| community-researcher | sonnet | Community experience: issues, blogs, discussions |
| security-researcher | sonnet | CVEs, audit history, supply chain risks |
| contrarian-researcher | sonnet | Counter-evidence, overlooked alternatives |
| gemini-bridge | sonnet | Gemini Deep Research second opinion (conditional) |

Quality infrastructure (v3.4.0)

lib/ contains zero-dep validators, parsers, and autonomy primitives wired into the four commands:

- lib/util/{frontmatter,result,atomic-write,autonomy-gate}.mjs: shared YAML-frontmatter parser, Result helpers, atomicWriteJson(path, obj) for tmp+rename writes, and the autonomy-gate state machine (v3.4.0)
- lib/parsers/{plan-schema,manifest-yaml,project-discovery,arg-parser,bash-normalize,jaccard,finding-id}.mjs: pure parsers (no I/O), unit-tested. manifest-yaml was extended in v3.4.0 with additive skip_commit_check + memory_write flags (forward-compat: unknown keys ignored)
- lib/review/{rule-catalogue,plan-review-dedup}.mjs: version-pinned rule catalogue (12 keys) + Phase 9 inline dedup helpers (v3.4.0)
- lib/stats/event-emit.mjs: single-source stats event emitter for autonomy-gate transitions and main-merge-gate (v3.4.0)
- lib/validators/{brief,research,plan,progress,session-state}-validator.mjs: schema validators with CLI shims (node lib/validators/X.mjs --json <path>)
- lib/validators/architecture-discovery.mjs: drift-WARN external-contract discovery for architecture/overview.md

Wiring points (replaces previous prose-grep instructions):

- /trekbrief Phase 4g → brief-validator (post-write sanity check)
- /trekplan Phase 1 → brief-validator --soft, research-validator --dir, architecture-discovery
- planning-orchestrator Phase 5.5 → plan-validator --strict (replaces 3 grep -cE calls)
- /trekexecute --validate → plan-validator --strict + progress-validator

Tests under tests/**/*.test.mjs (~290 tests, 0 deps). npm test is the fork-readiness gate. v3.4.0 adds: synthetic determinism fixtures (tests/synthetic/plan-run-*.md + review-run-*.md + companion *-determinism.test.mjs enforcing Jaccard ≥ 0.833 SC7 floor) and hook baseline regression pins (tests/hooks/{path-guard,bash-guard}.test.mjs exercising pre-write-executor.mjs + pre-bash-executor.mjs denylist BLOCK paths).
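To make the Jaccard floor concrete, here is a minimal sketch of a set-based Jaccard similarity checked against 0.833. It is not the code in lib/parsers/jaccard.mjs; the function name, the choice to treat two empty sets as identical, and the example items are assumptions.

```javascript
// Floor quoted in the text (SC7).
const SC7_FLOOR = 0.833;

// Jaccard similarity: |A ∩ B| / |A ∪ B| over de-duplicated items.
// Assumption: two empty inputs count as identical (similarity 1).
function jaccard(a, b) {
  const setA = new Set(a);
  const setB = new Set(b);
  if (setA.size === 0 && setB.size === 0) return 1;
  let shared = 0;
  for (const item of setA) if (setB.has(item)) shared += 1;
  return shared / (setA.size + setB.size - shared);
}

// Two hypothetical runs that agree on 5 of 6 distinct items: 5/6 ≈ 0.8333.
const runA = ["s1", "s2", "s3", "s4", "s5"];
const runB = ["s1", "s2", "s3", "s4", "s5", "s6"];
console.log(jaccard(runA, runB) >= SC7_FLOOR); // true
```

A floor of 0.833 sits just under 5/6, so two runs that differ by a single item out of six still pass, while 4 of 5 (0.8) does not.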

Doc-consistency test at tests/lib/doc-consistency.test.mjs pins agent-table count, command-table coverage, plan_version invariant, settings.json scope cleanliness, Handover 7 presence, and session-state-validator CLI shim.

docs/HANDOVER-CONTRACTS.md is the single source of truth for the 7 pipeline handovers (brief→research, research→plan, architecture→plan EXTERNAL, plan→execute, progress.json resume, review→plan, .session-state.local.json). Read it before changing any artifact format.

hooks/scripts/pre-compact-flush.mjs (PreCompact event, CC v2.1.105+) fixes the documented P0 in docs/ultraexecute-v2-observations-from-config-audit-v4.md: keeps progress.json in sync with git history before context compaction so --resume works after long conversations. Atomic write, monotonic only, never blocks compaction.

hooks/scripts/session-title.mjs (UserPromptSubmit, CC v2.1.94+) sets sessionTitle to ultra:<command>:<slug> for ultra-command invocations. Helps multi-session headless runs identify themselves in process lists.

hooks/scripts/post-bash-stats.mjs (PostToolUse, CC v2.1.97+) appends duration_ms for each Bash call into ${CLAUDE_PLUGIN_DATA}/ultraexecute-stats.jsonl. Useful for finding long-running verify or checkpoint commands.

hooks/scripts/post-compact-flush.mjs (PostCompact event, v3.4.0) re-injects .session-state.local.json after context compaction so multi-session work survives a compaction boundary. Companion to pre-compact-flush.mjs (which writes the state file before compaction); together they form the rehydrate cycle that keeps /trekcontinue reliable across long-running multi-session work.

Autonomy mode (`--gates`, v3.4.0)

The four pipeline commands (/trekbrief, /trekresearch, /trekplan, /trekexecute) accept --gates {open|closed|adaptive}:

| Value | Behavior |
|---|---|
| `open` | Skip optional checkpoints; trust manifests + verify gates only |
| `closed` | Stop at every autonomy boundary; operator confirms each transition |
| `adaptive` (default) | Stop only at meaningful boundaries (manifest-audit FAIL, plan-critic BLOCKER, main-merge gate) |

Under the hood: lib/util/autonomy-gate.mjs runs the state machine idle → approved → executing → merge-pending → main-merged. lib/stats/event-emit.mjs records each transition to ${CLAUDE_PLUGIN_DATA}/ultra*-stats.jsonl. The main-merge gate is the final autonomy boundary before HEAD lands on main.
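The state sequence above can be sketched as a tiny linear state machine. This is an illustration only, not the contents of lib/util/autonomy-gate.mjs: the function names are hypothetical, and the assumption that transitions are strictly linear (no rollback or failure states) may not hold for the real module.

```javascript
// States in the order given in the text.
const STATES = ["idle", "approved", "executing", "merge-pending", "main-merged"];

// Assumption: strictly linear transitions; the real module may allow others.
function nextState(current) {
  const index = STATES.indexOf(current);
  if (index === -1) throw new Error(`unknown state: ${current}`);
  if (index === STATES.length - 1) throw new Error(`${current} is terminal`);
  return STATES[index + 1];
}

// Walk from idle to main-merged, collecting one record per transition,
// loosely mirroring how each transition is recorded as a stats event.
function walk() {
  const events = [];
  let state = STATES[0];
  while (state !== STATES[STATES.length - 1]) {
    const to = nextState(state);
    events.push({ from: state, to });
    state = to;
  }
  return events;
}

console.log(walk().map((e) => `${e.from} -> ${e.to}`).join("\n"));
```

Five states give four transitions, and the last one (merge-pending to main-merged) corresponds to the main-merge gate described in the text.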

Path A/B/C decision (v3.4.0; Path C closed 2026-05-05)

Three architectural options were considered for the speedup work:

- Path A, cache-first (drop --allowedTools per child to recover cross-phase cache sharing): REJECTED. Inverts the security model; plugin hooks don't fire reliably in claude -p (research/06 GH #36071).
- Path B, sequential --no-ff parallel waves with manifest-driven failure recovery: CHOSEN. Ships in v3.4.0. Phase 2.6 of /trekexecute runs the wave executor with hardenings for plugin-in-monorepo + gitignored-state topology.
- Path C, hybrid (cache-warm sentinel + identical-tool parallel): CLOSED 2026-05-05. The Q3 experiment measured median cache_creation_input_tokens = 163,903 across 3 fork-children at 186K parent context (CC v2.1.128, Sonnet 4.6). Master-plan thresholds: ≤ 1,500 POSITIVE / ≥ 3,500 NEGATIVE. The result is solidly NEGATIVE: CLAUDE_CODE_FORK_SUBAGENT does not preserve the cache prefix across identical-tool children at our context size. Path C migration is deferred indefinitely; reassessment is appropriate when CC v2.2.xxx ships fork-cache-relevant features. Harness: scripts/q3-cache-prefix-experiment.mjs. Companion analyser: lib/stats/cache-analyzer.mjs.

A revived Path C (post-v2.2.xxx) would require: (1) re-architecting tool-list to be identical across all wave children, (2) cache-telemetry analysis confirming the new fork-cache behaviour holds, (3) prompt-level deny re-enablement to compensate for tool scoping rollback.
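The Path C verdict follows directly from the two master-plan thresholds. The sketch below restates that rule; it is not the code in lib/stats/cache-analyzer.mjs, and the function name and the "INCONCLUSIVE" label for the band between the thresholds are assumptions.

```javascript
// Thresholds quoted in the text: <= 1,500 is POSITIVE, >= 3,500 is NEGATIVE.
const POSITIVE_MAX = 1500;
const NEGATIVE_MIN = 3500;

// Assumption: values strictly between the thresholds are labelled INCONCLUSIVE;
// the text does not name that band.
function classifyCacheCreation(medianTokens) {
  if (medianTokens <= POSITIVE_MAX) return "POSITIVE";
  if (medianTokens >= NEGATIVE_MIN) return "NEGATIVE";
  return "INCONCLUSIVE";
}

// The measured median from the Q3 experiment.
console.log(classifyCacheCreation(163903)); // NEGATIVE
```

The measured median exceeds the NEGATIVE threshold by well over an order of magnitude, which is why the text calls the result solid rather than borderline.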

Architecture

Brief: 7-phase workflow: Parse mode → Create project dir → Phase 3 completeness loop (section-driven, no question cap) → Phase 4 draft/review/revise with brief-reviewer as stop-gate (max 3 iterations; gate = all dimensions ≥ 4 and research plan = 5) → Finalize (brief.md on pass, or brief_quality: partial on cap/force-stop) → Manual/auto opt-in → Stats. Always interactive. Auto mode runs research + plan inline in the main context (v2.4.0).
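The Phase 4 stop-gate can be sketched as follows. This is an illustration under assumptions: scores are taken to be integers on a 1 to 5 scale, the property names are hypothetical camelCase versions of the five brief-reviewer dimensions, and the return shape is invented for the example.

```javascript
// Cap from the text: at most 3 review iterations.
const MAX_ITERATIONS = 3;

// Gate from the text: every dimension >= 4 and the research plan exactly 5.
// Assumption: scores are integers on a 1-5 scale.
function gatePasses(scores) {
  return Object.values(scores).every((s) => s >= 4) && scores.researchPlan === 5;
}

// reviews: one score object per review iteration, in order.
// Returns on the first passing iteration, otherwise marks the brief partial.
function finalize(reviews) {
  const capped = reviews.slice(0, MAX_ITERATIONS);
  for (let i = 0; i < capped.length; i += 1) {
    if (gatePasses(capped[i])) return { passed: true, iterations: i + 1 };
  }
  return { passed: false, brief_quality: "partial", iterations: capped.length };
}

const draft1 = { completeness: 3, consistency: 4, testability: 4, scopeClarity: 4, researchPlan: 4 };
const draft2 = { completeness: 4, consistency: 5, testability: 4, scopeClarity: 4, researchPlan: 5 };
console.log(finalize([draft1, draft2])); // { passed: true, iterations: 2 }
```

Note that the research-plan dimension is held to a stricter bar than the others: a score of 4 there blocks the gate even when every other dimension is at 5.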

Research: Foreground workflow (v2.4.0): Parse mode → Interview → Parallel research swarm (5 local + 4 external + 1 bridge, spawned from main context) → Follow-ups → Triangulation → Synthesis + brief → Stats. With --project, writes to {dir}/research/NN-slug.md.

Plan: Foreground workflow (v2.4.0): Parse mode (validate brief input) → Codebase sizing → Brief review (brief-reviewer) → Parallel exploration (6-8 agents, spawned from main context) → Deep-dives → Synthesis (with architecture-note cross-reference if present) → Planning → Adversarial review (plan-critic + scope-guardian) → Present/refine → Handoff. With --project, writes to {dir}/plan.md and auto-detects {dir}/architecture/overview.md (produced by an opt-in upstream architect plugin if installed; not bundled).

Decompose: Parse plan → Analyze step dependencies → Group into sessions → Identify parallel waves → Generate session specs + dependency graph + launch script.
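The "identify parallel waves" step amounts to layering a dependency graph: everything whose dependencies are already done can run in the same wave. The sketch below shows that idea only; the `{ id, dependsOn }` shape and the function name `planWaves` are hypothetical, and the session-decomposer's actual grouping logic is not reproduced here.

```javascript
// Hypothetical input shape: [{ id: "s1", dependsOn: ["s0"] }, ...].
// Returns an array of waves; items within a wave have no unmet dependencies.
function planWaves(sessions) {
  const pending = new Map(sessions.map((s) => [s.id, s.dependsOn]));
  const done = new Set();
  const waves = [];
  while (pending.size > 0) {
    const ready = [...pending.entries()]
      .filter(([, deps]) => deps.every((dep) => done.has(dep)))
      .map(([id]) => id);
    // Nothing ready while work remains means a cycle or an unknown dependency.
    if (ready.length === 0) throw new Error("dependency cycle or unknown dependency");
    for (const id of ready) {
      pending.delete(id);
      done.add(id);
    }
    waves.push(ready);
  }
  return waves;
}

const waves = planWaves([
  { id: "s1", dependsOn: [] },
  { id: "s2", dependsOn: [] },
  { id: "s3", dependsOn: ["s1", "s2"] },
  { id: "s4", dependsOn: ["s3"] },
]);
console.log(waves); // [ [ 's1', 's2' ], [ 's3' ], [ 's4' ] ]
```

Failing loudly on a cycle matters here: a decomposition that silently dropped a session would produce a launch script that never runs part of the plan.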

Execute: Parse plan → Security scan (Phase 2.4) → Detect Execution Strategy → Single-session (step loop) or multi-session (parallel waves via claude -p with scoped --allowedTools) → Phase 7.5 manifest audit → Phase 7.6 bounded recovery (if partial) → Phase 8 atomically writes progress.json + .session-state.local.json (Handover 7) → Report. With --project, reads {dir}/plan.md. Phase 2.55 (pre-flight stop) and Phase 4 (entry-condition stop) also write .session-state.local.json so /trekcontinue can surface the stop and prompt for next steps.

Continue: /trekcontinue reads {dir}/.session-state.local.json (Handover 7), validates schema-v1 via session-state-validator, narrates a 3-line summary (project / next-session-label / brief-path), and immediately begins executing the next session. If no explicit <project-dir> argument is given, it auto-discovers active project state files under .claude/projects/*/.session-state.local.json. Operator-invoked only; never auto-loaded via SessionStart. The /trekendsession helper is the informal-flow producer: it writes the same state file for ad-hoc multi-session handovers that don't run through /trekexecute.

Security: 4-layer defense-in-depth: plugin hooks (pre-bash-executor, pre-write-executor), prompt-level denylist (works in headless sessions), pre-execution plan scan (Phase 2.4), scoped --allowedTools replacing --dangerously-skip-permissions. Hard Rules 14-16 enforce verify command security, repo-boundary writes, and sensitive path protection.

Pipeline: /trekbrief produces the task brief. /trekresearch --project <dir> fills in {dir}/research/. /trekplan --project <dir> reads brief + research to produce {dir}/plan.md (and auto-discovers {dir}/architecture/overview.md if an opt-in upstream architect plugin produced one). /trekexecute --project <dir> executes and writes {dir}/progress.json. All artifacts live in one project directory.

Project-directory contract (v3.0.0): ultraplan-local owns the directory layout below. The architecture/ subdirectory is opt-in and is produced by an upstream architect plugin (not bundled). That plugin is no longer publicly distributed, but the architecture/overview.md slot remains available for any compatible producer.

.claude/projects/{YYYY-MM-DD}-{slug}/
  brief.md           ← ultrabrief-local writes; everyone reads
  research/*.md      ← ultraresearch-local writes; plan + architect read
  architecture/      ← OPT-IN, owned by an opt-in upstream architect plugin (not bundled)
    overview.md
    gaps.md
  plan.md            ← ultraplan-local writes; ultraexecute reads
  progress.json      ← ultraexecute-local writes

No code-level dependency between plugins — the contract is filesystem-level only.

State

All artifacts in one project directory (default):

Project root: .claude/projects/{YYYY-MM-DD}-{slug}/
- brief.md (task brief from /trekbrief)
- research/{NN}-{slug}.md (research briefs from /trekresearch --project)
- architecture/overview.md + architecture/gaps.md (opt-in, produced by an upstream architect plugin, not bundled)
- plan.md (from /trekplan --project)
- sessions/session-*.md (from --decompose)
- progress.json (from /trekexecute --project)
- review.md (from /trekreview --project)
- .session-state.local.json (Handover 7; gitignored via *.local.json; written by /trekexecute Phase 8/2.55/4 or /trekendsession; read by /trekcontinue)

Legacy paths (still work without --project):

- Research briefs: .claude/research/ultraresearch-{date}-{slug}.md
- Plans: .claude/plans/ultraplan-{date}-{slug}.md
- Sessions: .claude/ultraplan-sessions/{slug}/session-*.md
- Launch scripts: .claude/ultraplan-sessions/{slug}/launch.sh
- Progress: {plan-dir}/.ultraexecute-progress-{slug}.json

Stats:

- Brief stats: ${CLAUDE_PLUGIN_DATA}/ultrabrief-stats.jsonl
- Plan stats: ${CLAUDE_PLUGIN_DATA}/ultraplan-stats.jsonl
- Exec stats: ${CLAUDE_PLUGIN_DATA}/ultraexecute-stats.jsonl
- Research stats: ${CLAUDE_PLUGIN_DATA}/ultraresearch-stats.jsonl
- Continue stats: ${CLAUDE_PLUGIN_DATA}/ultracontinue-stats.jsonl

Terminology

- Task brief: produced by /trekbrief. Declares intent, goal, and research plan. Drives planning.
- Research brief: produced by /trekresearch. Answers a specific research question. Feeds planning.
- Architecture note: opt-in, produced by an upstream architect plugin (not bundled and no longer publicly distributed; the architecture/overview.md filesystem slot remains available for any compatible producer). Proposes which Claude Code features fit the task, with brief-anchored rationale and explicit gaps. When present, enriches planning.
- Review: produced by /trekreview. Independent post-hoc review of delivered code against the task brief. Handover 6 (review → plan) routes BLOCKER + MAJOR findings into /trekplan --brief review.md for a remediation plan. The plan's optional source_findings: frontmatter list is the audit trail back to the consumed findings. MINOR + SUGGESTION are skipped for v1.0 plan-input.
- Session state: .session-state.local.json per project (Handover 7). Produced by any session-end mechanism (/trekexecute Phase 8/2.55/4, the /trekendsession helper, future graceful-handoff v2.2). Consumed by /trekcontinue to resume the next session in a fresh chat. Schema v1 is forward-compat (unknown top-level keys ignored). Never committed (gitignored via *.local.json).

A project typically has 1 task brief, 0–N research briefs, 0 or 1 architecture note, 0–N reviews (one per review iteration), and 0 or 1 session-state file (overwritten on every session-end).
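The Handover 6 routing rule described under Review can be sketched as a simple severity filter. The finding shape (`id`, `severity`) and the example IDs are hypothetical; only the four severity names and the `source_findings` key come from the text.

```javascript
// Severities routed into a remediation plan, per the text.
const PLAN_INPUT_SEVERITIES = new Set(["BLOCKER", "MAJOR"]);

// Hypothetical finding shape: { id: "F-001", severity: "BLOCKER" }.
// Returns the routed findings plus the ID list a plan could record
// as its source_findings audit trail.
function selectPlanInput(findings) {
  const routed = findings.filter((f) => PLAN_INPUT_SEVERITIES.has(f.severity));
  return { routed, source_findings: routed.map((f) => f.id) };
}

const result = selectPlanInput([
  { id: "F-001", severity: "BLOCKER" },
  { id: "F-002", severity: "MINOR" },
  { id: "F-003", severity: "MAJOR" },
  { id: "F-004", severity: "SUGGESTION" },
]);
console.log(result.source_findings); // [ 'F-001', 'F-003' ]
```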
