Kjell Tore Guttormsen dfe1986f06 chore(voyage): bump version 5.0.3 → 5.1.0

2026-05-13 21:23:48 +02:00

54 KiB

Raw Permalink Blame History

trekplan — Brief, Research, Plan, Execute, Review, Continue

Solo-maintained, fork-and-own. This plugin is a starting point, not a vendor product. Issues are welcome as signals; pull requests are not accepted. See GOVERNANCE.md for the full model and what upstream provides.

AI-generated: all code produced by Claude Code through dialog-driven development. Full disclosure →

A Claude Code plugin for deep implementation planning, multi-source research, autonomous execution, independent post-hoc review, and zero-friction multi-session resumption. Six commands, one pipeline:

What's new in v5.1 — /trekbrief Phase 3.5 commits per-phase phase_signals (effort + optional model for research/plan/execute/review) to brief.md frontmatter. brief_version: 2.1 activates a validator-side sequencing gate (BRIEF_V51_MISSING_SIGNALS) so downstream commands halt with a friendly hint when signals are missing. Composition rule per downstream command: brief signal wins per-phase, profile fills gaps. effort == low activates the existing --quick-equivalent code-path in each command (/trekexecute low-effort = --gates open + sequential). Additive — no breaking changes; pre-2.1 briefs still validate.

Command	What it does
`/trekbrief`	Brief — interactive interview produces a task brief with explicit research plan
`/trekresearch`	Research — deep local + external research with triangulation
`/trekplan`	Plan — agent swarm exploration, Opus planning, adversarial review
`/trekexecute`	Execute — disciplined step-by-step implementation with failure recovery
`/trekreview`	Review — independent post-hoc review of delivered code against the brief, severity-tagged findings
`/trekcontinue`	Continue — read `.session-state.local.json` and resume the next session in a multi-session project

/trekbrief, /trekplan, and /trekreview each end by running scripts/annotate.mjs against the just-written artifact and printing the resulting file://<abs path> link. The operator opens the HTML in a browser, clicks any line of the document, writes their own note in the inline textarea, watches a sidebar of all notes (editable, deletable, persisted in browser localStorage), and clicks "Copy Prompt" to get one structured prompt that they paste back into Claude — Claude then revises the .md from the notes. The operator drives every annotation. See Reviewing and annotating artifacts.

Every artifact lives in one project directory: .claude/projects/{YYYY-MM-DD}-{slug}/ contains brief.md, research/NN-*.md, plan.md, sessions/, progress.json, and review.md.

Division of labor

Command	Responsibility	Output
`/trekbrief`	Capture intent — intent, goal, non-goals, success criteria, and a research plan with explicit topics. Interactive only.	`brief.md` (task brief)
`/trekresearch`	Gather context — code state, external docs, community, risk. Makes NO build decisions.	`research/NN-slug.md` (research brief)
`/trekplan`	Transform intent into an executable contract — per-step YAML manifest, regex-validated checkpoints, verifiable paths. Plan-critic is a hard gate. Auto-discovers `architecture/overview.md` as priors when an opt-in upstream architect plugin (not bundled) is installed.	`plan.md` with Manifest blocks + `plan_version: 1.7`
`/trekexecute`	Execute the contract disciplined — fresh verification, independent manifest audit, honest reporting. Does NOT compensate for weak plans — escalates.	`progress.json` + structured report + manifest-audit status
`/trekreview`	Close the loop — independent post-hoc reviewer reads `brief.md` and the diff produced by execute, runs brief-conformance + code-correctness reviewers in parallel, dedups via Judge Agent. Severity-tagged findings (Critical/High/Medium/Low/Info) feed back into planning via Handover 6.	`review.md` (`type: trekreview`) with stable 40-char hex finding-IDs

Principle: Each step consumes the previous step's structured artifact. If execute has to guess, the plan is weak and must be revised upstream — not patched downstream.

Two kinds of briefs

Terminology matters:

Task brief — produced by /trekbrief. Captures what we want and why. Drives planning.
Research brief — produced by /trekresearch. Captures what we learned about a topic. Feeds planning.

A project typically has one task brief and zero-to-N research briefs.

Manifest-verified steps

Every step in the plan ends with a YAML manifest: block declaring expected_paths, commit_message_pattern, bash_syntax_check, forbidden_paths, must_contain. The executor checks the manifest against the resulting commit — a step may not be marked passed if its manifest does not verify, regardless of the Verify command's exit code (Hard Rule 17).

After all steps complete, /trekexecute runs Phase 7.5 — Manifest audit (independent): re-verifies every expected path from git log + filesystem, ignoring the agent's own bookkeeping. Drift → status partial, Phase 7.6 auto-dispatches a bounded recovery session with only the missing steps (recovery_depth ≤ 2). Step 0 pre-flight (git push --dry-run) runs inside every session sandbox before any real work — exit 77 sentinel catches sandbox push-denial before the agent wastes the whole budget.

No cloud dependency. No GitHub requirement. Works on Mac, Linux, and Windows.

Autonomy mode (`--gates`, v3.4.0)

All four pipeline commands accept --gates {open|closed|adaptive} to control how many autonomy checkpoints surface to the operator on the path from brief approval to main-merge.

Value	Behavior
`open`	Skip optional checkpoints. The pipeline runs end-to-end with the fewest interruptions. Suitable for trusted briefs in clean repos.
`closed`	Stop at every checkpoint. The operator confirms each transition. Suitable for high-stakes work or unfamiliar repos.
`adaptive` (default)	Stop only when the autonomy-gate state machine reports a meaningful boundary (manifest-audit FAIL, plan-critic BLOCKER, main-merge gate). Best balance of velocity and safety.

Under the hood, lib/util/autonomy-gate.mjs runs a small state machine (idle → approved → executing → merge-pending → main-merged) and lib/stats/event-emit.mjs records each transition to ${CLAUDE_PLUGIN_DATA}/trek*-stats.jsonl. The new hooks/scripts/post-compact-flush.mjs PostCompact hook re-injects .session-state.local.json after context compaction so multi-session work survives a compaction boundary.

Quick start

# Install the marketplace, then browse and enable plugins with /plugin
claude plugin marketplace add https://git.fromaitochitta.com/open/ktg-plugin-marketplace.git

# Capture intent (interactive)
/trekbrief Add user authentication with JWT tokens
# → .claude/projects/2026-04-18-jwt-auth/brief.md

# Research each topic identified in the brief (manual default)
/trekresearch --project .claude/projects/2026-04-18-jwt-auth --external "What are current JWT best practices?"

# Plan from brief + research
/trekplan --project .claude/projects/2026-04-18-jwt-auth

# Execute
/trekexecute --project .claude/projects/2026-04-18-jwt-auth

# Review (independent post-hoc verification of the diff against brief.md)
/trekreview --project .claude/projects/2026-04-18-jwt-auth
# → .claude/projects/2026-04-18-jwt-auth/review.md

Or opt into auto-mode in /trekbrief — it will run research and planning sequentially inline in the main context, and return when plan.md is ready.

If review finds issues, feed review.md back into planning to produce a remediation plan: /trekplan --brief .claude/projects/2026-04-18-jwt-auth/review.md. The remediation plan carries source_findings: [<id>, ...] in its frontmatter — full audit trail back to the consumed findings (Handover 6).

An optional architect step can sit between research and plan — /trekplan auto-discovers an architecture/overview.md produced by an opt-in upstream architect plugin (not bundled here; the architect plugin is no longer publicly distributed, but the architecture/overview.md filesystem slot remains available for any compatible producer).

When to use it

Use it when:

The task touches 3+ files or modules and you need to understand how they connect
You're working in an unfamiliar codebase and need a map before you start
The implementation has non-obvious dependencies, ordering constraints, or risks
You want a reviewable plan before committing to an approach
You need autonomous headless execution without human intervention
You need to research a technology, library, or approach before deciding

Don't use it when:

The task is a single-file change where the fix is obvious
You already know exactly what to change and in what order

Rule of thumb: If you can describe the full implementation in one sentence and it touches 1-2 files, skip trekplan and just implement. If you need to think about it, trekplan earns its cost.

What you get

Concrete capabilities, observable in the code — not aspirations.

Across all profiles:

Strategy-to-execution on four explicit handover points. Each transition is a filesystem contract (docs/HANDOVER-CONTRACTS.md), not a conversation. You can stop after any stage and resume later without context loss.
Resume safety after long sessions. The PreCompact hook reconciles progress.json with git history before context compaction (CC v2.1.105+) — closes a documented --resume failure mode.
Schema discipline. plan-validator --strict enforces ### Step N: form and rejects narrative drift (### Fase, ### Phase, ### Stage, ### Steg) before execution.
Audit trail by construction. Every executed step records commit_sha, verify_passed, files_changed in progress.json.

Solo developer. Plans survive across sessions; adversarial review (plan-critic + scope-guardian) catches your own tunnel vision before code is written; brief-phase forces clarity on what the task actually is. examples/01-add-verbose-flag/ shows what good shape looks like.

Team (2–10). Plan files are handover-ready — a colleague can pick up a project directory without re-asking "what did you mean here?". --decompose splits a plan into self-contained headless sessions with scoped --allowedTools. The plan-critic semantic rubric gives the team a shared definition of "this plan defers decisions to the executor".

Virksomhet / regulated environment. Defense-in-depth security across four layers (plugin hooks, prompt-level denylist, pre-execution plan scan, scoped tool access). disableSkillShellExecution: true recommendation for fork-ers handling untrusted briefs. No cloud dependency, no GitHub requirement. Validators are plain-Node CLIs — invocable from CI, custom hooks, or external tools, not just from voyage commands.

What it doesn't solve:

LLM output truthfulness. Validators check shape, not facts. A plan with hallucinated paths passes schema but fails in execute. Plan-critic catches some, not all.
Multi-user concurrency on a single project directory. Two simultaneous executors will clobber progress.json.
Cost management. Opus on the orchestrator layer is expensive; documented in Cost profile, no automatic model downgrade.
Linear/Jira/Slack integrations. Intentional omission — solo project, no enterprise wiring.

One-line summary: an executable contract pipeline where each stage is filesystem-validated, session-survivable, and skill-independent — in exchange for writing an actual brief before planning and an actual plan before coding.

`/trekbrief` — Brief

Interactive requirements-gathering command. Runs a dynamic, quality-gated interview and produces a task brief with an explicit research plan. Optionally orchestrates the rest of the pipeline.

A section-driven interview loop fills required brief sections (Intent / Goal / Success Criteria / Research Plan) until each shows initial signal, then brief-reviewer scores the draft on five dimensions (completeness, consistency, testability, scope clarity, research-plan validity) and gates publication. Max 3 review iterations; force-stop yields a brief_quality: partial brief with the failing dimensions documented.

Output: .claude/projects/{YYYY-MM-DD}-{slug}/brief.md

Modes

Mode	Usage	Behavior
Default	`/trekbrief <task>`	Dynamic interview until quality gates pass. No question cap.
Quick	`/trekbrief --quick <task>`	Starts compact (optional sections get at most one probe), still escalates on weak required sections or failed review gate.
Profile	`/trekbrief --profile <name> <task>`	(v4.1.0) Pin model profile for the brief phase: `economy` / `balanced` / `premium` / `<custom>`. See Profile system below.

/trekbrief is always interactive. There is no foreground/background mode — the interview requires user input.

Force-stop

If you say "stop" or "enough" during Phase 4, the current review findings are surfaced with per-dimension scores and you choose:

Answer one more follow-up — the loop continues.
Stop now (accept partial brief) — the brief is finalized with brief_quality: partial and a ## Brief Quality section listing the failing dimensions. Downstream planning will treat these as reduced-confidence areas.

What the brief contains

Intent — why this matters, motivation, user need (load-bearing)
Goal — concrete end state in 1-3 sentences
Non-Goals — explicitly out of scope
Constraints / Preferences / NFRs — technical, time, resource limits
Success Criteria — 2-4 falsifiable commands/observations
Research Plan — N topics, each with research question, scope (local/external/both), confidence needed, cost estimate, and a ready-to-run /trekresearch command
Open Questions / Assumptions — from "I don't know" answers and implicit gaps
Prior Attempts — what worked/failed before

`/trekresearch` — Research

Deep, multi-phase research that combines local codebase analysis with external knowledge. Uses specialized agent swarms to investigate multiple dimensions in parallel, then triangulates findings.

A parallel swarm of up to 5 local + 4 external Sonnet agents investigates 3–8 research dimensions, with optional Gemini Deep Research as an independent second opinion. Findings are triangulated (local vs. external, confidence per dimension, contradictions flagged) and synthesized into a structured research brief.

Output:

With --project <dir>: {dir}/research/{NN}-{slug}.md (auto-incremented index)
Without: .claude/research/trekresearch-{date}-{slug}.md

Modes

Mode	Usage	Behavior
Default	`/trekresearch <question>`	Interview + research swarm (local + external + Gemini), foreground
Project	`/trekresearch --project <dir> <question>`	Write brief into `{dir}/research/NN-slug.md`
Quick	`/trekresearch --quick <question>`	Interview (short) + inline research, no agent swarm
Local	`/trekresearch --local <question>`	Only codebase analysis agents (skip external + Gemini)
External	`/trekresearch --external <question>`	Only external research agents (skip codebase analysis)
Foreground	`/trekresearch --fg <question>`	No-op alias (foreground is default since v2.4.0)
Profile	`/trekresearch --profile <name> <question>`	(v4.1.0) Pin model profile for the research phase. See Profile system.

Flags combine: --project <dir> --external.

Research uses up to 5 local agents (architecture-mapper, dependency-tracer, task-finder, git-historian, convention-scanner) and 4 external agents (docs-researcher, community-researcher, security-researcher, contrarian-researcher) plus the optional Gemini bridge for an independent second opinion. Per-agent details in agents/.

`/trekplan` — Planning

Produces an implementation plan detailed enough for autonomous execution. v2.0 breaking change: requires --brief or --project. There is no longer an interview inside /trekplan — use /trekbrief first.

After brief-reviewer validates the input brief, 6–8 Sonnet exploration agents analyze the codebase in parallel and merge findings into a synthesis. Optional research briefs (--research, or auto-discovered in {project_dir}/research/) enrich the plan; architecture/overview.md priors are loaded if an opt-in upstream architect plugin (not bundled) produced one. Opus then writes the plan with per-step YAML manifests, which plan-critic (9 dimensions) and scope-guardian adversarially review before handoff.

Output:

With --project <dir>: {dir}/plan.md
With --brief <path>: .claude/plans/trekplan-{date}-{slug}.md

Modes

Mode	Usage	Behavior
Project	`/trekplan --project <dir>`	Read `{dir}/brief.md` + auto-discover `{dir}/research/*.md`, write `{dir}/plan.md`
Brief	`/trekplan --brief <path>`	Plan from a specific brief file
Research-enriched	`/trekplan --project <dir> --research <brief>`	Add extra research briefs beyond what is in `research/`
Foreground	`/trekplan --project <dir> --fg`	No-op alias (foreground is default since v2.4.0)
Quick	`/trekplan --project <dir> --quick`	No agent swarm, lightweight scan only
Decompose	`/trekplan --decompose plan.md`	Split plan into headless session specs
Export	`/trekplan --export pr plan.md`	PR description, issue comment, or clean markdown
Profile	`/trekplan --profile <name> --project <dir>`	(v4.1.0) Pin model profile; emitted as `profile:` in plan.md frontmatter. See Profile system.

--brief or --project is required. /trekplan with no brief exits with an error and a pointer to /trekbrief.

What the plan contains

Every plan includes:

Context — derived from brief ## Intent + ## Goal
Architecture Diagram — Mermaid C4-style component diagram
Codebase Analysis — tech stack, patterns, relevant files, reusable code
Research Sources — findings from research briefs (when present)
Implementation Plan — ordered steps with file paths, changes, failure recovery, and git checkpoints
Per-step Manifest — YAML block with expected_paths, commit_message_pattern, bash_syntax_check, forbidden_paths, must_contain
Alternatives Considered — other approaches with pros/cons
Test Strategy — from test-strategist findings
Risks and Mitigations — from risk-assessor findings
Verification — testable end-to-end criteria
Execution Strategy — session grouping and parallel waves (plans with > 5 steps)
Plan Quality Score — quantitative grade (A-D) across 6 weighted dimensions

Every implementation step includes:

On failure: — what to do when verification fails (revert / retry / skip / escalate)
Checkpoint: — git commit after success
Manifest: — the objective completion predicate (Hard Rule 17)

Exploration uses 6–8 Sonnet agents in parallel (architecture-mapper, dependency-tracer, task-finder, test-strategist, git-historian, risk-assessor, plus convention-scanner on medium+ codebases and research-scout when unfamiliar tech is detected). Adversarial review then runs brief-reviewer, plan-critic (9 dimensions, no-placeholder enforcement, manifest audit), and scope-guardian (creep + gap detection). Per-agent details in agents/.

`/trekexecute` — Execution

Reads a plan from /trekplan and implements it with strict discipline. No guessing, no improvising — follows the plan exactly.

Per step: apply Changes exactly as written → run Verify (exit code is truth) → manifest audit (expected paths, forbidden paths, commit pattern) → follow the plan's failure clause if anything fails (revert / retry / skip / escalate) → Checkpoint commit. After all steps: independent Phase 7.5 manifest audit from git log + filesystem (ignoring agent bookkeeping); drift triggers Phase 7.6 bounded recovery.

Modes

Mode	Usage	Behavior
Project	`/trekexecute --project <dir>`	Read `{dir}/plan.md`, write `{dir}/progress.json`
Plan path	`/trekexecute plan.md`	Execute a specific plan file
Resume	`/trekexecute --project <dir> --resume`	Resume from last progress checkpoint
Dry run	`/trekexecute --project <dir> --dry-run`	Validate plan structure + preview sessions and billing
Validate	`/trekexecute --project <dir> --validate`	Schema-only check — parse steps + manifests, report `READY \| FAIL`, no execution
Single step	`/trekexecute --project <dir> --step 3`	Execute only step 3
Foreground	`/trekexecute --project <dir> --fg`	Force sequential, ignore Execution Strategy
Single session	`/trekexecute --project <dir> --session 2`	Execute only session 2 from Execution Strategy
Profile	`/trekexecute --profile <name> --project <dir>`	(v4.1.0) Pin model profile for execution. Plan-frontmatter `profile:` is honored when this flag is omitted. See Profile system.

Session-aware parallel execution (worktree-isolated)

When a plan has an ## Execution Strategy section (auto-generated by /trekplan for plans with > 5 steps), /trekexecute automatically:

Pre-flight checks — validates clean working tree, plan file tracked in git, no scope fence overlaps between parallel sessions, no stale worktrees
Creates git worktrees — each parallel session gets its own isolated worktree and branch (trek/{slug}/session-{N})
Launches claude -p per session per wave, each in its own worktree
Merges branches back sequentially with --no-ff after each wave completes
Cleans up worktrees and branches unconditionally (even on failure)
Runs master verification on the merged result

Wave 1: Session 1 (worktree-1) + Session 2 (worktree-2)  -- parallel
         ↓ both complete → sequential merge to main
Wave 2: Session 3 (worktree-3)                             -- sequential
         ↓ complete → merge to main
Cleanup worktrees + Master verification

Each session operates in complete filesystem isolation — no shared git index, no race conditions, no data loss. If a merge produces conflicts, the merge is aborted and conflicting files are reported.

Use --fg to force sequential execution even when a plan has an Execution Strategy.

Billing safety

Before launching parallel claude -p sessions, /trekexecute checks whether ANTHROPIC_API_KEY is set in your environment. If it is, parallel sessions will bill your API account (pay-per-token), not your Claude subscription (Max/Pro). This can be expensive — parallel Opus sessions can cost $50-100+ per run.

When an API key is detected, you are asked how to proceed:

Use --fg instead (recommended) — run sequentially in the current session using your subscription
Continue with API billing — launch parallel sessions on your API account
Stop — cancel and unset the API key first

If no API key is set, parallel sessions use your subscription and proceed without asking.

Failure recovery

3-attempt retry cap — retries twice, then stops (never loops forever)
On failure: revert — undo changes, stop
On failure: retry — try alternative approach, then revert if still failing
On failure: skip — non-critical step, continue
On failure: escalate — stop everything, needs human judgment

Security hardening

The executor implements defense-in-depth security across four layers:

Plugin hooks — pre-bash-executor.mjs blocks 13 categories of destructive commands (rm -rf /, chmod 777, pipe-to-shell, eval injection, disk wipe, shutdown, fork bombs, cron persistence, process killing, history destruction) with bash evasion normalization. pre-write-executor.mjs blocks writes to .git/hooks/, .claude/settings.json, shell configs, .ssh/, .aws/, and .env files
Prompt-level denylist — Security rules embedded in the executor command and session spec template that work even in headless claude -p sessions where hooks don't run
Pre-execution plan scan — Phase 2.4 scans all Verify: and Checkpoint: commands against the denylist before execution begins, catching dangerous commands before they reach the executor
Scoped tool access — Headless child sessions use --allowedTools "Read,Write,Edit,Bash,Glob,Grep" instead of --dangerously-skip-permissions, blocking Agent spawning, MCP tools, and web access in parallel sessions

Recommended: disable Skill shell execution (CC v2.1.91+)

For fork-ers handling untrusted task briefs or plans from external sources, set disableSkillShellExecution: true in ~/.claude/settings.json or in the project's .claude/settings.json:

{
  "disableSkillShellExecution": true
}

This prevents Skills from invoking arbitrary shell, which closes a prompt-injection vector that the plugin's own hooks cannot fully mitigate (Skills can fire before pre-bash-executor matches). See SECURITY.md for the full hardening list.

Headless execution

/trekexecute is designed for claude -p headless sessions:

No questions asked — all recovery decisions come from the plan
Progress file — crash recovery via {project_dir}/progress.json (or .trekexecute-progress-{slug}.json for legacy plans)
Scope fence enforcement — never touches files outside the session's scope
JSON summary — machine-parseable trekexecute_summary block for log parsing

Headless multi-session tuning (CC v2.1.89+)

When running multiple parallel claude -p sessions (decomposed plans or wave-based execution), set MCP_CONNECTION_NONBLOCKING=true in the launching environment so MCP server connection latency does not serialize startup across waves:

export MCP_CONNECTION_NONBLOCKING=true
bash .claude/projects/{slug}/sessions/launch.sh

Without this, each child session can spend 1-3 s blocking on MCP connect, multiplying across waves. Setting it lets MCP connect lazily on first tool call.

Session titles for voyage commands (CC v2.1.94+)

A UserPromptSubmit hook (hooks/scripts/session-title.mjs) sets the session title to voyage:<command>:<slug> whenever you invoke one of the four voyage commands. This makes multi-session headless runs and session-picker output trivially identifiable. Slug derivation:

Invocation	Session title
`/trekplan --project .claude/projects/2026-04-18-jwt-auth`	`voyage:plan:jwt-auth`
`/trekbrief --quick`	`voyage:brief:ad-hoc`
`/trekexecute --project .claude/projects/2026-05-10-cleanup --resume`	`voyage:execute:cleanup`

The hook is fail-open — any error → title is left untouched.

Per-step timing (CC v2.1.97+)

A PostToolUse hook (hooks/scripts/post-bash-stats.mjs) appends duration_ms from each Bash tool call to ${CLAUDE_PLUGIN_DATA}/trekexecute-stats.jsonl. One line per Bash call; useful for identifying long-running verify or checkpoint commands across executions.

`/trekreview` — Review

Independent post-hoc review of delivered code against the brief. Reads brief.md from scratch and treats research/plan as supplementary context. The output review.md is a new artifact type (type: trekreview) with its own validator and a contracted Handover 6 (review → plan) so findings can be fed back into /trekplan --brief review.md to produce a remediation plan — closing the iteration loop without ad-hoc conventions.

Modes

Mode	Command	Description
Default	`/trekreview --project <dir>`	brief-conformance + code-correctness reviewers in parallel, coordinator dedup + verdict, write `{dir}/review.md`
Since ref	`/trekreview --project <dir> --since <ref>`	Override "before" SHA for the diff range. Validated via `git rev-parse --verify`
Quick	`/trekreview --project <dir> --quick`	Skip brief-conformance reviewer; skip coordinator's reasonableness filter — fast correctness-only pass
Validate	`/trekreview --project <dir> --validate`	Schema-only check on existing `review.md`. No LLM calls
Dry run	`/trekreview --project <dir> --dry-run`	Print discovered scope + triage map; skip writes
Profile	`/trekreview --profile <name> --project <dir>`	(v4.1.0) Pin model profile for the review phase. See Profile system.

What review produces

review.md carries a flat array of finding-IDs in frontmatter (40-char hex from lib/parsers/finding-id.mjs) plus the full finding objects in the body under per-severity headings:

## Findings (BLOCKER) — must-fix before merge
## Findings (MAJOR) — should-fix
## Findings (MINOR) — nice-to-fix
## Findings (SUGGESTION) — opinion-only

Required body sections: ## Executive Summary, ## Coverage, ## Remediation Summary. The Coverage section enumerates which files were deep-reviewed, summary-only, or skipped (with reason) — explicit triage to avoid silent skips.

Triage gate (deterministic)

A path-pattern classifier produces {file → deep-review|summary-only|skip} before any LLM runs. Hardcoded skip patterns: *.lock, *.svg, dist/**, build/**, node_modules/**, generated markers. Deep-review patterns: auth/**, crypto/**, **/security/**. Hard refuse-with-suggestion above 100 files / 100K diff tokens.

Feedback loop (Handover 6)

/trekreview --project <dir>
# → review.md (BLOCKER + MAJOR findings)

/trekplan --brief <dir>/review.md
# → plan.md with `source_findings: [<id>, ...]` audit trail
#   (BLOCKER + MAJOR findings become plan goals; MINOR + SUGGESTION skipped for v1.0)

The plan's optional source_findings: frontmatter list is the audit trail back to consumed findings. See docs/HANDOVER-CONTRACTS.md for the full Handover 6 contract.

`/trekcontinue` — Resume

Zero-friction multi-session resumption. Type /trekcontinue in a fresh Claude Code session — the command reads the per-project state file (.claude/projects/<project>/.session-state.local.json), prints a 3-line summary, and immediately begins executing the next session.

The state file is the contract — any session-end mechanism may write it (/trekexecute Phase 8 / Phase 2.55 / Phase 4 do so automatically; the /trekendsession helper writes it for informal flows; graceful-handoff may converge on it in a future release). /trekcontinue only reads. See Handover 7 in docs/HANDOVER-CONTRACTS.md for the full schema and producer/consumer contract.

Modes

Mode	Command	Description
Default	`/trekcontinue`	Auto-discover `.session-state.local.json` under cwd, validate, narrate, and begin executing the next session
Explicit	`/trekcontinue <project-dir>`	Use the named project directory; helpful when several active projects coexist under cwd
Help	`/trekcontinue --help`	Print usage block and the schema-v1 reference
Profile	`/trekcontinue --profile <name> [<project-dir>]`	(v4.1.0) Pin model profile for the resumed session. Plan-frontmatter `profile:` from the previous session is honored when this flag is omitted. See Profile system.

Schema v1 — `.session-state.local.json`

Field	Type	Required	Notes
`schema_version`	number	yes	Must be `1`
`project`	string	yes	Project directory path
`next_session_brief_path`	string	yes	Validator soft-checks file existence (warning, not error)
`next_session_label`	string	yes	Human-readable label for the next session (e.g. "Session 2 of 5")
`status`	enum	yes	`in_progress` \| `partial` \| `failed` \| `stopped` \| `completed` (`completed` → no further sessions to resume)
`updated_at`	string	yes	ISO-8601 timestamp

Forward-compat: unknown top-level keys are ignored (no errors, no warnings) — the same drift-WARN principle as Handover 3, so future producers (e.g. graceful-handoff v2.2) can extend the schema additively.

`/trekendsession` helper

For informal multi-session flows that don't run through /trekexecute (ad-hoc release runs, manual handovers), use the helper to write the state file at session-end:

/trekendsession .claude/projects/2026-05-01-feature/brief.md "Session 2 of 3"
# Writes .session-state.local.json with status=in_progress.
# Then in a fresh chat: /trekcontinue

Both arguments are required. No interactive prompt — headless-safe.

Typical flow

# Session 1 (long-running formal pipeline)
/trekplan --project .claude/projects/2026-05-01-feature
/trekexecute --project .claude/projects/2026-05-01-feature
# ... trekexecute Phase 8 writes .session-state.local.json on session-end ...

# (chat boundary — fresh Claude Code session)
/trekcontinue
# → reads state, prints 3-line summary, begins next session

Reviewing and annotating artifacts (v5.0.3)

/trekbrief, /trekplan, and /trekreview each end by running scripts/annotate.mjs against the just-written .md and printing the resulting file://<abs path> link. After they finish you see something like:

Brief written:    .claude/projects/2026-05-13-foo/brief.md
Annotation HTML:  file:///abs/path/.claude/projects/2026-05-13-foo/brief.html

────────────────────────────────────────────────────────────────────
To review and annotate this brief, open the HTML above in a browser:

    open file:///abs/path/.claude/projects/2026-05-13-foo/brief.html

Click any line to add YOUR OWN note. The sidebar collects every note,
the "Copy Prompt" button gathers them into one structured prompt.
Paste that prompt back into this chat and Claude revises brief.md
from your notes. Annotations persist in your browser if you close
the tab and reopen the same file.
────────────────────────────────────────────────────────────────────

You run open (or click the file:// link in your terminal), the HTML opens in your default browser. The annotation UX is modelled on claude-code-100x/build-site.js:

Topbar: pencil-toggle button — annotation mode default ON. Click to turn off (then you read the article normally, follow links, etc.). A second button opens the sidebar panel.
Article body: the artifact rendered as a proper article — headings, paragraphs, lists, tables, code blocks, blockquotes. Hover any element in mode and it highlights. To anchor on a specific phrase, select the text first, then click. Otherwise the whole element becomes the anchor.
Form popover appears at the cursor with:
- Section (auto-detected from the nearest h1/h2 above).
- Anchored to — the exact text you selected, or the element's first ~200 chars if you didn't select.
- Three intent buttons: Fiks (something is wrong — fix it), Endre (change the wording / content), Spørsmål (an open question — clarify or answer). Colored: red / orange / blue.
- Comment textarea (optional but helpful).
- Cancel / Save. Save stays disabled until you pick an intent. Shortcut: ⌘Enter to save, Esc to cancel.
Annotated elements get an amber highlight + a number badge in the margin showing how many annotations target that element.
Sidebar panel (Show annotations) — every annotation grouped by section, in document order. Each card shows the intent badge (colored), the anchored snippet (mono-quote), the comment text, and a delete button. Click a card to scroll the article to that element and flash it.
Copy Prompt at the foot of the panel — assembles every annotation into one structured markdown prompt and copies it to your clipboard.
Clear all wipes every annotation (after confirm).
Persistence: every annotation is saved to browser localStorage keyed on the artifact's absolute path (voyage-annotate:v2:<abs path>). Refresh the tab or close the browser and re-open — your work is there.

You select / click, pick intent, write comment, repeat. When you're done, Copy Prompt, paste back into this chat. Claude revises the .md freehand from your notes. The operator drives every annotation. Claude never pre-generates suggestions in this flow.

What v5.0.3 changed from v5.0.2. v5.0.2 was operator-led but the UX was too thin — click a line, type a freeform note, save. The reference the operator pointed at (~/repos/claude-code-100x/claude-code-100x/build-site.js) already had the right pattern: pencil-toggle, selection capture, three intent categories, popover form, structured markdown export. v5.0.3 rebuilds scripts/annotate.mjs against that reference. v5.0.0 / v5.0.1 / v5.0.2 are all superseded; only the v5.0.0 removals (bespoke playground SPA, /trekrevise, Handover 8, supporting lib/ modules, Playwright e2e + devDeps) stay. See CHANGELOG.md § v5.0.0 → § v5.0.3.

The full pipeline

 /trekbrief   /trekresearch    /trekplan         /trekexecute
 ┌──────────────┐    ┌───────────────────┐   ┌─────────────────────┐  ┌─────────────────────┐
 │ Interview    │    │ 5 local agents    │   │ brief-reviewer      │  │ Parse plan          │
 │ ↓            │    │ 4 external agents │   │ ↓                   │  │ ↓                   │
 │ Intent/Goal  │    │ + Gemini bridge   │   │ 6-8 exploration     │  │ Detect sessions     │
 │ ↓            │    │ ↓                 │   │ agents (parallel)   │  │ ↓                   │
 │ Research     │    │ Triangulation     │   │ ↓                   │  │ Execute steps       │
 │ topics       │    │ ↓                 │   │ Opus planning       │  │ (verify + manifest  │
 │ ↓            │ → brief → → → → → → → → → → → ↓                   │→ │  + checkpoint)      │
 │ brief.md     │    │ research/*.md     │   │ plan-critic +       │  │ ↓                   │
 └──────────────┘    └───────────────────┘   │ scope-guardian      │  │ Phase 7.5 manifest  │
                                             │ ↓                   │  │ audit + 7.6 recovery│
                                             │ plan.md             │  │ ↓                   │
                                             └─────────────────────┘  │ progress.json + done│
                                                                      └─────────────────────┘

All artifacts live under .claude/projects/{YYYY-MM-DD}-{slug}/.

An opt-in upstream architect plugin (not bundled) can insert a Claude-Code-specific architecture-matching step between research and plan — /trekplan auto-discovers its architecture/overview.md output as priors when present.

Example workflows

Standard pipeline (manual control):

/trekbrief Add session caching with Redis
# → .claude/projects/2026-04-18-redis-session-caching/brief.md
# Interview identifies 2 research topics.

/trekresearch --project .claude/projects/2026-04-18-redis-session-caching --external "What are Redis session-caching best practices?"
/trekresearch --project .claude/projects/2026-04-18-redis-session-caching --local "How is caching currently handled in the codebase?"
# → .claude/projects/2026-04-18-redis-session-caching/research/01-*.md, 02-*.md

/trekplan --project .claude/projects/2026-04-18-redis-session-caching
# → .claude/projects/2026-04-18-redis-session-caching/plan.md

/trekexecute --project .claude/projects/2026-04-18-redis-session-caching
# → progress.json + code changes

Auto-mode (Claude manages the pipeline):

/trekbrief Add session caching with Redis
# Interview identifies topics. Choose "Auto (managed by Claude Code)" when asked.
# Claude runs research in parallel, then planning in foreground.
# Returns when plan.md is ready.

/trekexecute --project .claude/projects/2026-04-18-redis-session-caching

Standalone research (no planning):

/trekresearch What are the security implications of using JWT for session management?
# Read the brief, share with team, use for decision-making.

Quick plan for small tasks:

/trekbrief --quick Fix the login redirect bug
/trekplan --project .claude/projects/2026-04-18-login-redirect-fix --quick
/trekexecute --project .claude/projects/2026-04-18-login-redirect-fix

Dry run + validate before executing:

/trekexecute --project <dir> --validate   # schema check, no execution
/trekexecute --project <dir> --dry-run    # preview sessions and billing
/trekexecute --project <dir>              # execute

Review feedback loop (Handover 6):

/trekreview --project <dir>
# → review.md with severity-tagged findings + verdict (BLOCK / WARN / ALLOW)

# If verdict is BLOCK or WARN, feed findings back into a remediation plan:
/trekplan --brief <dir>/review.md
# → plan.md with source_findings: [<id>, ...] audit trail

/trekexecute --project <dir>              # execute the remediation plan

/trekreview --project <dir>               # re-review (overwrites review.md)

Upgrading

Migration notes for breaking changes (v1.x → v2.0, v2.x → v3.0) live in CHANGELOG.md and MIGRATION.md. v3.x non-breaking — minor bumps within v3 add features without changing pipeline contracts.

Quality infrastructure (since v3.1.0)

The plugin ships with node:test-based unit tests and a lib/ directory of pure-JS validators wired into the commands. Forking the plugin for internal use? Run npm test to confirm the parsers, validators, and doc-consistency invariants still hold:

cd plugins/trekplan
npm test    # runs all tests under tests/**/*.test.mjs

Validators (zero npm deps, hand-rolled YAML subset):

Module	Purpose
`lib/validators/brief-validator.mjs`	brief.md frontmatter + state machine (research_topics + status coherence) + body sections
`lib/validators/research-validator.mjs`	research-brief frontmatter (confidence ∈ [0,1], dimensions ≥ 1) + body sections; `--dir` mode validates a whole `research/` folder
`lib/validators/plan-validator.mjs`	wraps plan-schema + manifest-yaml; enforces v1.7 step heading, manifest count match, and forbidden-narrative-form denylist (`### Fase/Phase/Stage/Steg N`) — replaces the Phase 5.5 grep checks
`lib/validators/progress-validator.mjs`	progress.json shape (schema_version, status enum, current_step in range) + resume-readiness check
`lib/validators/architecture-discovery.mjs`	EXTERNAL CONTRACT — drift-WARN, never drift-FAIL. Discovers `architecture/overview.md` (owned by an opt-in upstream architect plugin, not bundled) and tolerates non-canonical filenames with warnings.

Each module exposes a CLI: node lib/validators/<name>.mjs --json <path> returns structured {valid, errors, warnings, parsed}. Commands invoke the CLI as their schema check.

A doc-consistency test (tests/lib/doc-consistency.test.mjs) pins prose-vs-source invariants — the agent table in CLAUDE.md must match the agents/*.md file count, every command's frontmatter name: must match its filename, and templates/plan-template.md must declare plan_version: 1.7.

Borrowed pattern from llm-security (commit 97c5c9d); extending the plugin should preserve the invariants the test pins.

Handover contracts

docs/HANDOVER-CONTRACTS.md is the single source of truth for the file formats that pass between the pipeline commands (seven handovers: brief → research, research → plan, architecture → plan, plan → execute, progress.json resume, review → plan, .session-state.local.json). When you fork the plugin or extend a stage, that document tells you what every producer must write and what every consumer is allowed to assume. It also documents the external contract for architecture/overview.md (owned by an opt-in upstream architect plugin, not bundled) — discovery only, drift-warn never drift-fail.

PreCompact resume integrity (CC v2.1.105+)

The pre-compact-flush.mjs hook directly fixes the documented P0 in docs/trekexecute-v2-observations-from-config-audit-v4.md: in skill-driven execution, progress.json could fall behind git reality before context compaction, breaking /trekexecute --resume after long conversations. The hook fires on every PreCompact event, locates any progress.json under .claude/projects/, compares stored current_step against git log --oneline {session_start_sha}..HEAD, and atomically writes a fresh checkpoint (tmp + rename, monotonic only) when git is ahead. Never blocks compaction.

Known limitations

Infrastructure-as-code (IaC) gets reduced value. The exploration agents are designed for application code. Terraform, Helm, Pulumi, CDK projects will get a plan, but agents like architecture-mapper and test-strategist produce less useful output for IaC. Use trekplan for the structural plan, then supplement IaC-specific steps manually.

Annotation HTML requires a desktop browser. scripts/annotate.mjs produces a single self-contained .html file you open with file:// in any modern browser (Chrome / Safari / Firefox / Edge — last two versions). No CDN, no server, no npm runtime deps. State persists in localStorage so closing and re-opening the tab keeps your work, but it's local to one browser on one machine — not synced anywhere. If you want to annotate without a browser, paste the .md into Claude with "comments inline below" and write notes in chat — same end result, just without the visual surface.

Installation

Add the marketplace and browse plugins with /plugin:

claude plugin marketplace add https://git.fromaitochitta.com/open/ktg-plugin-marketplace.git

Or enable directly in ~/.claude/settings.json:

{
  "enabledPlugins": {
    "voyage@ktg-plugin-marketplace": true
  }
}

An optional architect step between research and plan was previously available via a separate plugin; that architect plugin is no longer publicly distributed. The architecture/overview.md filesystem slot remains supported by /trekplan for any compatible producer.

Profile system (v4.1.0)

Three built-in model profiles plus operator-defined <custom>.yaml (drop in lib/profiles/). Each profile pins phase_models for the six pipeline phases. The active profile is recorded in plan.md frontmatter as profile: <name> and emitted to JSONL stats for cost-attribution.

Profile	Brief	Research	Plan	Execute	Review	Continue	Use case
`economy`	sonnet	sonnet	sonnet	sonnet	sonnet	sonnet	Lowest cost; high-confidence small-scope tasks
`balanced` (default)	sonnet	sonnet	opus	sonnet	opus	sonnet	Default — opus where reasoning depth pays off
`premium`	opus	sonnet	opus	sonnet	opus	sonnet	Critical-path planning + review when budget allows

Lookup order:

Explicit --profile <name> flag passed to the command
Plan-file frontmatter profile: (when resuming via /trekexecute --resume or /trekcontinue)
VOYAGE_PROFILE environment variable
Default balanced

See docs/profiles.md for the decision tree, custom-profile authoring, and cost estimation disclaimer (the per-profile cost numbers are anslag, not contractual SLAs).

Observability (v4.1.0)

Stop-hook OTel/Prometheus export — opt-in via VOYAGE_EXPORT_MODE:

Mode	Output	Endpoint env-var
`off` (default)	(no export)	—
`textfile`	`voyage.prom` (Prometheus exposition format)	`VOYAGE_TEXTFILE_DIR`
`otlp`	OTLP/JSON POST	`VOYAGE_OTEL_ENDPOINT`

Default JSONL stats stream (${CLAUDE_PLUGIN_DATA}/trek*-stats.jsonl) is unchanged — OTel is additive. Local Docker Compose stack: examples/observability/. Operator docs: docs/observability.md. Both pin minimum versions per CVE history.

Cost profile

Opus runs the orchestrators (one per command) and the executor (one per plan session). Sonnet runs the exploration and review swarms (5–10 agents per command, with effort/turn limits). The pipeline front-loads cheap Sonnet work so Opus only does synthesis and execution. Typical total: comparable to a long single Claude Code session — the per-command cost is published in ${CLAUDE_PLUGIN_DATA}/trek*-stats.jsonl if you want exact numbers.

For per-profile cost estimates, see docs/profiles.md.

Requirements

Claude Code (CLI, desktop app, or web app)
Claude subscription with Opus access (Max plan recommended)
Optional: Tavily MCP server for enhanced external research
Optional: Gemini MCP server for independent second opinion via Gemini Deep Research

Architecture

Top-level layout:

trekplan/
├── agents/        23 specialized agents (sonnet for exploration + review, opus for orchestration)
├── commands/      6 slash commands (trekbrief, trekresearch, trekplan, trekexecute, trekreview, trekcontinue) + trekplan-end-session helper
├── templates/     Frontmatter templates for brief, research, plan, session, launch
├── hooks/         5 hooks (pre-bash, pre-write, session-title, post-bash-stats, pre-compact-flush)
├── lib/           Zero-dep parsers and validators (CLI shims under lib/validators/)
├── tests/         109 node:test cases — `npm test` is the fork-readiness gate
├── docs/          HANDOVER-CONTRACTS.md + architect-bridge-test.md
└── examples/      01-add-verbose-flag/ — calibrated end-to-end pipeline demo

Pure markdown commands and agents. Hooks and validators are self-contained Node.js with zero npm dependencies. See CONTRIBUTING.md for the full file inventory.

Extending the plugin

Common modifications fork-ers make. None require touching lib/ — all of these are surface-level changes to commands, agents, or settings.

Add a new exploration agent

Exploration agents run in parallel during /trekplan Phase 5. They read the codebase and contribute structured findings to plan synthesis.

Copy agents/architecture-mapper.md as a template:

cp agents/architecture-mapper.md agents/my-new-agent.md

Update the frontmatter name, description, tools, and model. Use sonnet unless the agent needs deep reasoning (most don't).
Add the agent to the swarm in agents/planning-orchestrator.md Phase 5 — register it under the codebase-size bucket where it should fire (always / medium+ / large only).
Update the agent table in CLAUDE.md and README.md to keep the doc-consistency test green:
```
npm test -- tests/lib/doc-consistency.test.mjs
```

Switch the planning model

The default for /trekbrief, /trekresearch, /trekplan, and /trekexecute is opus (deep reasoning). To run on Sonnet for cost or latency, search-and-replace the frontmatter in three files:

sed -i.bak 's/^model: opus$/model: sonnet/' \
  commands/trekbrief.md \
  commands/trekresearch.md \
  commands/trekplan.md \
  commands/trekexecute.md

The exploration agents stay on Sonnet — only the orchestrator is bumped.

Disable external research

/trekresearch --local skips Tavily, Microsoft Learn, and the Gemini bridge. To make --local the default, edit the front of commands/trekresearch.md Phase 1 and flip the default branch of the --local argument check. Or just always pass --local and document it in your team's CLAUDE.md.

Plugin data contract

The four commands write to a single project directory (.claude/projects/{date}-{slug}/). The full schema for every artifact is in docs/HANDOVER-CONTRACTS.md. That document is the single source of truth for:

File paths each command reads/writes
Frontmatter schema for brief.md, research/*.md, plan.md
progress.json schema
Validator → handover mapping
Versioning and breaking-change protocol

If you fork the plugin and change the schema for any artifact, update that doc and the corresponding lib/validators/*.mjs and run npm test — the validators and doc-consistency tests will catch schema drift.

Disable the architect bridge

/trekplan auto-discovers architecture/overview.md if an opt-in upstream architect plugin (not bundled) produced one. To suppress this, leave the architecture/ directory absent from your project directory. Discovery is additive — missing file is fine, no error.

Contributing

See CONTRIBUTING.md.

License

MIT

54 KiB Raw Permalink Blame History Unescape Escape

trekplan — Brief, Research, Plan, Execute, Review, Continue

Division of labor

Two kinds of briefs

Manifest-verified steps

Autonomy mode (--gates, v3.4.0)

Quick start

When to use it

What you get

/trekbrief — Brief

Modes

Force-stop

What the brief contains

/trekresearch — Research

Modes

/trekplan — Planning

Modes

What the plan contains

/trekexecute — Execution

Modes

Session-aware parallel execution (worktree-isolated)

Billing safety

Failure recovery

Security hardening

Recommended: disable Skill shell execution (CC v2.1.91+)

Headless execution

Headless multi-session tuning (CC v2.1.89+)

Session titles for voyage commands (CC v2.1.94+)

Per-step timing (CC v2.1.97+)

/trekreview — Review

Modes

What review produces

Triage gate (deterministic)

Feedback loop (Handover 6)

/trekcontinue — Resume

Modes

Schema v1 — .session-state.local.json

/trekendsession helper

Typical flow

Reviewing and annotating artifacts (v5.0.3)

The full pipeline

Example workflows

Upgrading

Quality infrastructure (since v3.1.0)

Handover contracts

PreCompact resume integrity (CC v2.1.105+)

Known limitations

Installation

Profile system (v4.1.0)

Observability (v4.1.0)

Cost profile

Requirements

Architecture

Extending the plugin

Add a new exploration agent

Switch the planning model

Disable external research

Plugin data contract

Disable the architect bridge

Contributing

License

54 KiB

Raw Permalink Blame History

Autonomy mode (`--gates`, v3.4.0)

`/trekbrief` — Brief

`/trekresearch` — Research

`/trekplan` — Planning

`/trekexecute` — Execution

`/trekreview` — Review

`/trekcontinue` — Resume

Schema v1 — `.session-state.local.json`

`/trekendsession` helper