54 KiB
trekplan — Brief, Research, Plan, Execute, Review, Continue
Solo-maintained, fork-and-own. This plugin is a starting point, not a vendor product. Issues are welcome as signals; pull requests are not accepted. See GOVERNANCE.md for the full model and what upstream provides.
AI-generated: all code produced by Claude Code through dialog-driven development. Full disclosure →
A Claude Code plugin for deep implementation planning, multi-source research, autonomous execution, independent post-hoc review, and zero-friction multi-session resumption. Six commands, one pipeline:
What's new in v5.1 —
/trekbriefPhase 3.5 commits per-phasephase_signals(effort + optional model forresearch/plan/execute/review) tobrief.mdfrontmatter.brief_version: 2.1activates a validator-side sequencing gate (BRIEF_V51_MISSING_SIGNALS) so downstream commands halt with a friendly hint when signals are missing. Composition rule per downstream command: brief signal wins per-phase, profile fills gaps.effort == lowactivates the existing--quick-equivalent code-path in each command (/trekexecutelow-effort =--gates open+ sequential). Additive — no breaking changes; pre-2.1 briefs still validate.
| Command | What it does |
|---|---|
/trekbrief |
Brief — interactive interview produces a task brief with explicit research plan |
/trekresearch |
Research — deep local + external research with triangulation |
/trekplan |
Plan — agent swarm exploration, Opus planning, adversarial review |
/trekexecute |
Execute — disciplined step-by-step implementation with failure recovery |
/trekreview |
Review — independent post-hoc review of delivered code against the brief, severity-tagged findings |
/trekcontinue |
Continue — read .session-state.local.json and resume the next session in a multi-session project |
/trekbrief, /trekplan, and /trekreview each end by running scripts/annotate.mjs against the just-written artifact and printing the resulting file://<abs path> link. The operator opens the HTML in a browser, clicks any line of the document, writes their own note in the inline textarea, watches a sidebar of all notes (editable, deletable, persisted in browser localStorage), and clicks "Copy Prompt" to get one structured prompt that they paste back into Claude — Claude then revises the .md from the notes. The operator drives every annotation. See Reviewing and annotating artifacts.
Every artifact lives in one project directory: .claude/projects/{YYYY-MM-DD}-{slug}/ contains brief.md, research/NN-*.md, plan.md, sessions/, progress.json, and review.md.
Division of labor
| Command | Responsibility | Output |
|---|---|---|
/trekbrief |
Capture intent — intent, goal, non-goals, success criteria, and a research plan with explicit topics. Interactive only. | brief.md (task brief) |
/trekresearch |
Gather context — code state, external docs, community, risk. Makes NO build decisions. | research/NN-slug.md (research brief) |
/trekplan |
Transform intent into an executable contract — per-step YAML manifest, regex-validated checkpoints, verifiable paths. Plan-critic is a hard gate. Auto-discovers architecture/overview.md as priors when an opt-in upstream architect plugin (not bundled) is installed. |
plan.md with Manifest blocks + plan_version: 1.7 |
/trekexecute |
Execute the contract disciplined — fresh verification, independent manifest audit, honest reporting. Does NOT compensate for weak plans — escalates. | progress.json + structured report + manifest-audit status |
/trekreview |
Close the loop — independent post-hoc reviewer reads brief.md and the diff produced by execute, runs brief-conformance + code-correctness reviewers in parallel, dedups via Judge Agent. Severity-tagged findings (Critical/High/Medium/Low/Info) feed back into planning via Handover 6. |
review.md (type: trekreview) with stable 40-char hex finding-IDs |
Principle: Each step consumes the previous step's structured artifact. If execute has to guess, the plan is weak and must be revised upstream — not patched downstream.
Two kinds of briefs
Terminology matters:
- Task brief — produced by
/trekbrief. Captures what we want and why. Drives planning. - Research brief — produced by
/trekresearch. Captures what we learned about a topic. Feeds planning.
A project typically has one task brief and zero-to-N research briefs.
Manifest-verified steps
Every step in the plan ends with a YAML manifest: block declaring expected_paths, commit_message_pattern, bash_syntax_check, forbidden_paths, must_contain. The executor checks the manifest against the resulting commit — a step may not be marked passed if its manifest does not verify, regardless of the Verify command's exit code (Hard Rule 17).
After all steps complete, /trekexecute runs Phase 7.5 — Manifest audit (independent): re-verifies every expected path from git log + filesystem, ignoring the agent's own bookkeeping. Drift → status partial, Phase 7.6 auto-dispatches a bounded recovery session with only the missing steps (recovery_depth ≤ 2). Step 0 pre-flight (git push --dry-run) runs inside every session sandbox before any real work — exit 77 sentinel catches sandbox push-denial before the agent wastes the whole budget.
No cloud dependency. No GitHub requirement. Works on Mac, Linux, and Windows.
Autonomy mode (--gates, v3.4.0)
All four pipeline commands accept --gates {open|closed|adaptive} to control how many autonomy checkpoints surface to the operator on the path from brief approval to main-merge.
| Value | Behavior |
|---|---|
open |
Skip optional checkpoints. The pipeline runs end-to-end with the fewest interruptions. Suitable for trusted briefs in clean repos. |
closed |
Stop at every checkpoint. The operator confirms each transition. Suitable for high-stakes work or unfamiliar repos. |
adaptive (default) |
Stop only when the autonomy-gate state machine reports a meaningful boundary (manifest-audit FAIL, plan-critic BLOCKER, main-merge gate). Best balance of velocity and safety. |
Under the hood, lib/util/autonomy-gate.mjs runs a small state machine (idle → approved → executing → merge-pending → main-merged) and lib/stats/event-emit.mjs records each transition to ${CLAUDE_PLUGIN_DATA}/trek*-stats.jsonl. The new hooks/scripts/post-compact-flush.mjs PostCompact hook re-injects .session-state.local.json after context compaction so multi-session work survives a compaction boundary.
Quick start
# Install the marketplace, then browse and enable plugins with /plugin
claude plugin marketplace add https://git.fromaitochitta.com/open/ktg-plugin-marketplace.git
# Capture intent (interactive)
/trekbrief Add user authentication with JWT tokens
# → .claude/projects/2026-04-18-jwt-auth/brief.md
# Research each topic identified in the brief (manual default)
/trekresearch --project .claude/projects/2026-04-18-jwt-auth --external "What are current JWT best practices?"
# Plan from brief + research
/trekplan --project .claude/projects/2026-04-18-jwt-auth
# Execute
/trekexecute --project .claude/projects/2026-04-18-jwt-auth
# Review (independent post-hoc verification of the diff against brief.md)
/trekreview --project .claude/projects/2026-04-18-jwt-auth
# → .claude/projects/2026-04-18-jwt-auth/review.md
Or opt into auto-mode in /trekbrief — it will run research and planning sequentially inline in the main context, and return when plan.md is ready.
If review finds issues, feed review.md back into planning to produce a remediation plan: /trekplan --brief .claude/projects/2026-04-18-jwt-auth/review.md. The remediation plan carries source_findings: [<id>, ...] in its frontmatter — full audit trail back to the consumed findings (Handover 6).
An optional architect step can sit between research and plan — /trekplan auto-discovers an architecture/overview.md produced by an opt-in upstream architect plugin (not bundled here; the architect plugin is no longer publicly distributed, but the architecture/overview.md filesystem slot remains available for any compatible producer).
When to use it
Use it when:
- The task touches 3+ files or modules and you need to understand how they connect
- You're working in an unfamiliar codebase and need a map before you start
- The implementation has non-obvious dependencies, ordering constraints, or risks
- You want a reviewable plan before committing to an approach
- You need autonomous headless execution without human intervention
- You need to research a technology, library, or approach before deciding
Don't use it when:
- The task is a single-file change where the fix is obvious
- You already know exactly what to change and in what order
Rule of thumb: If you can describe the full implementation in one sentence and it touches 1-2 files, skip trekplan and just implement. If you need to think about it, trekplan earns its cost.
What you get
Concrete capabilities, observable in the code — not aspirations.
Across all profiles:
- Strategy-to-execution on four explicit handover points. Each transition is a filesystem contract (
docs/HANDOVER-CONTRACTS.md), not a conversation. You can stop after any stage and resume later without context loss. - Resume safety after long sessions. The PreCompact hook reconciles
progress.jsonwith git history before context compaction (CC v2.1.105+) — closes a documented--resumefailure mode. - Schema discipline.
plan-validator --strictenforces### Step N:form and rejects narrative drift (### Fase,### Phase,### Stage,### Steg) before execution. - Audit trail by construction. Every executed step records
commit_sha,verify_passed,files_changedinprogress.json.
Solo developer. Plans survive across sessions; adversarial review (plan-critic + scope-guardian) catches your own tunnel vision before code is written; brief-phase forces clarity on what the task actually is. examples/01-add-verbose-flag/ shows what good shape looks like.
Team (2–10). Plan files are handover-ready — a colleague can pick up a project directory without re-asking "what did you mean here?". --decompose splits a plan into self-contained headless sessions with scoped --allowedTools. The plan-critic semantic rubric gives the team a shared definition of "this plan defers decisions to the executor".
Virksomhet / regulated environment. Defense-in-depth security across four layers (plugin hooks, prompt-level denylist, pre-execution plan scan, scoped tool access). disableSkillShellExecution: true recommendation for fork-ers handling untrusted briefs. No cloud dependency, no GitHub requirement. Validators are plain-Node CLIs — invocable from CI, custom hooks, or external tools, not just from voyage commands.
What it doesn't solve:
- LLM output truthfulness. Validators check shape, not facts. A plan with hallucinated paths passes schema but fails in execute. Plan-critic catches some, not all.
- Multi-user concurrency on a single project directory. Two simultaneous executors will clobber
progress.json. - Cost management. Opus on the orchestrator layer is expensive; documented in Cost profile, no automatic model downgrade.
- Linear/Jira/Slack integrations. Intentional omission — solo project, no enterprise wiring.
One-line summary: an executable contract pipeline where each stage is filesystem-validated, session-survivable, and skill-independent — in exchange for writing an actual brief before planning and an actual plan before coding.
/trekbrief — Brief
Interactive requirements-gathering command. Runs a dynamic, quality-gated interview and produces a task brief with an explicit research plan. Optionally orchestrates the rest of the pipeline.
A section-driven interview loop fills required brief sections (Intent / Goal / Success Criteria / Research Plan) until each shows initial signal, then brief-reviewer scores the draft on five dimensions (completeness, consistency, testability, scope clarity, research-plan validity) and gates publication. Max 3 review iterations; force-stop yields a brief_quality: partial brief with the failing dimensions documented.
Output: .claude/projects/{YYYY-MM-DD}-{slug}/brief.md
Modes
| Mode | Usage | Behavior |
|---|---|---|
| Default | /trekbrief <task> |
Dynamic interview until quality gates pass. No question cap. |
| Quick | /trekbrief --quick <task> |
Starts compact (optional sections get at most one probe), still escalates on weak required sections or failed review gate. |
| Profile | /trekbrief --profile <name> <task> |
(v4.1.0) Pin model profile for the brief phase: economy / balanced / premium / <custom>. See Profile system below. |
/trekbrief is always interactive. There is no foreground/background mode — the interview requires user input.
Force-stop
If you say "stop" or "enough" during Phase 4, the current review findings are surfaced with per-dimension scores and you choose:
- Answer one more follow-up — the loop continues.
- Stop now (accept partial brief) — the brief is finalized with
brief_quality: partialand a## Brief Qualitysection listing the failing dimensions. Downstream planning will treat these as reduced-confidence areas.
What the brief contains
- Intent — why this matters, motivation, user need (load-bearing)
- Goal — concrete end state in 1-3 sentences
- Non-Goals — explicitly out of scope
- Constraints / Preferences / NFRs — technical, time, resource limits
- Success Criteria — 2-4 falsifiable commands/observations
- Research Plan — N topics, each with research question, scope (local/external/both), confidence needed, cost estimate, and a ready-to-run
/trekresearchcommand - Open Questions / Assumptions — from "I don't know" answers and implicit gaps
- Prior Attempts — what worked/failed before
/trekresearch — Research
Deep, multi-phase research that combines local codebase analysis with external knowledge. Uses specialized agent swarms to investigate multiple dimensions in parallel, then triangulates findings.
A parallel swarm of up to 5 local + 4 external Sonnet agents investigates 3–8 research dimensions, with optional Gemini Deep Research as an independent second opinion. Findings are triangulated (local vs. external, confidence per dimension, contradictions flagged) and synthesized into a structured research brief.
Output:
- With
--project <dir>:{dir}/research/{NN}-{slug}.md(auto-incremented index) - Without:
.claude/research/trekresearch-{date}-{slug}.md
Modes
| Mode | Usage | Behavior |
|---|---|---|
| Default | /trekresearch <question> |
Interview + research swarm (local + external + Gemini), foreground |
| Project | /trekresearch --project <dir> <question> |
Write brief into {dir}/research/NN-slug.md |
| Quick | /trekresearch --quick <question> |
Interview (short) + inline research, no agent swarm |
| Local | /trekresearch --local <question> |
Only codebase analysis agents (skip external + Gemini) |
| External | /trekresearch --external <question> |
Only external research agents (skip codebase analysis) |
| Foreground | /trekresearch --fg <question> |
No-op alias (foreground is default since v2.4.0) |
| Profile | /trekresearch --profile <name> <question> |
(v4.1.0) Pin model profile for the research phase. See Profile system. |
Flags combine: --project <dir> --external.
Research uses up to 5 local agents (architecture-mapper, dependency-tracer, task-finder, git-historian, convention-scanner) and 4 external agents (docs-researcher, community-researcher, security-researcher, contrarian-researcher) plus the optional Gemini bridge for an independent second opinion. Per-agent details in agents/.
/trekplan — Planning
Produces an implementation plan detailed enough for autonomous execution. v2.0 breaking change: requires --brief or --project. There is no longer an interview inside /trekplan — use /trekbrief first.
After brief-reviewer validates the input brief, 6–8 Sonnet exploration agents analyze the codebase in parallel and merge findings into a synthesis. Optional research briefs (--research, or auto-discovered in {project_dir}/research/) enrich the plan; architecture/overview.md priors are loaded if an opt-in upstream architect plugin (not bundled) produced one. Opus then writes the plan with per-step YAML manifests, which plan-critic (9 dimensions) and scope-guardian adversarially review before handoff.
Output:
- With
--project <dir>:{dir}/plan.md - With
--brief <path>:.claude/plans/trekplan-{date}-{slug}.md
Modes
| Mode | Usage | Behavior |
|---|---|---|
| Project | /trekplan --project <dir> |
Read {dir}/brief.md + auto-discover {dir}/research/*.md, write {dir}/plan.md |
| Brief | /trekplan --brief <path> |
Plan from a specific brief file |
| Research-enriched | /trekplan --project <dir> --research <brief> |
Add extra research briefs beyond what is in research/ |
| Foreground | /trekplan --project <dir> --fg |
No-op alias (foreground is default since v2.4.0) |
| Quick | /trekplan --project <dir> --quick |
No agent swarm, lightweight scan only |
| Decompose | /trekplan --decompose plan.md |
Split plan into headless session specs |
| Export | /trekplan --export pr plan.md |
PR description, issue comment, or clean markdown |
| Profile | /trekplan --profile <name> --project <dir> |
(v4.1.0) Pin model profile; emitted as profile: in plan.md frontmatter. See Profile system. |
--brief or --project is required. /trekplan with no brief exits with an error and a pointer to /trekbrief.
What the plan contains
Every plan includes:
- Context — derived from brief
## Intent+## Goal - Architecture Diagram — Mermaid C4-style component diagram
- Codebase Analysis — tech stack, patterns, relevant files, reusable code
- Research Sources — findings from research briefs (when present)
- Implementation Plan — ordered steps with file paths, changes, failure recovery, and git checkpoints
- Per-step Manifest — YAML block with
expected_paths,commit_message_pattern,bash_syntax_check,forbidden_paths,must_contain - Alternatives Considered — other approaches with pros/cons
- Test Strategy — from test-strategist findings
- Risks and Mitigations — from risk-assessor findings
- Verification — testable end-to-end criteria
- Execution Strategy — session grouping and parallel waves (plans with > 5 steps)
- Plan Quality Score — quantitative grade (A-D) across 6 weighted dimensions
Every implementation step includes:
- On failure: — what to do when verification fails (revert / retry / skip / escalate)
- Checkpoint: — git commit after success
- Manifest: — the objective completion predicate (Hard Rule 17)
Exploration uses 6–8 Sonnet agents in parallel (architecture-mapper, dependency-tracer, task-finder, test-strategist, git-historian, risk-assessor, plus convention-scanner on medium+ codebases and research-scout when unfamiliar tech is detected). Adversarial review then runs brief-reviewer, plan-critic (9 dimensions, no-placeholder enforcement, manifest audit), and scope-guardian (creep + gap detection). Per-agent details in agents/.
/trekexecute — Execution
Reads a plan from /trekplan and implements it with strict discipline. No guessing, no improvising — follows the plan exactly.
Per step: apply Changes exactly as written → run Verify (exit code is truth) → manifest audit (expected paths, forbidden paths, commit pattern) → follow the plan's failure clause if anything fails (revert / retry / skip / escalate) → Checkpoint commit. After all steps: independent Phase 7.5 manifest audit from git log + filesystem (ignoring agent bookkeeping); drift triggers Phase 7.6 bounded recovery.
Modes
| Mode | Usage | Behavior |
|---|---|---|
| Project | /trekexecute --project <dir> |
Read {dir}/plan.md, write {dir}/progress.json |
| Plan path | /trekexecute plan.md |
Execute a specific plan file |
| Resume | /trekexecute --project <dir> --resume |
Resume from last progress checkpoint |
| Dry run | /trekexecute --project <dir> --dry-run |
Validate plan structure + preview sessions and billing |
| Validate | /trekexecute --project <dir> --validate |
Schema-only check — parse steps + manifests, report READY | FAIL, no execution |
| Single step | /trekexecute --project <dir> --step 3 |
Execute only step 3 |
| Foreground | /trekexecute --project <dir> --fg |
Force sequential, ignore Execution Strategy |
| Single session | /trekexecute --project <dir> --session 2 |
Execute only session 2 from Execution Strategy |
| Profile | /trekexecute --profile <name> --project <dir> |
(v4.1.0) Pin model profile for execution. Plan-frontmatter profile: is honored when this flag is omitted. See Profile system. |
Session-aware parallel execution (worktree-isolated)
When a plan has an ## Execution Strategy section (auto-generated by /trekplan for plans with > 5 steps), /trekexecute automatically:
- Pre-flight checks — validates clean working tree, plan file tracked in git, no scope fence overlaps between parallel sessions, no stale worktrees
- Creates git worktrees — each parallel session gets its own isolated worktree and branch (
trek/{slug}/session-{N}) - Launches
claude -pper session per wave, each in its own worktree - Merges branches back sequentially with
--no-ffafter each wave completes - Cleans up worktrees and branches unconditionally (even on failure)
- Runs master verification on the merged result
Wave 1: Session 1 (worktree-1) + Session 2 (worktree-2) -- parallel
↓ both complete → sequential merge to main
Wave 2: Session 3 (worktree-3) -- sequential
↓ complete → merge to main
Cleanup worktrees + Master verification
Each session operates in complete filesystem isolation — no shared git index, no race conditions, no data loss. If a merge produces conflicts, the merge is aborted and conflicting files are reported.
Use --fg to force sequential execution even when a plan has an Execution Strategy.
Billing safety
Before launching parallel claude -p sessions, /trekexecute checks whether ANTHROPIC_API_KEY is set in your environment. If it is, parallel sessions will bill your API account (pay-per-token), not your Claude subscription (Max/Pro). This can be expensive — parallel Opus sessions can cost $50-100+ per run.
When an API key is detected, you are asked how to proceed:
- Use --fg instead (recommended) — run sequentially in the current session using your subscription
- Continue with API billing — launch parallel sessions on your API account
- Stop — cancel and unset the API key first
If no API key is set, parallel sessions use your subscription and proceed without asking.
Failure recovery
- 3-attempt retry cap — retries twice, then stops (never loops forever)
- On failure: revert — undo changes, stop
- On failure: retry — try alternative approach, then revert if still failing
- On failure: skip — non-critical step, continue
- On failure: escalate — stop everything, needs human judgment
Security hardening
The executor implements defense-in-depth security across four layers:
- Plugin hooks —
pre-bash-executor.mjsblocks 13 categories of destructive commands (rm -rf /, chmod 777, pipe-to-shell, eval injection, disk wipe, shutdown, fork bombs, cron persistence, process killing, history destruction) with bash evasion normalization.pre-write-executor.mjsblocks writes to.git/hooks/,.claude/settings.json, shell configs,.ssh/,.aws/, and.envfiles - Prompt-level denylist — Security rules embedded in the executor command and session spec template that work even in headless
claude -psessions where hooks don't run - Pre-execution plan scan — Phase 2.4 scans all
Verify:andCheckpoint:commands against the denylist before execution begins, catching dangerous commands before they reach the executor - Scoped tool access — Headless child sessions use
--allowedTools "Read,Write,Edit,Bash,Glob,Grep"instead of--dangerously-skip-permissions, blocking Agent spawning, MCP tools, and web access in parallel sessions
Recommended: disable Skill shell execution (CC v2.1.91+)
For fork-ers handling untrusted task briefs or plans from external
sources, set disableSkillShellExecution: true in ~/.claude/settings.json
or in the project's .claude/settings.json:
{
"disableSkillShellExecution": true
}
This prevents Skills from invoking arbitrary shell, which closes a
prompt-injection vector that the plugin's own hooks cannot fully mitigate
(Skills can fire before pre-bash-executor matches). See
SECURITY.md for the full hardening list.
Headless execution
/trekexecute is designed for claude -p headless sessions:
- No questions asked — all recovery decisions come from the plan
- Progress file — crash recovery via
{project_dir}/progress.json(or.trekexecute-progress-{slug}.jsonfor legacy plans) - Scope fence enforcement — never touches files outside the session's scope
- JSON summary — machine-parseable
trekexecute_summaryblock for log parsing
Headless multi-session tuning (CC v2.1.89+)
When running multiple parallel claude -p sessions (decomposed plans
or wave-based execution), set MCP_CONNECTION_NONBLOCKING=true in the
launching environment so MCP server connection latency does not
serialize startup across waves:
export MCP_CONNECTION_NONBLOCKING=true
bash .claude/projects/{slug}/sessions/launch.sh
Without this, each child session can spend 1-3 s blocking on MCP connect, multiplying across waves. Setting it lets MCP connect lazily on first tool call.
Session titles for voyage commands (CC v2.1.94+)
A UserPromptSubmit hook (hooks/scripts/session-title.mjs) sets the
session title to voyage:<command>:<slug> whenever you invoke one of
the four voyage commands. This makes multi-session headless runs and
session-picker output trivially identifiable. Slug derivation:
| Invocation | Session title |
|---|---|
/trekplan --project .claude/projects/2026-04-18-jwt-auth |
voyage:plan:jwt-auth |
/trekbrief --quick |
voyage:brief:ad-hoc |
/trekexecute --project .claude/projects/2026-05-10-cleanup --resume |
voyage:execute:cleanup |
The hook is fail-open — any error → title is left untouched.
Per-step timing (CC v2.1.97+)
A PostToolUse hook (hooks/scripts/post-bash-stats.mjs) appends
duration_ms from each Bash tool call to
${CLAUDE_PLUGIN_DATA}/trekexecute-stats.jsonl. One line per Bash
call; useful for identifying long-running verify or checkpoint commands
across executions.
/trekreview — Review
Independent post-hoc review of delivered code against the brief. Reads brief.md
from scratch and treats research/plan as supplementary context. The output
review.md is a new artifact type (type: trekreview) with its own validator
and a contracted Handover 6 (review → plan) so findings can be fed back into
/trekplan --brief review.md to produce a remediation plan — closing
the iteration loop without ad-hoc conventions.
Modes
| Mode | Command | Description |
|---|---|---|
| Default | /trekreview --project <dir> |
brief-conformance + code-correctness reviewers in parallel, coordinator dedup + verdict, write {dir}/review.md |
| Since ref | /trekreview --project <dir> --since <ref> |
Override "before" SHA for the diff range. Validated via git rev-parse --verify |
| Quick | /trekreview --project <dir> --quick |
Skip brief-conformance reviewer; skip coordinator's reasonableness filter — fast correctness-only pass |
| Validate | /trekreview --project <dir> --validate |
Schema-only check on existing review.md. No LLM calls |
| Dry run | /trekreview --project <dir> --dry-run |
Print discovered scope + triage map; skip writes |
| Profile | /trekreview --profile <name> --project <dir> |
(v4.1.0) Pin model profile for the review phase. See Profile system. |
What review produces
review.md carries a flat array of finding-IDs in frontmatter (40-char hex from lib/parsers/finding-id.mjs) plus the full finding objects in the body under per-severity headings:
## Findings (BLOCKER)— must-fix before merge## Findings (MAJOR)— should-fix## Findings (MINOR)— nice-to-fix## Findings (SUGGESTION)— opinion-only
Required body sections: ## Executive Summary, ## Coverage, ## Remediation Summary. The Coverage section enumerates which files were deep-reviewed, summary-only, or skipped (with reason) — explicit triage to avoid silent skips.
Triage gate (deterministic)
A path-pattern classifier produces {file → deep-review|summary-only|skip} before any LLM runs. Hardcoded skip patterns: *.lock, *.svg, dist/**, build/**, node_modules/**, generated markers. Deep-review patterns: auth/**, crypto/**, **/security/**. Hard refuse-with-suggestion above 100 files / 100K diff tokens.
Feedback loop (Handover 6)
/trekreview --project <dir>
# → review.md (BLOCKER + MAJOR findings)
/trekplan --brief <dir>/review.md
# → plan.md with `source_findings: [<id>, ...]` audit trail
# (BLOCKER + MAJOR findings become plan goals; MINOR + SUGGESTION skipped for v1.0)
The plan's optional source_findings: frontmatter list is the audit trail back to consumed findings. See docs/HANDOVER-CONTRACTS.md for the full Handover 6 contract.
/trekcontinue — Resume
Zero-friction multi-session resumption. Type /trekcontinue in a fresh
Claude Code session — the command reads the per-project state file
(.claude/projects/<project>/.session-state.local.json), prints a 3-line
summary, and immediately begins executing the next session.
The state file is the contract — any session-end mechanism may write it
(/trekexecute Phase 8 / Phase 2.55 / Phase 4 do so automatically;
the /trekendsession helper writes it for informal flows;
graceful-handoff may converge on it in a future release). /trekcontinue
only reads. See Handover 7 in docs/HANDOVER-CONTRACTS.md for the full
schema and producer/consumer contract.
Modes
| Mode | Command | Description |
|---|---|---|
| Default | /trekcontinue |
Auto-discover .session-state.local.json under cwd, validate, narrate, and begin executing the next session |
| Explicit | /trekcontinue <project-dir> |
Use the named project directory; helpful when several active projects coexist under cwd |
| Help | /trekcontinue --help |
Print usage block and the schema-v1 reference |
| Profile | /trekcontinue --profile <name> [<project-dir>] |
(v4.1.0) Pin model profile for the resumed session. Plan-frontmatter profile: from the previous session is honored when this flag is omitted. See Profile system. |
Schema v1 — .session-state.local.json
| Field | Type | Required | Notes |
|---|---|---|---|
schema_version |
number | yes | Must be 1 |
project |
string | yes | Project directory path |
next_session_brief_path |
string | yes | Validator soft-checks file existence (warning, not error) |
next_session_label |
string | yes | Human-readable label for the next session (e.g. "Session 2 of 5") |
status |
enum | yes | in_progress | partial | failed | stopped | completed (completed → no further sessions to resume) |
updated_at |
string | yes | ISO-8601 timestamp |
Forward-compat: unknown top-level keys are ignored (no errors, no warnings) — the same drift-WARN principle as Handover 3, so future producers (e.g. graceful-handoff v2.2) can extend the schema additively.
/trekendsession helper
For informal multi-session flows that don't run through /trekexecute
(ad-hoc release runs, manual handovers), use the helper to write the state
file at session-end:
/trekendsession .claude/projects/2026-05-01-feature/brief.md "Session 2 of 3"
# Writes .session-state.local.json with status=in_progress.
# Then in a fresh chat: /trekcontinue
Both arguments are required. No interactive prompt — headless-safe.
Typical flow
# Session 1 (long-running formal pipeline)
/trekplan --project .claude/projects/2026-05-01-feature
/trekexecute --project .claude/projects/2026-05-01-feature
# ... trekexecute Phase 8 writes .session-state.local.json on session-end ...
# (chat boundary — fresh Claude Code session)
/trekcontinue
# → reads state, prints 3-line summary, begins next session
Reviewing and annotating artifacts (v5.0.3)
/trekbrief, /trekplan, and /trekreview each end by running
scripts/annotate.mjs against the just-written .md and printing the
resulting file://<abs path> link. After they finish you see something
like:
Brief written: .claude/projects/2026-05-13-foo/brief.md
Annotation HTML: file:///abs/path/.claude/projects/2026-05-13-foo/brief.html
────────────────────────────────────────────────────────────────────
To review and annotate this brief, open the HTML above in a browser:
open file:///abs/path/.claude/projects/2026-05-13-foo/brief.html
Click any line to add YOUR OWN note. The sidebar collects every note,
the "Copy Prompt" button gathers them into one structured prompt.
Paste that prompt back into this chat and Claude revises brief.md
from your notes. Annotations persist in your browser if you close
the tab and reopen the same file.
────────────────────────────────────────────────────────────────────
You run open (or click the file:// link in your terminal), the HTML
opens in your default browser. The annotation UX is modelled on
claude-code-100x/build-site.js:
- Topbar: pencil-toggle button — annotation mode default ON. Click to turn off (then you read the article normally, follow links, etc.). A second button opens the sidebar panel.
- Article body: the artifact rendered as a proper article — headings, paragraphs, lists, tables, code blocks, blockquotes. Hover any element in mode and it highlights. To anchor on a specific phrase, select the text first, then click. Otherwise the whole element becomes the anchor.
- Form popover appears at the cursor with:
- Section (auto-detected from the nearest h1/h2 above).
- Anchored to — the exact text you selected, or the element's first ~200 chars if you didn't select.
- Three intent buttons: Fiks (something is wrong — fix it), Endre (change the wording / content), Spørsmål (an open question — clarify or answer). Colored: red / orange / blue.
- Comment textarea (optional but helpful).
- Cancel / Save. Save stays disabled until you pick an intent.
Shortcut:
⌘Enterto save,Escto cancel.
- Annotated elements get an amber highlight + a number badge in the margin showing how many annotations target that element.
- Sidebar panel (Show annotations) — every annotation grouped by section, in document order. Each card shows the intent badge (colored), the anchored snippet (mono-quote), the comment text, and a delete button. Click a card to scroll the article to that element and flash it.
- Copy Prompt at the foot of the panel — assembles every annotation into one structured markdown prompt and copies it to your clipboard.
- Clear all wipes every annotation (after confirm).
- Persistence: every annotation is saved to browser
localStoragekeyed on the artifact's absolute path (voyage-annotate:v2:<abs path>). Refresh the tab or close the browser and re-open — your work is there.
You select / click, pick intent, write comment, repeat. When you're
done, Copy Prompt, paste back into this chat. Claude revises the .md
freehand from your notes. The operator drives every annotation.
Claude never pre-generates suggestions in this flow.
What v5.0.3 changed from v5.0.2. v5.0.2 was operator-led but the UX was too thin — click a line, type a freeform note, save. The reference the operator pointed at (
~/repos/claude-code-100x/claude-code-100x/build-site.js) already had the right pattern: pencil-toggle, selection capture, three intent categories, popover form, structured markdown export. v5.0.3 rebuildsscripts/annotate.mjsagainst that reference. v5.0.0 / v5.0.1 / v5.0.2 are all superseded; only the v5.0.0 removals (bespoke playground SPA,/trekrevise, Handover 8, supportinglib/modules, Playwright e2e + devDeps) stay. See CHANGELOG.md § v5.0.0 → § v5.0.3.
The full pipeline
/trekbrief /trekresearch /trekplan /trekexecute
┌──────────────┐ ┌───────────────────┐ ┌─────────────────────┐ ┌─────────────────────┐
│ Interview │ │ 5 local agents │ │ brief-reviewer │ │ Parse plan │
│ ↓ │ │ 4 external agents │ │ ↓ │ │ ↓ │
│ Intent/Goal │ │ + Gemini bridge │ │ 6-8 exploration │ │ Detect sessions │
│ ↓ │ │ ↓ │ │ agents (parallel) │ │ ↓ │
│ Research │ │ Triangulation │ │ ↓ │ │ Execute steps │
│ topics │ │ ↓ │ │ Opus planning │ │ (verify + manifest │
│ ↓ │ → brief → → → → → → → → → → → ↓ │→ │ + checkpoint) │
│ brief.md │ │ research/*.md │ │ plan-critic + │ │ ↓ │
└──────────────┘ └───────────────────┘ │ scope-guardian │ │ Phase 7.5 manifest │
│ ↓ │ │ audit + 7.6 recovery│
│ plan.md │ │ ↓ │
└─────────────────────┘ │ progress.json + done│
└─────────────────────┘
All artifacts live under .claude/projects/{YYYY-MM-DD}-{slug}/.
An opt-in upstream architect plugin (not bundled) can insert a Claude-Code-specific architecture-matching step between research and plan — /trekplan auto-discovers its architecture/overview.md output as priors when present.
Example workflows
Standard pipeline (manual control):
/trekbrief Add session caching with Redis
# → .claude/projects/2026-04-18-redis-session-caching/brief.md
# Interview identifies 2 research topics.
/trekresearch --project .claude/projects/2026-04-18-redis-session-caching --external "What are Redis session-caching best practices?"
/trekresearch --project .claude/projects/2026-04-18-redis-session-caching --local "How is caching currently handled in the codebase?"
# → .claude/projects/2026-04-18-redis-session-caching/research/01-*.md, 02-*.md
/trekplan --project .claude/projects/2026-04-18-redis-session-caching
# → .claude/projects/2026-04-18-redis-session-caching/plan.md
/trekexecute --project .claude/projects/2026-04-18-redis-session-caching
# → progress.json + code changes
Auto-mode (Claude manages the pipeline):
/trekbrief Add session caching with Redis
# Interview identifies topics. Choose "Auto (managed by Claude Code)" when asked.
# Claude runs research in parallel, then planning in foreground.
# Returns when plan.md is ready.
/trekexecute --project .claude/projects/2026-04-18-redis-session-caching
Standalone research (no planning):
/trekresearch What are the security implications of using JWT for session management?
# Read the brief, share with team, use for decision-making.
Quick plan for small tasks:
/trekbrief --quick Fix the login redirect bug
/trekplan --project .claude/projects/2026-04-18-login-redirect-fix --quick
/trekexecute --project .claude/projects/2026-04-18-login-redirect-fix
Dry run + validate before executing:
/trekexecute --project <dir> --validate # schema check, no execution
/trekexecute --project <dir> --dry-run # preview sessions and billing
/trekexecute --project <dir> # execute
Review feedback loop (Handover 6):
/trekreview --project <dir>
# → review.md with severity-tagged findings + verdict (BLOCK / WARN / ALLOW)
# If verdict is BLOCK or WARN, feed findings back into a remediation plan:
/trekplan --brief <dir>/review.md
# → plan.md with source_findings: [<id>, ...] audit trail
/trekexecute --project <dir> # execute the remediation plan
/trekreview --project <dir> # re-review (overwrites review.md)
Upgrading
Migration notes for breaking changes (v1.x → v2.0, v2.x → v3.0) live in CHANGELOG.md and MIGRATION.md. v3.x non-breaking — minor bumps within v3 add features without changing pipeline contracts.
Quality infrastructure (since v3.1.0)
The plugin ships with node:test-based unit tests and a lib/ directory of pure-JS validators wired into the commands. Forking the plugin for internal use? Run npm test to confirm the parsers, validators, and doc-consistency invariants still hold:
cd plugins/trekplan
npm test # runs all tests under tests/**/*.test.mjs
Validators (zero npm deps, hand-rolled YAML subset):
| Module | Purpose |
|---|---|
lib/validators/brief-validator.mjs |
brief.md frontmatter + state machine (research_topics + status coherence) + body sections |
lib/validators/research-validator.mjs |
research-brief frontmatter (confidence ∈ [0,1], dimensions ≥ 1) + body sections; --dir mode validates a whole research/ folder |
lib/validators/plan-validator.mjs |
wraps plan-schema + manifest-yaml; enforces v1.7 step heading, manifest count match, and forbidden-narrative-form denylist (### Fase/Phase/Stage/Steg N) — replaces the Phase 5.5 grep checks |
lib/validators/progress-validator.mjs |
progress.json shape (schema_version, status enum, current_step in range) + resume-readiness check |
lib/validators/architecture-discovery.mjs |
EXTERNAL CONTRACT — drift-WARN, never drift-FAIL. Discovers architecture/overview.md (owned by an opt-in upstream architect plugin, not bundled) and tolerates non-canonical filenames with warnings. |
Each module exposes a CLI: node lib/validators/<name>.mjs --json <path> returns structured {valid, errors, warnings, parsed}. Commands invoke the CLI as their schema check.
A doc-consistency test (tests/lib/doc-consistency.test.mjs) pins prose-vs-source invariants — the agent table in CLAUDE.md must match the agents/*.md file count, every command's frontmatter name: must match its filename, and templates/plan-template.md must declare plan_version: 1.7.
Borrowed pattern from llm-security (commit 97c5c9d); extending the plugin should preserve the invariants the test pins.
Handover contracts
docs/HANDOVER-CONTRACTS.md is the single source of truth for the file formats that pass between the pipeline commands (seven handovers: brief → research, research → plan, architecture → plan, plan → execute, progress.json resume, review → plan, .session-state.local.json). When you fork the plugin or extend a stage, that document tells you what every producer must write and what every consumer is allowed to assume. It also documents the external contract for architecture/overview.md (owned by an opt-in upstream architect plugin, not bundled) — discovery only, drift-warn never drift-fail.
PreCompact resume integrity (CC v2.1.105+)
The pre-compact-flush.mjs hook directly fixes the documented P0 in docs/trekexecute-v2-observations-from-config-audit-v4.md: in skill-driven execution, progress.json could fall behind git reality before context compaction, breaking /trekexecute --resume after long conversations. The hook fires on every PreCompact event, locates any progress.json under .claude/projects/, compares stored current_step against git log --oneline {session_start_sha}..HEAD, and atomically writes a fresh checkpoint (tmp + rename, monotonic only) when git is ahead. Never blocks compaction.
Known limitations
Infrastructure-as-code (IaC) gets reduced value. The exploration agents are designed for application code. Terraform, Helm, Pulumi, CDK projects will get a plan, but agents like architecture-mapper and test-strategist produce less useful output for IaC. Use trekplan for the structural plan, then supplement IaC-specific steps manually.
Annotation HTML requires a desktop browser. scripts/annotate.mjs produces a single self-contained .html file you open with file:// in any modern browser (Chrome / Safari / Firefox / Edge — last two versions). No CDN, no server, no npm runtime deps. State persists in localStorage so closing and re-opening the tab keeps your work, but it's local to one browser on one machine — not synced anywhere. If you want to annotate without a browser, paste the .md into Claude with "comments inline below" and write notes in chat — same end result, just without the visual surface.
Installation
Add the marketplace and browse plugins with /plugin:
claude plugin marketplace add https://git.fromaitochitta.com/open/ktg-plugin-marketplace.git
Or enable directly in ~/.claude/settings.json:
{
"enabledPlugins": {
"voyage@ktg-plugin-marketplace": true
}
}
An optional architect step between research and plan was previously available via a separate plugin; that architect plugin is no longer publicly distributed. The architecture/overview.md filesystem slot remains supported by /trekplan for any compatible producer.
Profile system (v4.1.0)
Three built-in model profiles plus operator-defined <custom>.yaml (drop in lib/profiles/). Each profile pins phase_models for the six pipeline phases. The active profile is recorded in plan.md frontmatter as profile: <name> and emitted to JSONL stats for cost-attribution.
| Profile | Brief | Research | Plan | Execute | Review | Continue | Use case |
|---|---|---|---|---|---|---|---|
economy |
sonnet | sonnet | sonnet | sonnet | sonnet | sonnet | Lowest cost; high-confidence small-scope tasks |
balanced (default) |
sonnet | sonnet | opus | sonnet | opus | sonnet | Default — opus where reasoning depth pays off |
premium |
opus | sonnet | opus | sonnet | opus | sonnet | Critical-path planning + review when budget allows |
Lookup order:
- Explicit
--profile <name>flag passed to the command - Plan-file frontmatter
profile:(when resuming via/trekexecute --resumeor/trekcontinue) VOYAGE_PROFILEenvironment variable- Default
balanced
See docs/profiles.md for the decision tree, custom-profile authoring, and cost estimation disclaimer (the per-profile cost numbers are anslag, not contractual SLAs).
Observability (v4.1.0)
Stop-hook OTel/Prometheus export — opt-in via VOYAGE_EXPORT_MODE:
| Mode | Output | Endpoint env-var |
|---|---|---|
off (default) |
(no export) | — |
textfile |
voyage.prom (Prometheus exposition format) |
VOYAGE_TEXTFILE_DIR |
otlp |
OTLP/JSON POST | VOYAGE_OTEL_ENDPOINT |
Default JSONL stats stream (${CLAUDE_PLUGIN_DATA}/trek*-stats.jsonl) is unchanged — OTel is additive. Local Docker Compose stack: examples/observability/. Operator docs: docs/observability.md. Both pin minimum versions per CVE history.
Cost profile
Opus runs the orchestrators (one per command) and the executor (one per plan session). Sonnet runs the exploration and review swarms (5–10 agents per command, with effort/turn limits). The pipeline front-loads cheap Sonnet work so Opus only does synthesis and execution. Typical total: comparable to a long single Claude Code session — the per-command cost is published in ${CLAUDE_PLUGIN_DATA}/trek*-stats.jsonl if you want exact numbers.
For per-profile cost estimates, see docs/profiles.md.
Requirements
- Claude Code (CLI, desktop app, or web app)
- Claude subscription with Opus access (Max plan recommended)
- Optional: Tavily MCP server for enhanced external research
- Optional: Gemini MCP server for independent second opinion via Gemini Deep Research
Architecture
Top-level layout:
trekplan/
├── agents/ 23 specialized agents (sonnet for exploration + review, opus for orchestration)
├── commands/ 6 slash commands (trekbrief, trekresearch, trekplan, trekexecute, trekreview, trekcontinue) + trekplan-end-session helper
├── templates/ Frontmatter templates for brief, research, plan, session, launch
├── hooks/ 5 hooks (pre-bash, pre-write, session-title, post-bash-stats, pre-compact-flush)
├── lib/ Zero-dep parsers and validators (CLI shims under lib/validators/)
├── tests/ 109 node:test cases — `npm test` is the fork-readiness gate
├── docs/ HANDOVER-CONTRACTS.md + architect-bridge-test.md
└── examples/ 01-add-verbose-flag/ — calibrated end-to-end pipeline demo
Pure markdown commands and agents. Hooks and validators are self-contained Node.js with zero npm dependencies. See CONTRIBUTING.md for the full file inventory.
Extending the plugin
Common modifications fork-ers make. None require touching lib/ —
all of these are surface-level changes to commands, agents, or settings.
Add a new exploration agent
Exploration agents run in parallel during /trekplan Phase 5.
They read the codebase and contribute structured findings to plan
synthesis.
- Copy
agents/architecture-mapper.mdas a template:cp agents/architecture-mapper.md agents/my-new-agent.md - Update the frontmatter
name,description,tools, andmodel. Usesonnetunless the agent needs deep reasoning (most don't). - Add the agent to the swarm in
agents/planning-orchestrator.mdPhase 5 — register it under the codebase-size bucket where it should fire (always / medium+ / large only). - Update the agent table in
CLAUDE.mdandREADME.mdto keep the doc-consistency test green:npm test -- tests/lib/doc-consistency.test.mjs
Switch the planning model
The default for /trekbrief, /trekresearch,
/trekplan, and /trekexecute is opus (deep
reasoning). To run on Sonnet for cost or latency, search-and-replace
the frontmatter in three files:
sed -i.bak 's/^model: opus$/model: sonnet/' \
commands/trekbrief.md \
commands/trekresearch.md \
commands/trekplan.md \
commands/trekexecute.md
The exploration agents stay on Sonnet — only the orchestrator is bumped.
Disable external research
/trekresearch --local skips Tavily, Microsoft Learn, and the
Gemini bridge. To make --local the default, edit the front of
commands/trekresearch.md Phase 1 and flip the default branch
of the --local argument check. Or just always pass --local and
document it in your team's CLAUDE.md.
Plugin data contract
The four commands write to a single project directory (.claude/projects/{date}-{slug}/).
The full schema for every artifact is in docs/HANDOVER-CONTRACTS.md.
That document is the single source of truth for:
- File paths each command reads/writes
- Frontmatter schema for
brief.md,research/*.md,plan.md progress.jsonschema- Validator → handover mapping
- Versioning and breaking-change protocol
If you fork the plugin and change the schema for any artifact, update
that doc and the corresponding lib/validators/*.mjs and run
npm test — the validators and doc-consistency tests will catch
schema drift.
Disable the architect bridge
/trekplan auto-discovers architecture/overview.md if an
opt-in upstream architect plugin (not bundled) produced one. To
suppress this, leave the architecture/ directory absent from your
project directory. Discovery is additive — missing file is fine, no
error.
Contributing
See CONTRIBUTING.md.