# trekplan — Brief, Research, Plan, Execute, Review, Continue

![Version](https://img.shields.io/badge/version-3.4.1-blue)
![License](https://img.shields.io/badge/license-MIT-green)
![Platform](https://img.shields.io/badge/platform-Claude%20Code-purple)

> **Solo-maintained, fork-and-own.** This plugin is a starting point, not a vendor product. Issues are welcome as signals; pull requests are not accepted. See [GOVERNANCE.md](GOVERNANCE.md) for the full model and what upstream provides.

*AI-generated: all code produced by Claude Code through dialog-driven development. [Full disclosure →](../../README.md#ai-generated-code-disclosure)*

A [Claude Code](https://docs.anthropic.com/en/docs/claude-code) plugin for deep implementation planning, multi-source research, autonomous execution, independent post-hoc review, operator-driven artifact annotation, and zero-friction multi-session resumption. Seven commands, one pipeline:

| Command | What it does |
|---------|-------------|
| **`/trekbrief`** | Brief — interactive interview produces a task brief with explicit research plan |
| **`/trekresearch`** | Research — deep local + external research with triangulation |
| **`/trekplan`** | Plan — agent swarm exploration, Opus planning, adversarial review |
| **`/trekexecute`** | Execute — disciplined step-by-step implementation with failure recovery |
| **`/trekreview`** | Review — independent post-hoc review of delivered code against the brief, severity-tagged findings |
| **`/trekrevise`** | Revise — apply operator annotations from the playground back into brief/plan/review with audit trail (Handover 8, v4.2) |
| **`/trekcontinue`** | Continue — read `.session-state.local.json` and resume the next session in a multi-session project |

Every artifact lives in one project directory: `.claude/projects/{YYYY-MM-DD}-{slug}/` contains `brief.md`, `research/NN-*.md`, `plan.md`, `sessions/`, `progress.json`, and `review.md`.

### Division of labor

| Command | Responsibility | Output |
|---|---|---|
| `/trekbrief` | **Capture intent** — intent, goal, non-goals, success criteria, and a research plan with explicit topics. Interactive only. | `brief.md` (task brief) |
| `/trekresearch` | **Gather context** — code state, external docs, community, risk. Makes NO build decisions. | `research/NN-slug.md` (research brief) |
| `/trekplan` | **Transform intent into an executable contract** — per-step YAML manifest, regex-validated checkpoints, verifiable paths. Plan-critic is a hard gate. Auto-discovers `architecture/overview.md` as priors when an opt-in upstream architect plugin (not bundled) is installed. | `plan.md` with Manifest blocks + `plan_version: 1.7` |
| `/trekexecute` | **Execute the contract disciplined** — fresh verification, independent manifest audit, honest reporting. Does NOT compensate for weak plans — escalates. | `progress.json` + structured report + manifest-audit status |
| `/trekreview` | **Close the loop** — independent post-hoc reviewer reads `brief.md` and the diff produced by execute, runs brief-conformance + code-correctness reviewers in parallel, dedups via Judge Agent. Severity-tagged findings (Critical/High/Medium/Low/Info) feed back into planning via Handover 6. | `review.md` (`type: trekreview`) with stable 40-char hex finding-IDs |
| `/trekrevise` | **Re-fold operator annotations** — operator opens the playground, anchors comments on a brief/plan/review, exports the batch, pastes back as `/trekrevise --apply`. Round-trippable in-place revision: `revision:` counter, `source_annotations:` audit trail, deterministic `annotation_digest`. Single-iteration MVP per Handover 8. | revised `brief.md` / `plan.md` / `review.md` with frontmatter audit trail |

**Principle:** Each step consumes the previous step's structured artifact. If execute has to guess, the plan is weak and must be revised upstream — not patched downstream.

### Two kinds of briefs

Terminology matters:
- **Task brief** — produced by `/trekbrief`. Captures *what we want and why*. Drives planning.
- **Research brief** — produced by `/trekresearch`. Captures *what we learned about a topic*. Feeds planning.

A project typically has one task brief and zero-to-N research briefs.

### Manifest-verified steps

Every step in the plan ends with a YAML `manifest:` block declaring `expected_paths`, `commit_message_pattern`, `bash_syntax_check`, `forbidden_paths`, `must_contain`. The executor checks the manifest against the resulting commit — a step may not be marked passed if its manifest does not verify, regardless of the Verify command's exit code (Hard Rule 17).

After all steps complete, `/trekexecute` runs **Phase 7.5 — Manifest audit (independent)**: re-verifies every expected path from git log + filesystem, ignoring the agent's own bookkeeping. Drift → status `partial`, **Phase 7.6** auto-dispatches a bounded recovery session with only the missing steps (`recovery_depth ≤ 2`). Step 0 pre-flight (`git push --dry-run`) runs inside every session sandbox before any real work — exit 77 sentinel catches sandbox push-denial before the agent wastes the whole budget.

No cloud dependency. No GitHub requirement. Works on **Mac, Linux, and Windows**.

### Autonomy mode (`--gates`, v3.4.0)

All four pipeline commands accept `--gates {open|closed|adaptive}` to control how many autonomy checkpoints surface to the operator on the path from brief approval to main-merge.

| Value | Behavior |
|-------|----------|
| `open` | Skip optional checkpoints. The pipeline runs end-to-end with the fewest interruptions. Suitable for trusted briefs in clean repos. |
| `closed` | Stop at every checkpoint. The operator confirms each transition. Suitable for high-stakes work or unfamiliar repos. |
| `adaptive` (default) | Stop only when the autonomy-gate state machine reports a meaningful boundary (manifest-audit FAIL, plan-critic BLOCKER, main-merge gate). Best balance of velocity and safety. |

Under the hood, `lib/util/autonomy-gate.mjs` runs a small state machine (`idle → approved → executing → merge-pending → main-merged`) and `lib/stats/event-emit.mjs` records each transition to `${CLAUDE_PLUGIN_DATA}/trek*-stats.jsonl`. The new `hooks/scripts/post-compact-flush.mjs` PostCompact hook re-injects `.session-state.local.json` after context compaction so multi-session work survives a compaction boundary.

## Quick start

```bash
# Install the marketplace, then browse and enable plugins with /plugin
claude plugin marketplace add https://git.fromaitochitta.com/open/ktg-plugin-marketplace.git

# Capture intent (interactive)
/trekbrief Add user authentication with JWT tokens
# → .claude/projects/2026-04-18-jwt-auth/brief.md

# Research each topic identified in the brief (manual default)
/trekresearch --project .claude/projects/2026-04-18-jwt-auth --external "What are current JWT best practices?"

# Plan from brief + research
/trekplan --project .claude/projects/2026-04-18-jwt-auth

# Execute
/trekexecute --project .claude/projects/2026-04-18-jwt-auth

# Review (independent post-hoc verification of the diff against brief.md)
/trekreview --project .claude/projects/2026-04-18-jwt-auth
# → .claude/projects/2026-04-18-jwt-auth/review.md
```

Or opt into auto-mode in `/trekbrief` — it will run research and planning sequentially inline in the main context, and return when `plan.md` is ready.

If review finds issues, feed `review.md` back into planning to produce a remediation plan: `/trekplan --brief .claude/projects/2026-04-18-jwt-auth/review.md`. The remediation plan carries `source_findings: [<id>, ...]` in its frontmatter — full audit trail back to the consumed findings (Handover 6).

An optional architect step can sit between research and plan — `/trekplan` auto-discovers an `architecture/overview.md` produced by an opt-in upstream architect plugin (not bundled here; the architect plugin is no longer publicly distributed, but the `architecture/overview.md` filesystem slot remains available for any compatible producer).

## When to use it

**Use it when:**
- The task touches 3+ files or modules and you need to understand how they connect
- You're working in an unfamiliar codebase and need a map before you start
- The implementation has non-obvious dependencies, ordering constraints, or risks
- You want a reviewable plan before committing to an approach
- You need autonomous headless execution without human intervention
- You need to research a technology, library, or approach before deciding

**Don't use it when:**
- The task is a single-file change where the fix is obvious
- You already know exactly what to change and in what order

**Rule of thumb:** If you can describe the full implementation in one sentence and it touches 1-2 files, skip trekplan and just implement. If you need to think about it, trekplan earns its cost.

## What you get

Concrete capabilities, observable in the code — not aspirations.

**Across all profiles:**
- Strategy-to-execution on four explicit handover points. Each transition is a filesystem contract (`docs/HANDOVER-CONTRACTS.md`), not a conversation. You can stop after any stage and resume later without context loss.
- Resume safety after long sessions. The PreCompact hook reconciles `progress.json` with git history before context compaction (CC v2.1.105+) — closes a documented `--resume` failure mode.
- Schema discipline. `plan-validator --strict` enforces `### Step N:` form and rejects narrative drift (`### Fase`, `### Phase`, `### Stage`, `### Steg`) before execution.
- Audit trail by construction. Every executed step records `commit_sha`, `verify_passed`, `files_changed` in `progress.json`.

**Solo developer.** Plans survive across sessions; adversarial review (plan-critic + scope-guardian) catches your own tunnel vision before code is written; brief-phase forces clarity on what the task actually is. `examples/01-add-verbose-flag/` shows what good shape looks like.

**Team (2–10).** Plan files are handover-ready — a colleague can pick up a project directory without re-asking "what did you mean here?". `--decompose` splits a plan into self-contained headless sessions with scoped `--allowedTools`. The plan-critic semantic rubric gives the team a shared definition of "this plan defers decisions to the executor".

**Virksomhet / regulated environment.** Defense-in-depth security across four layers (plugin hooks, prompt-level denylist, pre-execution plan scan, scoped tool access). `disableSkillShellExecution: true` recommendation for fork-ers handling untrusted briefs. No cloud dependency, no GitHub requirement. Validators are plain-Node CLIs — invocable from CI, custom hooks, or external tools, not just from voyage commands.

**What it doesn't solve:**
- LLM output truthfulness. Validators check shape, not facts. A plan with hallucinated paths passes schema but fails in execute. Plan-critic catches some, not all.
- Multi-user concurrency on a single project directory. Two simultaneous executors will clobber `progress.json`.
- Cost management. Opus on the orchestrator layer is expensive; documented in [Cost profile](#cost-profile), no automatic model downgrade.
- Linear/Jira/Slack integrations. Intentional omission — solo project, no enterprise wiring.

**One-line summary:** an executable contract pipeline where each stage is filesystem-validated, session-survivable, and skill-independent — in exchange for writing an actual brief before planning and an actual plan before coding.

---

## `/trekbrief` — Brief

Interactive requirements-gathering command. Runs a **dynamic, quality-gated interview** and produces a **task brief** with an explicit research plan. Optionally orchestrates the rest of the pipeline.

A section-driven interview loop fills required brief sections (Intent / Goal / Success Criteria / Research Plan) until each shows initial signal, then `brief-reviewer` scores the draft on five dimensions (completeness, consistency, testability, scope clarity, research-plan validity) and gates publication. Max 3 review iterations; force-stop yields a `brief_quality: partial` brief with the failing dimensions documented.

Output: `.claude/projects/{YYYY-MM-DD}-{slug}/brief.md`

### Modes

| Mode | Usage | Behavior |
|------|-------|----------|
| **Default** | `/trekbrief <task>` | Dynamic interview until quality gates pass. No question cap. |
| **Quick** | `/trekbrief --quick <task>` | Starts compact (optional sections get at most one probe), still escalates on weak required sections or failed review gate. |
| **Profile** | `/trekbrief --profile <name> <task>` | (v4.1.0) Pin model profile for the brief phase: `economy` / `balanced` / `premium` / `<custom>`. See [Profile system](#profile-system-v410) below. |

`/trekbrief` is **always interactive**. There is no foreground/background mode — the interview requires user input.

### Force-stop

If you say "stop" or "enough" during Phase 4, the current review findings are surfaced with per-dimension scores and you choose:
- **Answer one more follow-up** — the loop continues.
- **Stop now (accept partial brief)** — the brief is finalized with `brief_quality: partial` and a `## Brief Quality` section listing the failing dimensions. Downstream planning will treat these as reduced-confidence areas.

### What the brief contains

- **Intent** — why this matters, motivation, user need (load-bearing)
- **Goal** — concrete end state in 1-3 sentences
- **Non-Goals** — explicitly out of scope
- **Constraints / Preferences / NFRs** — technical, time, resource limits
- **Success Criteria** — 2-4 falsifiable commands/observations
- **Research Plan** — N topics, each with research question, scope (local/external/both), confidence needed, cost estimate, and a ready-to-run `/trekresearch` command
- **Open Questions / Assumptions** — from "I don't know" answers and implicit gaps
- **Prior Attempts** — what worked/failed before

---

## `/trekresearch` — Research

Deep, multi-phase research that combines local codebase analysis with external knowledge. Uses specialized agent swarms to investigate multiple dimensions in parallel, then triangulates findings.

A parallel swarm of up to 5 local + 4 external Sonnet agents investigates 3–8 research dimensions, with optional Gemini Deep Research as an independent second opinion. Findings are triangulated (local vs. external, confidence per dimension, contradictions flagged) and synthesized into a structured research brief.

Output:
- With `--project <dir>`: `{dir}/research/{NN}-{slug}.md` (auto-incremented index)
- Without: `.claude/research/trekresearch-{date}-{slug}.md`

### Modes

| Mode | Usage | Behavior |
|------|-------|----------|
| **Default** | `/trekresearch <question>` | Interview + research swarm (local + external + Gemini), foreground |
| **Project** | `/trekresearch --project <dir> <question>` | Write brief into `{dir}/research/NN-slug.md` |
| **Quick** | `/trekresearch --quick <question>` | Interview (short) + inline research, no agent swarm |
| **Local** | `/trekresearch --local <question>` | Only codebase analysis agents (skip external + Gemini) |
| **External** | `/trekresearch --external <question>` | Only external research agents (skip codebase analysis) |
| **Foreground** | `/trekresearch --fg <question>` | No-op alias (foreground is default since v2.4.0) |
| **Profile** | `/trekresearch --profile <name> <question>` | (v4.1.0) Pin model profile for the research phase. See [Profile system](#profile-system-v410). |

Flags combine: `--project <dir> --external`.

Research uses up to 5 local agents (architecture-mapper, dependency-tracer, task-finder, git-historian, convention-scanner) and 4 external agents (docs-researcher, community-researcher, security-researcher, contrarian-researcher) plus the optional Gemini bridge for an independent second opinion. Per-agent details in [`agents/`](agents/).

---

## `/trekplan` — Planning

Produces an implementation plan detailed enough for autonomous execution. **v2.0 breaking change:** requires `--brief` or `--project`. There is no longer an interview inside `/trekplan` — use `/trekbrief` first.

After `brief-reviewer` validates the input brief, 6–8 Sonnet exploration agents analyze the codebase in parallel and merge findings into a synthesis. Optional research briefs (`--research`, or auto-discovered in `{project_dir}/research/`) enrich the plan; `architecture/overview.md` priors are loaded if an opt-in upstream architect plugin (not bundled) produced one. Opus then writes the plan with per-step YAML manifests, which `plan-critic` (9 dimensions) and `scope-guardian` adversarially review before handoff.

Output:
- With `--project <dir>`: `{dir}/plan.md`
- With `--brief <path>`: `.claude/plans/trekplan-{date}-{slug}.md`

### Modes

| Mode | Usage | Behavior |
|------|-------|----------|
| **Project** | `/trekplan --project <dir>` | Read `{dir}/brief.md` + auto-discover `{dir}/research/*.md`, write `{dir}/plan.md` |
| **Brief** | `/trekplan --brief <path>` | Plan from a specific brief file |
| **Research-enriched** | `/trekplan --project <dir> --research <brief>` | Add extra research briefs beyond what is in `research/` |
| **Foreground** | `/trekplan --project <dir> --fg` | No-op alias (foreground is default since v2.4.0) |
| **Quick** | `/trekplan --project <dir> --quick` | No agent swarm, lightweight scan only |
| **Decompose** | `/trekplan --decompose plan.md` | Split plan into headless session specs |
| **Export** | `/trekplan --export pr plan.md` | PR description, issue comment, or clean markdown |
| **Profile** | `/trekplan --profile <name> --project <dir>` | (v4.1.0) Pin model profile; emitted as `profile:` in plan.md frontmatter. See [Profile system](#profile-system-v410). |

`--brief` or `--project` is **required**. `/trekplan` with no brief exits with an error and a pointer to `/trekbrief`.

### What the plan contains

Every plan includes:

- **Context** — derived from brief `## Intent` + `## Goal`
- **Architecture Diagram** — Mermaid C4-style component diagram
- **Codebase Analysis** — tech stack, patterns, relevant files, reusable code
- **Research Sources** — findings from research briefs (when present)
- **Implementation Plan** — ordered steps with file paths, changes, failure recovery, and git checkpoints
- **Per-step Manifest** — YAML block with `expected_paths`, `commit_message_pattern`, `bash_syntax_check`, `forbidden_paths`, `must_contain`
- **Alternatives Considered** — other approaches with pros/cons
- **Test Strategy** — from test-strategist findings
- **Risks and Mitigations** — from risk-assessor findings
- **Verification** — testable end-to-end criteria
- **Execution Strategy** — session grouping and parallel waves (plans with > 5 steps)
- **Plan Quality Score** — quantitative grade (A-D) across 6 weighted dimensions

Every implementation step includes:
- **On failure:** — what to do when verification fails (revert / retry / skip / escalate)
- **Checkpoint:** — git commit after success
- **Manifest:** — the objective completion predicate (Hard Rule 17)

Exploration uses 6–8 Sonnet agents in parallel (architecture-mapper, dependency-tracer, task-finder, test-strategist, git-historian, risk-assessor, plus convention-scanner on medium+ codebases and research-scout when unfamiliar tech is detected). Adversarial review then runs `brief-reviewer`, `plan-critic` (9 dimensions, no-placeholder enforcement, manifest audit), and `scope-guardian` (creep + gap detection). Per-agent details in [`agents/`](agents/).

---

## `/trekexecute` — Execution

Reads a plan from `/trekplan` and implements it with strict discipline. No guessing, no improvising — follows the plan exactly.

Per step: apply Changes exactly as written → run Verify (exit code is truth) → manifest audit (expected paths, forbidden paths, commit pattern) → follow the plan's failure clause if anything fails (revert / retry / skip / escalate) → Checkpoint commit. After all steps: independent Phase 7.5 manifest audit from git log + filesystem (ignoring agent bookkeeping); drift triggers Phase 7.6 bounded recovery.

### Modes

| Mode | Usage | Behavior |
|------|-------|----------|
| **Project** | `/trekexecute --project <dir>` | Read `{dir}/plan.md`, write `{dir}/progress.json` |
| **Plan path** | `/trekexecute plan.md` | Execute a specific plan file |
| **Resume** | `/trekexecute --project <dir> --resume` | Resume from last progress checkpoint |
| **Dry run** | `/trekexecute --project <dir> --dry-run` | Validate plan structure + preview sessions and billing |
| **Validate** | `/trekexecute --project <dir> --validate` | Schema-only check — parse steps + manifests, report `READY \| FAIL`, no execution |
| **Single step** | `/trekexecute --project <dir> --step 3` | Execute only step 3 |
| **Foreground** | `/trekexecute --project <dir> --fg` | Force sequential, ignore Execution Strategy |
| **Single session** | `/trekexecute --project <dir> --session 2` | Execute only session 2 from Execution Strategy |
| **Profile** | `/trekexecute --profile <name> --project <dir>` | (v4.1.0) Pin model profile for execution. Plan-frontmatter `profile:` is honored when this flag is omitted. See [Profile system](#profile-system-v410). |

### Session-aware parallel execution (worktree-isolated)

When a plan has an `## Execution Strategy` section (auto-generated by `/trekplan` for plans with > 5 steps), `/trekexecute` automatically:

1. **Pre-flight checks** — validates clean working tree, plan file tracked in git, no scope fence overlaps between parallel sessions, no stale worktrees
2. **Creates git worktrees** — each parallel session gets its own isolated worktree and branch (`trek/{slug}/session-{N}`)
3. Launches `claude -p` per session per wave, each in its own worktree
4. **Merges branches back** sequentially with `--no-ff` after each wave completes
5. **Cleans up** worktrees and branches unconditionally (even on failure)
6. Runs master verification on the merged result

```
Wave 1: Session 1 (worktree-1) + Session 2 (worktree-2)  -- parallel
         ↓ both complete → sequential merge to main
Wave 2: Session 3 (worktree-3)                             -- sequential
         ↓ complete → merge to main
Cleanup worktrees + Master verification
```

Each session operates in complete filesystem isolation — no shared git index, no race conditions, no data loss. If a merge produces conflicts, the merge is aborted and conflicting files are reported.

Use `--fg` to force sequential execution even when a plan has an Execution Strategy.

### Billing safety

Before launching parallel `claude -p` sessions, `/trekexecute` checks whether `ANTHROPIC_API_KEY` is set in your environment. If it is, parallel sessions will bill your **API account** (pay-per-token), not your Claude subscription (Max/Pro). This can be expensive — parallel Opus sessions can cost $50-100+ per run.

When an API key is detected, you are asked how to proceed:
- **Use --fg instead** (recommended) — run sequentially in the current session using your subscription
- **Continue with API billing** — launch parallel sessions on your API account
- **Stop** — cancel and unset the API key first

If no API key is set, parallel sessions use your subscription and proceed without asking.

### Failure recovery

- **3-attempt retry cap** — retries twice, then stops (never loops forever)
- **On failure: revert** — undo changes, stop
- **On failure: retry** — try alternative approach, then revert if still failing
- **On failure: skip** — non-critical step, continue
- **On failure: escalate** — stop everything, needs human judgment

### Security hardening

The executor implements defense-in-depth security across four layers:

1. **Plugin hooks** — `pre-bash-executor.mjs` blocks 13 categories of destructive commands (rm -rf /, chmod 777, pipe-to-shell, eval injection, disk wipe, shutdown, fork bombs, cron persistence, process killing, history destruction) with bash evasion normalization. `pre-write-executor.mjs` blocks writes to `.git/hooks/`, `.claude/settings.json`, shell configs, `.ssh/`, `.aws/`, and `.env` files
2. **Prompt-level denylist** — Security rules embedded in the executor command and session spec template that work even in headless `claude -p` sessions where hooks don't run
3. **Pre-execution plan scan** — Phase 2.4 scans all `Verify:` and `Checkpoint:` commands against the denylist before execution begins, catching dangerous commands before they reach the executor
4. **Scoped tool access** — Headless child sessions use `--allowedTools "Read,Write,Edit,Bash,Glob,Grep"` instead of `--dangerously-skip-permissions`, blocking Agent spawning, MCP tools, and web access in parallel sessions

#### Recommended: disable Skill shell execution (CC v2.1.91+)

For fork-ers handling untrusted task briefs or plans from external
sources, set `disableSkillShellExecution: true` in `~/.claude/settings.json`
or in the project's `.claude/settings.json`:

```json
{
  "disableSkillShellExecution": true
}
```

This prevents Skills from invoking arbitrary shell, which closes a
prompt-injection vector that the plugin's own hooks cannot fully mitigate
(Skills can fire before `pre-bash-executor` matches). See
[SECURITY.md](SECURITY.md) for the full hardening list.

### Headless execution

`/trekexecute` is designed for `claude -p` headless sessions:
- **No questions asked** — all recovery decisions come from the plan
- **Progress file** — crash recovery via `{project_dir}/progress.json` (or `.trekexecute-progress-{slug}.json` for legacy plans)
- **Scope fence enforcement** — never touches files outside the session's scope
- **JSON summary** — machine-parseable `trekexecute_summary` block for log parsing

#### Headless multi-session tuning (CC v2.1.89+)

When running multiple parallel `claude -p` sessions (decomposed plans
or wave-based execution), set `MCP_CONNECTION_NONBLOCKING=true` in the
launching environment so MCP server connection latency does not
serialize startup across waves:

```bash
export MCP_CONNECTION_NONBLOCKING=true
bash .claude/projects/{slug}/sessions/launch.sh
```

Without this, each child session can spend 1-3 s blocking on MCP
connect, multiplying across waves. Setting it lets MCP connect lazily
on first tool call.

### Session titles for voyage commands (CC v2.1.94+)

A `UserPromptSubmit` hook (`hooks/scripts/session-title.mjs`) sets the
session title to `voyage:<command>:<slug>` whenever you invoke one of
the four voyage commands. This makes multi-session headless runs and
session-picker output trivially identifiable. Slug derivation:

| Invocation | Session title |
|-----------|---------------|
| `/trekplan --project .claude/projects/2026-04-18-jwt-auth` | `voyage:plan:jwt-auth` |
| `/trekbrief --quick` | `voyage:brief:ad-hoc` |
| `/trekexecute --project .claude/projects/2026-05-10-cleanup --resume` | `voyage:execute:cleanup` |

The hook is fail-open — any error → title is left untouched.

### Per-step timing (CC v2.1.97+)

A `PostToolUse` hook (`hooks/scripts/post-bash-stats.mjs`) appends
`duration_ms` from each Bash tool call to
`${CLAUDE_PLUGIN_DATA}/trekexecute-stats.jsonl`. One line per Bash
call; useful for identifying long-running verify or checkpoint commands
across executions.

---

## `/trekreview` — Review

Independent post-hoc review of delivered code against the brief. Reads `brief.md`
from scratch and treats research/plan as supplementary context. The output
`review.md` is a new artifact type (`type: trekreview`) with its own validator
and a contracted **Handover 6 (review → plan)** so findings can be fed back into
`/trekplan --brief review.md` to produce a remediation plan — closing
the iteration loop without ad-hoc conventions.

### Modes

| Mode | Command | Description |
|------|---------|-------------|
| **Default** | `/trekreview --project <dir>` | brief-conformance + code-correctness reviewers in parallel, coordinator dedup + verdict, write `{dir}/review.md` |
| **Since ref** | `/trekreview --project <dir> --since <ref>` | Override "before" SHA for the diff range. Validated via `git rev-parse --verify` |
| **Quick** | `/trekreview --project <dir> --quick` | Skip brief-conformance reviewer; skip coordinator's reasonableness filter — fast correctness-only pass |
| **Validate** | `/trekreview --project <dir> --validate` | Schema-only check on existing `review.md`. No LLM calls |
| **Dry run** | `/trekreview --project <dir> --dry-run` | Print discovered scope + triage map; skip writes |
| **Profile** | `/trekreview --profile <name> --project <dir>` | (v4.1.0) Pin model profile for the review phase. See [Profile system](#profile-system-v410). |

### What review produces

`review.md` carries a flat array of finding-IDs in frontmatter (40-char hex from `lib/parsers/finding-id.mjs`) plus the full finding objects in the body under per-severity headings:

- `## Findings (BLOCKER)` — must-fix before merge
- `## Findings (MAJOR)` — should-fix
- `## Findings (MINOR)` — nice-to-fix
- `## Findings (SUGGESTION)` — opinion-only

Required body sections: `## Executive Summary`, `## Coverage`, `## Remediation Summary`. The Coverage section enumerates which files were deep-reviewed, summary-only, or skipped (with reason) — explicit triage to avoid silent skips.

### Triage gate (deterministic)

A path-pattern classifier produces `{file → deep-review|summary-only|skip}` before any LLM runs. Hardcoded skip patterns: `*.lock`, `*.svg`, `dist/**`, `build/**`, `node_modules/**`, generated markers. Deep-review patterns: `auth/**`, `crypto/**`, `**/security/**`. Hard refuse-with-suggestion above 100 files / 100K diff tokens.

### Feedback loop (Handover 6)

```bash
/trekreview --project <dir>
# → review.md (BLOCKER + MAJOR findings)

/trekplan --brief <dir>/review.md
# → plan.md with `source_findings: [<id>, ...]` audit trail
#   (BLOCKER + MAJOR findings become plan goals; MINOR + SUGGESTION skipped for v1.0)
```

The plan's optional `source_findings:` frontmatter list is the audit trail back to consumed findings. See `docs/HANDOVER-CONTRACTS.md` for the full Handover 6 contract.

---

## `/trekcontinue` — Resume

Zero-friction multi-session resumption. Type `/trekcontinue` in a fresh
Claude Code session — the command reads the per-project state file
(`.claude/projects/<project>/.session-state.local.json`), prints a 3-line
summary, and immediately begins executing the next session.

The state file is the contract — any session-end mechanism may write it
(`/trekexecute` Phase 8 / Phase 2.55 / Phase 4 do so automatically;
the `/trekendsession` helper writes it for informal flows;
`graceful-handoff` may converge on it in a future release). `/trekcontinue`
only reads. See **Handover 7** in `docs/HANDOVER-CONTRACTS.md` for the full
schema and producer/consumer contract.

### Modes

| Mode | Command | Description |
|------|---------|-------------|
| **Default** | `/trekcontinue` | Auto-discover `.session-state.local.json` under cwd, validate, narrate, and begin executing the next session |
| **Explicit** | `/trekcontinue <project-dir>` | Use the named project directory; helpful when several active projects coexist under cwd |
| **Help** | `/trekcontinue --help` | Print usage block and the schema-v1 reference |
| **Profile** | `/trekcontinue --profile <name> [<project-dir>]` | (v4.1.0) Pin model profile for the resumed session. Plan-frontmatter `profile:` from the previous session is honored when this flag is omitted. See [Profile system](#profile-system-v410). |

### Schema v1 — `.session-state.local.json`

| Field | Type | Required | Notes |
|---|---|---|---|
| `schema_version` | number | yes | Must be `1` |
| `project` | string | yes | Project directory path |
| `next_session_brief_path` | string | yes | Validator soft-checks file existence (warning, not error) |
| `next_session_label` | string | yes | Human-readable label for the next session (e.g. "Session 2 of 5") |
| `status` | enum | yes | `in_progress` \| `partial` \| `failed` \| `stopped` \| `completed` (`completed` → no further sessions to resume) |
| `updated_at` | string | yes | ISO-8601 timestamp |

Forward-compat: unknown top-level keys are ignored (no errors, no warnings) — the same drift-WARN principle as Handover 3, so future producers (e.g. graceful-handoff v2.2) can extend the schema additively.

### `/trekendsession` helper

For informal multi-session flows that don't run through `/trekexecute`
(ad-hoc release runs, manual handovers), use the helper to write the state
file at session-end:

```bash
/trekendsession .claude/projects/2026-05-01-feature/brief.md "Session 2 of 3"
# Writes .session-state.local.json with status=in_progress.
# Then in a fresh chat: /trekcontinue
```

Both arguments are required. No interactive prompt — headless-safe.

### Typical flow

```bash
# Session 1 (long-running formal pipeline)
/trekplan --project .claude/projects/2026-05-01-feature
/trekexecute --project .claude/projects/2026-05-01-feature
# ... trekexecute Phase 8 writes .session-state.local.json on session-end ...

# (chat boundary — fresh Claude Code session)
/trekcontinue
# → reads state, prints 3-line summary, begins next session
```

---

## `/trekrevise` — Annotation playground (v4.2)

Voyage's first interactive playground. Open `playground/voyage-playground.html`
in any modern browser, paste your `brief.md`, `plan.md`, or `review.md`, and
annotate the rendered artifact directly. The playground is a single
self-contained HTML file with vendored `markdown-it` + `highlight.js` —
no build step, no network calls, no telemetry.

The annotation lifecycle is **Handover 8 — annotation → revision** (see
`docs/HANDOVER-CONTRACTS.md`). Three stages:

1. **Annotate** — drag-select text or hover-anchor a paragraph; fill the
   modal with comment + intent (`change` / `add` / `remove` / `clarify` /
   `risk`). Anchors are placed only at block boundaries; the playground
   refuses placement inside list items, mid-paragraph, or at line-start
   collision points.
2. **Export** — click "Eksporter batch" to copy a complete
   `/trekrevise --project <dir> --apply '{...JSON...}'` invocation to your
   clipboard.
3. **Apply** — paste the command in Claude Code chat. `/trekrevise` parses
   the batch, re-runs placement validation, atomically writes anchor comment
   blocks back into the source artifact, increments `revision:`, appends to
   `source_annotations:`, and recomputes the deterministic `annotation_digest`.

The artifact's body remains byte-identical outside anchor blocks (SC2). Re-applying
the same batch yields the same digest (idempotence — SC3). The frontmatter
audit trail makes every revision traceable in git.

```bash
# Open the playground (one-time)
open plugins/voyage/playground/voyage-playground.html
# Or render an artifact server-side via the CLI helper:
node plugins/voyage/scripts/render-artifact.mjs \
  .claude/projects/2026-05-09-feature/plan.md \
  --out /tmp/plan.html
```

For a step-by-step walkthrough, see
[`docs/annotation-quickstart.md`](docs/annotation-quickstart.md).

**Single-iteration MVP:** Each operator batch produces one `revision:` bump.
Multi-iteration loops (revise → re-review → revise again) are deferred
indefinitely per research-05; the brief's SC4 wording is single-revision.

---

## The full pipeline

```
 /trekbrief   /trekresearch    /trekplan         /trekexecute
 ┌──────────────┐    ┌───────────────────┐   ┌─────────────────────┐  ┌─────────────────────┐
 │ Interview    │    │ 5 local agents    │   │ brief-reviewer      │  │ Parse plan          │
 │ ↓            │    │ 4 external agents │   │ ↓                   │  │ ↓                   │
 │ Intent/Goal  │    │ + Gemini bridge   │   │ 6-8 exploration     │  │ Detect sessions     │
 │ ↓            │    │ ↓                 │   │ agents (parallel)   │  │ ↓                   │
 │ Research     │    │ Triangulation     │   │ ↓                   │  │ Execute steps       │
 │ topics       │    │ ↓                 │   │ Opus planning       │  │ (verify + manifest  │
 │ ↓            │ → brief → → → → → → → → → → → ↓                   │→ │  + checkpoint)      │
 │ brief.md     │    │ research/*.md     │   │ plan-critic +       │  │ ↓                   │
 └──────────────┘    └───────────────────┘   │ scope-guardian      │  │ Phase 7.5 manifest  │
                                             │ ↓                   │  │ audit + 7.6 recovery│
                                             │ plan.md             │  │ ↓                   │
                                             └─────────────────────┘  │ progress.json + done│
                                                                      └─────────────────────┘
```

All artifacts live under `.claude/projects/{YYYY-MM-DD}-{slug}/`.

An opt-in upstream architect plugin (not bundled) can insert a Claude-Code-specific architecture-matching step between research and plan — `/trekplan` auto-discovers its `architecture/overview.md` output as priors when present.

### Example workflows

**Standard pipeline (manual control):**
```bash
/trekbrief Add session caching with Redis
# → .claude/projects/2026-04-18-redis-session-caching/brief.md
# Interview identifies 2 research topics.

/trekresearch --project .claude/projects/2026-04-18-redis-session-caching --external "What are Redis session-caching best practices?"
/trekresearch --project .claude/projects/2026-04-18-redis-session-caching --local "How is caching currently handled in the codebase?"
# → .claude/projects/2026-04-18-redis-session-caching/research/01-*.md, 02-*.md

/trekplan --project .claude/projects/2026-04-18-redis-session-caching
# → .claude/projects/2026-04-18-redis-session-caching/plan.md

/trekexecute --project .claude/projects/2026-04-18-redis-session-caching
# → progress.json + code changes
```

**Auto-mode (Claude manages the pipeline):**
```bash
/trekbrief Add session caching with Redis
# Interview identifies topics. Choose "Auto (managed by Claude Code)" when asked.
# Claude runs research in parallel, then planning in foreground.
# Returns when plan.md is ready.

/trekexecute --project .claude/projects/2026-04-18-redis-session-caching
```

**Standalone research (no planning):**
```bash
/trekresearch What are the security implications of using JWT for session management?
# Read the brief, share with team, use for decision-making.
```

**Quick plan for small tasks:**
```bash
/trekbrief --quick Fix the login redirect bug
/trekplan --project .claude/projects/2026-04-18-login-redirect-fix --quick
/trekexecute --project .claude/projects/2026-04-18-login-redirect-fix
```

**Dry run + validate before executing:**
```bash
/trekexecute --project <dir> --validate   # schema check, no execution
/trekexecute --project <dir> --dry-run    # preview sessions and billing
/trekexecute --project <dir>              # execute
```

**Review feedback loop (Handover 6):**
```bash
/trekreview --project <dir>
# → review.md with severity-tagged findings + verdict (BLOCK / WARN / ALLOW)

# If verdict is BLOCK or WARN, feed findings back into a remediation plan:
/trekplan --brief <dir>/review.md
# → plan.md with source_findings: [<id>, ...] audit trail

/trekexecute --project <dir>              # execute the remediation plan

/trekreview --project <dir>               # re-review (overwrites review.md)
```

---

## Upgrading

Migration notes for breaking changes (v1.x → v2.0, v2.x → v3.0) live in [CHANGELOG.md](CHANGELOG.md) and [MIGRATION.md](MIGRATION.md). v3.x non-breaking — minor bumps within v3 add features without changing pipeline contracts.

## Quality infrastructure (since v3.1.0)

The plugin ships with `node:test`-based unit tests and a `lib/` directory of pure-JS validators wired into the commands. Forking the plugin for internal use? Run `npm test` to confirm the parsers, validators, and doc-consistency invariants still hold:

```bash
cd plugins/trekplan
npm test    # runs all tests under tests/**/*.test.mjs
```

Validators (zero npm deps, hand-rolled YAML subset):

| Module | Purpose |
|---|---|
| `lib/validators/brief-validator.mjs` | brief.md frontmatter + state machine (research_topics + status coherence) + body sections |
| `lib/validators/research-validator.mjs` | research-brief frontmatter (confidence ∈ [0,1], dimensions ≥ 1) + body sections; `--dir` mode validates a whole `research/` folder |
| `lib/validators/plan-validator.mjs` | wraps plan-schema + manifest-yaml; enforces v1.7 step heading, manifest count match, and forbidden-narrative-form denylist (`### Fase/Phase/Stage/Steg N`) — replaces the Phase 5.5 grep checks |
| `lib/validators/progress-validator.mjs` | progress.json shape (schema_version, status enum, current_step in range) + resume-readiness check |
| `lib/validators/architecture-discovery.mjs` | EXTERNAL CONTRACT — drift-WARN, never drift-FAIL. Discovers `architecture/overview.md` (owned by an opt-in upstream architect plugin, not bundled) and tolerates non-canonical filenames with warnings. |

Each module exposes a CLI: `node lib/validators/<name>.mjs --json <path>` returns structured `{valid, errors, warnings, parsed}`. Commands invoke the CLI as their schema check.

A doc-consistency test (`tests/lib/doc-consistency.test.mjs`) pins prose-vs-source invariants — the agent table in `CLAUDE.md` must match the `agents/*.md` file count, every command's frontmatter `name:` must match its filename, and `templates/plan-template.md` must declare `plan_version: 1.7`.

Borrowed pattern from `llm-security` (commit `97c5c9d`); extending the plugin should preserve the invariants the test pins.

### Handover contracts

`docs/HANDOVER-CONTRACTS.md` is the single source of truth for the file formats that pass between the four pipeline commands (brief → research → plan → execute). When you fork the plugin or extend a stage, that document tells you what every producer must write and what every consumer is allowed to assume. It also documents the *external* contract for `architecture/overview.md` (owned by an opt-in upstream architect plugin, not bundled) — discovery only, drift-warn never drift-fail.

### PreCompact resume integrity (CC v2.1.105+)

The `pre-compact-flush.mjs` hook directly fixes the documented P0 in `docs/trekexecute-v2-observations-from-config-audit-v4.md`: in skill-driven execution, `progress.json` could fall behind git reality before context compaction, breaking `/trekexecute --resume` after long conversations. The hook fires on every PreCompact event, locates any `progress.json` under `.claude/projects/`, compares stored `current_step` against `git log --oneline {session_start_sha}..HEAD`, and atomically writes a fresh checkpoint (`tmp + rename`, monotonic only) when git is ahead. Never blocks compaction.

## Known limitations

**Infrastructure-as-code (IaC) gets reduced value.** The exploration agents are designed for application code. Terraform, Helm, Pulumi, CDK projects will get a plan, but agents like `architecture-mapper` and `test-strategist` produce less useful output for IaC. Use trekplan for the structural plan, then supplement IaC-specific steps manually.

## Installation

Add the marketplace and browse plugins with `/plugin`:

```bash
claude plugin marketplace add https://git.fromaitochitta.com/open/ktg-plugin-marketplace.git
```

Or enable directly in `~/.claude/settings.json`:

```json
{
  "enabledPlugins": {
    "trekplan@ktg-plugin-marketplace": true
  }
}
```

An optional architect step between research and plan was previously available via a separate plugin; that architect plugin is no longer publicly distributed. The `architecture/overview.md` filesystem slot remains supported by `/trekplan` for any compatible producer.

## Profile system (v4.1.0)

Three built-in model profiles plus operator-defined `<custom>.yaml` (drop in `lib/profiles/`). Each profile pins `phase_models` for the six pipeline phases. The active profile is recorded in plan.md frontmatter as `profile: <name>` and emitted to JSONL stats for cost-attribution.

| Profile | Brief | Research | Plan | Execute | Review | Continue | Use case |
|---------|-------|----------|------|---------|--------|----------|----------|
| `economy` | sonnet | sonnet | sonnet | sonnet | sonnet | sonnet | Lowest cost; high-confidence small-scope tasks |
| `balanced` (default) | sonnet | sonnet | opus | sonnet | opus | sonnet | Default — opus where reasoning depth pays off |
| `premium` | opus | sonnet | opus | sonnet | opus | sonnet | Critical-path planning + review when budget allows |

Lookup order:

1. Explicit `--profile <name>` flag passed to the command
2. Plan-file frontmatter `profile:` (when resuming via `/trekexecute --resume` or `/trekcontinue`)
3. `VOYAGE_PROFILE` environment variable
4. Default `balanced`

See [`docs/profiles.md`](docs/profiles.md) for the decision tree, custom-profile authoring, and cost estimation disclaimer (the per-profile cost numbers are *anslag*, not contractual SLAs).

## Observability (v4.1.0)

Stop-hook OTel/Prometheus export — opt-in via `VOYAGE_EXPORT_MODE`:

| Mode | Output | Endpoint env-var |
|------|--------|------------------|
| `off` (default) | _(no export)_ | — |
| `textfile` | `voyage.prom` (Prometheus exposition format) | `VOYAGE_TEXTFILE_DIR` |
| `otlp` | OTLP/JSON POST | `VOYAGE_OTEL_ENDPOINT` |

Default JSONL stats stream (`${CLAUDE_PLUGIN_DATA}/trek*-stats.jsonl`) is unchanged — OTel is additive. Local Docker Compose stack: [`examples/observability/`](examples/observability/). Operator docs: [`docs/observability.md`](docs/observability.md). Both pin minimum versions per CVE history.

## Cost profile

Opus runs the orchestrators (one per command) and the executor (one per plan session). Sonnet runs the exploration and review swarms (5–10 agents per command, with effort/turn limits). The pipeline front-loads cheap Sonnet work so Opus only does synthesis and execution. Typical total: comparable to a long single Claude Code session — the per-command cost is published in `${CLAUDE_PLUGIN_DATA}/trek*-stats.jsonl` if you want exact numbers.

For per-profile cost estimates, see [`docs/profiles.md`](docs/profiles.md).

## Requirements

- [Claude Code](https://docs.anthropic.com/en/docs/claude-code) (CLI, desktop app, or web app)
- Claude subscription with Opus access (Max plan recommended)
- Optional: [Tavily MCP server](https://github.com/tavily-ai/tavily-mcp) for enhanced external research
- Optional: [Gemini MCP server](https://github.com/anthropics/anthropic-cookbook/tree/main/tool-use/gemini-mcp) for independent second opinion via Gemini Deep Research

## Architecture

Top-level layout:

```
trekplan/
├── agents/        23 specialized agents (sonnet for exploration + review, opus for orchestration)
├── commands/      6 slash commands (trekbrief, trekresearch, trekplan, trekexecute, trekreview, trekcontinue) + trekplan-end-session helper
├── templates/     Frontmatter templates for brief, research, plan, session, launch
├── hooks/         5 hooks (pre-bash, pre-write, session-title, post-bash-stats, pre-compact-flush)
├── lib/           Zero-dep parsers and validators (CLI shims under lib/validators/)
├── tests/         109 node:test cases — `npm test` is the fork-readiness gate
├── docs/          HANDOVER-CONTRACTS.md + architect-bridge-test.md
└── examples/      01-add-verbose-flag/ — calibrated end-to-end pipeline demo
```

Pure markdown commands and agents. Hooks and validators are self-contained Node.js with zero npm dependencies. See [CONTRIBUTING.md](CONTRIBUTING.md) for the full file inventory.

## Extending the plugin

Common modifications fork-ers make. None require touching `lib/` —
all of these are surface-level changes to commands, agents, or settings.

### Add a new exploration agent

Exploration agents run in parallel during `/trekplan` Phase 5.
They read the codebase and contribute structured findings to plan
synthesis.

1. Copy `agents/architecture-mapper.md` as a template:
   ```bash
   cp agents/architecture-mapper.md agents/my-new-agent.md
   ```
2. Update the frontmatter `name`, `description`, `tools`, and `model`.
   Use `sonnet` unless the agent needs deep reasoning (most don't).
3. Add the agent to the swarm in `agents/planning-orchestrator.md`
   Phase 5 — register it under the codebase-size bucket where it
   should fire (always / medium+ / large only).
4. Update the agent table in `CLAUDE.md` and `README.md` to keep the
   doc-consistency test green:
   ```bash
   npm test -- tests/lib/doc-consistency.test.mjs
   ```

### Switch the planning model

The default for `/trekbrief`, `/trekresearch`,
`/trekplan`, and `/trekexecute` is `opus` (deep
reasoning). To run on Sonnet for cost or latency, search-and-replace
the frontmatter in three files:

```bash
sed -i.bak 's/^model: opus$/model: sonnet/' \
  commands/trekbrief.md \
  commands/trekresearch.md \
  commands/trekplan.md \
  commands/trekexecute.md
```

The exploration agents stay on Sonnet — only the orchestrator is bumped.

### Disable external research

`/trekresearch --local` skips Tavily, Microsoft Learn, and the
Gemini bridge. To make `--local` the default, edit the front of
`commands/trekresearch.md` Phase 1 and flip the default branch
of the `--local` argument check. Or just always pass `--local` and
document it in your team's CLAUDE.md.

### Plugin data contract

The four commands write to a single project directory (`.claude/projects/{date}-{slug}/`).
The full schema for every artifact is in [docs/HANDOVER-CONTRACTS.md](docs/HANDOVER-CONTRACTS.md).
That document is the single source of truth for:

- File paths each command reads/writes
- Frontmatter schema for `brief.md`, `research/*.md`, `plan.md`
- `progress.json` schema
- Validator → handover mapping
- Versioning and breaking-change protocol

If you fork the plugin and change the schema for any artifact, update
that doc *and* the corresponding `lib/validators/*.mjs` *and* run
`npm test` — the validators and doc-consistency tests will catch
schema drift.

### Disable the architect bridge

`/trekplan` auto-discovers `architecture/overview.md` if an
opt-in upstream architect plugin (not bundled) produced one. To
suppress this, leave the `architecture/` directory absent from your
project directory. Discovery is additive — missing file is fine, no
error.

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md).

## License

[MIT](LICENSE)