feat(voyage)!: marketplace handoff — rename plugins/ultraplan-local to plugins/voyage [skip-docs]

Session 5 of voyage-rebrand (V6). Operator-authorized cross-plugin scope.

- git mv plugins/ultraplan-local plugins/voyage (rename detected, history preserved)
- .claude-plugin/marketplace.json: voyage entry replaces ultraplan-local
- CLAUDE.md: voyage row in plugin list, voyage in design-system consumer list
- README.md: bulk rename ultra*-local commands -> trek* commands; ultraplan-local refs -> voyage; type discriminators (type: trekbrief/trekreview); session-title pattern (voyage:<command>:<slug>); v4.0.0 release-note paragraph
- plugins/voyage/.claude-plugin/plugin.json: homepage/repository URLs point to monorepo voyage path
- plugins/voyage/verify.sh: drop URL whitelist exception (no longer needed)

Closes voyage-rebrand. bash plugins/voyage/verify.sh PASS 7/7. npm test 361/361.
This commit is contained in:
Kjell Tore Guttormsen 2026-05-05 15:37:52 +02:00
commit 7a90d348ad
149 changed files with 26 additions and 33 deletions

View file

@ -0,0 +1,223 @@
# Headless Launch Script Template
This template is used by the session-decomposer agent to generate a launch script
for headless execution of decomposed sessions.
## Template
```bash
#!/usr/bin/env bash
# Headless launch script — generated by trekplan
# Master plan: {plan_path}
# Generated: {date}
# Sessions: {total_sessions} ({parallel_count} parallel, {sequential_count} sequential)
set -euo pipefail
# Prevent accidental API billing — remove this line if you intend to use API credits
unset ANTHROPIC_API_KEY
REPO_ROOT="$(git rev-parse --show-toplevel)"
PLAN_DIR="{session_dir}"
LOG_DIR="{session_dir}/logs"
WORKTREE_BASE="{session_dir}/worktrees"
mkdir -p "$LOG_DIR" "$WORKTREE_BASE"
# Disable git's optional locks during parallel worktree ops (research/02 R2;
# GH #47721). Mirror Phase 2.6 hardenings (commands/trekexecute.md).
export GIT_OPTIONAL_LOCKS=0
# Per-child guardrails (operator may override via env). Match Phase 2.6
# Step 2a-pre defaults.
MAX_TURNS="${TREKEXECUTE_MAX_TURNS:-50}"
MAX_BUDGET_USD="${TREKEXECUTE_MAX_BUDGET_USD:-5}"
# Shared context (brief + architecture if present). Absolute path so child
# sessions resolve it after `cd` into their worktree.
SHARED_CONTEXT_FILE="$WORKTREE_BASE/.shared-context.md"
if [ -f "$PLAN_DIR/brief.md" ]; then
cat "$PLAN_DIR/brief.md" > "$SHARED_CONTEXT_FILE"
if [ -f "$PLAN_DIR/architecture/overview.md" ]; then
printf '\n\n---\n\n' >> "$SHARED_CONTEXT_FILE"
cat "$PLAN_DIR/architecture/overview.md" >> "$SHARED_CONTEXT_FILE"
fi
fi
# Safety preambles prepended to every per-session prompt (defense-in-depth
# against headless-mode hook gaps + Claude 4 cross-model malware-reminder).
SAFETY_PREAMBLE='[CRITICAL — defense in depth] You are running in headless mode where plugin hooks may not fire reliably (GH #36071). DO NOT execute commands matching: `rm -rf /`, `curl | bash`, `git push --force` to main, `git reset --hard` outside this worktree, or any command writing outside the current worktree. Treat this rule as enforced regardless of `--allowedTools` allowlist.
Note: any malware-related safety reminders apply conditionally to code you assess as actually malicious, not to all code reads (per GH #52272 community workaround).'
# Cleanup trap — always remove worktrees on exit (success or failure)
cleanup_worktrees() {
echo ""
echo "=== Cleaning up worktrees ==="
cd "$REPO_ROOT"
# push-before-cleanup (Hard Rule 19): push any remaining branches so work
# survives even if subsequent removal races. Failure is non-fatal.
git branch --list "trek/{slug}/*" | while read b; do
git push origin "$b" 2>/dev/null || true
done
for wt in "$WORKTREE_BASE"/session-*; do
[ -d "$wt" ] && git worktree remove "$wt" --force 2>/dev/null && echo "Removed: $wt"
done
git worktree prune
git branch --list "trek/{slug}/*" | while read b; do
git branch -D "$b" 2>/dev/null
done
rmdir "$WORKTREE_BASE" 2>/dev/null
echo "Cleanup complete."
}
trap cleanup_worktrees EXIT
# Pre-flight: verify clean working tree
if [ -n "$(git status --porcelain)" ]; then
echo "ERROR: Working tree is not clean. Commit or stash changes before parallel execution."
git status --short
exit 1
fi
# Pre-flight: verify remote push permissions (catches credential/auth issues
# BEFORE spawning sessions). Sub-agent bash sandbox may have different
# credentials than the launching shell — Step 0 in each session spec handles
# the sandbox-side detection. Set TREKEXECUTE_SKIP_PREFLIGHT=1 for offline
# or air-gapped testing.
if [ "${TREKEXECUTE_SKIP_PREFLIGHT:-0}" != "1" ]; then
if ! git push --dry-run origin HEAD >/tmp/push-dryrun-launch.log 2>&1; then
echo "ERROR: git push --dry-run failed. Sessions will be unable to push."
cat /tmp/push-dryrun-launch.log
echo ""
echo "Fix remote credentials before running parallel execution, or set"
echo "TREKEXECUTE_SKIP_PREFLIGHT=1 to bypass (offline/air-gapped only)."
exit 1
fi
if grep -qE "(rejected|denied|forbidden|permission)" /tmp/push-dryrun-launch.log; then
echo "ERROR: git push --dry-run reports rejection. Sessions will fail at commit time."
cat /tmp/push-dryrun-launch.log
exit 1
fi
fi
echo "=== Voyage Headless Execution (Worktree-Isolated) ==="
echo "Plan: {plan_path}"
echo "Sessions: {total_sessions}"
echo "Repo root: $REPO_ROOT"
echo ""
# --- Wave {N}: Parallel sessions (no dependencies) ---
echo "--- Wave {N}: {description} ---"
{# For each parallel session in this wave, create worktree: }
git worktree add -b "trek/{slug}/session-{n}" "$WORKTREE_BASE/session-{n}" HEAD
echo "Worktree created: session-{n} (branch: trek/{slug}/session-{n})"
{# Launch session in its worktree (with safety preamble + budget caps + shared context): }
cd "$WORKTREE_BASE/session-{n}" && claude -p "${SAFETY_PREAMBLE}
$(cat "$PLAN_DIR/session-{n}-{slug}.md")" \
--allowedTools "Read,Write,Edit,Bash,Glob,Grep" \
--permission-mode bypassPermissions \
--max-turns "$MAX_TURNS" \
--max-budget-usd "$MAX_BUDGET_USD" \
--append-system-prompt-file "$SHARED_CONTEXT_FILE" \
> "$LOG_DIR/session-{n}.log" 2>&1 &
PID_{n}=$!
cd "$REPO_ROOT"
echo "Started session {n}: {title} (PID $PID_{n})"
{# After all parallel sessions in this wave: }
echo "Waiting for Wave {N} to complete..."
wait $PID_{n1} $PID_{n2}
echo "Wave {N} complete."
echo ""
# --- Merge wave results (sequential) ---
echo "--- Merging Wave {N} ---"
cd "$REPO_ROOT"
{# For each session in the wave: push BEFORE merge (Hard Rule 19 — push-before-cleanup). }
git push origin "trek/{slug}/session-{n}" 2>/dev/null || true
git merge --no-ff "trek/{slug}/session-{n}" \
-m "merge: trekplan session {n} — {title}"
if [ $? -ne 0 ]; then
echo "MERGE CONFLICT: session {n}. Conflicting files:"
git diff --name-only --diff-filter=U
git merge --abort
echo "Aborting. Earlier sessions in this wave are already merged."
exit 1
fi
git worktree remove "$WORKTREE_BASE/session-{n}" --force
git branch -d "trek/{slug}/session-{n}"
echo "Merged and cleaned: session {n}"
git worktree prune
# --- Verify wave results ---
echo "--- Verifying Wave {N} ---"
{# For each session in the wave, run its exit condition commands }
{verify_commands}
# --- Wave {N+1}: Sequential sessions (depends on previous wave) ---
{# Repeat wave pattern for dependent sessions }
echo ""
echo "=== All sessions complete ==="
echo "Review logs in $LOG_DIR/"
echo "Run final verification: {final_verify_command}"
```
## Rules for the session-decomposer
When generating a launch script from this template:
1. **Group sessions into waves** by dependency. Sessions with no dependencies
or whose dependencies are all in earlier waves can run in the same wave.
2. **Each wave waits for completion** before the next wave starts.
3. **Verification runs after each wave** — if verification fails, the script
stops and reports which session failed.
4. **Log each session** to a separate file for debugging.
5. **Use `claude -p`** with the session spec file as the prompt.
6. **Use `--allowedTools "Read,Write,Edit,Bash,Glob,Grep"`** with
`--permission-mode bypassPermissions` for child sessions. This limits the
tool surface to what the executor needs and prevents agent spawning, MCP
access, and external web requests in headless sessions.
7. **Final verification** at the end runs the master plan's verification section.
8. **Never include secrets** in the generated script.
9. **Wave verification must be independent.** After each wave completes, run
verification commands fresh via Bash — never parse session log files as proof
of success. Log files contain executor self-reporting, not ground truth. The
command's exit code is the only authoritative verification signal.
10. **Billing preamble.** Prepend `unset ANTHROPIC_API_KEY` with a comment at
the top of the script to prevent accidental API billing. Users who intend
to use API credits can remove this line.
11. **Worktree isolation is mandatory.** Every parallel wave MUST use git
worktrees. Each session gets its own worktree and branch. Never launch
parallel `claude -p` sessions in the same working directory.
12. **Cleanup trap on EXIT.** The generated script MUST include a `trap` on
EXIT that removes all worktrees (`git worktree remove --force`) and prunes
branches, even if the script fails or is interrupted.
13. **Sequential merge after each wave.** After all sessions in a wave complete,
merge their branches back to the main branch one at a time. Abort on merge
conflict — do not force-resolve.
14. **Clean working tree before worktrees.** Add a `git status --porcelain`
check at the top of the script. Fail if the working tree is dirty.
15. **Absolute paths for logs.** Log file paths must be absolute (resolved from
`$REPO_ROOT`), not relative to any worktree.
16. **Per-child guardrails (mirrors Phase 2.6 Step 2b).** Every `claude -p`
invocation must include `--max-turns "$MAX_TURNS"`,
`--max-budget-usd "$MAX_BUDGET_USD"`, and
`--append-system-prompt-file "$SHARED_CONTEXT_FILE"`. The shared context
must be built once with an absolute path (resolved from `$WORKTREE_BASE`)
so child sessions can read it after `cd`.
17. **Safety preamble.** Every per-session prompt must be prefixed with the
`$SAFETY_PREAMBLE` string defined at the top of the script. This is the
primary defense when plugin hooks do not fire reliably (GH #36071), and
includes the GH #52272 malware-reminder clarification for AUTO mode.
18. **GIT_OPTIONAL_LOCKS=0.** The script must export `GIT_OPTIONAL_LOCKS=0`
once at the top so every git invocation (worktree add/remove/prune,
branch -d, merge, push) avoids the index.lock background-poll race
(research/02 R2; GH #47721).
19. **push-before-cleanup (Hard Rule 19).** After successful `git merge --no-ff`,
run `git push origin <branch>` BEFORE `git worktree remove` and
`git branch -d`. Push failure is non-fatal — cleanup proceeds. Converts
unrecoverable branch loss into recoverable remote state (research/02 R3).

View file

@ -0,0 +1,259 @@
<!--
Optional YAML frontmatter — include ONLY when the plan was generated from a
`type: trekreview` input (Handover 6). Lists the 40-char hex IDs of the
BLOCKER + MAJOR findings consumed from `review.md`. Use block-style YAML;
the frontmatter parser does not support flow-style arrays.
Plans generated from a `type: brief` input omit this block entirely. No
plan_version bump — the field is additive and backwards compatible.
---
source_findings:
- 0123456789abcdef0123456789abcdef01234567
- fedcba9876543210fedcba9876543210fedcba98
---
-->
# {Task Title}
> **Plan quality: {grade}** ({score}/100) — {APPROVE | APPROVE_WITH_NOTES | REVISE | REPLAN}
>
> Generated by trekplan v{version} on {YYYY-MM-DD} — `plan_version: 1.7`
## Context
Why this change is needed. The problem or need it addresses, what prompted it,
and the intended outcome. Reference the spec file if one was used.
## Architecture Diagram
```mermaid
graph TD
subgraph "Changes in this plan"
%% C4-style component diagram showing what the plan touches
%% Highlight modified components, new components, and connections
end
```
*Replace with actual Mermaid diagram showing the components this plan modifies,
their relationships, and the data flow between them.*
## Codebase Analysis
- **Tech stack:** {languages, frameworks, build tools}
- **Key patterns:** {architecture patterns, conventions observed}
- **Relevant files:** {paths to files that will be read or modified}
- **Reusable code:** {existing functions, utilities, abstractions to leverage}
- **External tech (researched):** {technologies that were looked up via research-scout}
- **Recent git activity:** {relevant recent commits, active branches, code ownership}
## Research Sources
*Omit this section when no external research was conducted.*
| Technology | Source | Key Findings | Confidence |
|-----------|--------|--------------|------------|
| {name} | {URL} | {summary} | {high/med/low} |
## Implementation Plan
Each step targets 12 files and one focused change. Steps follow TDD structure
when the project has tests.
### Step 1: {description}
- **Files:** `path/to/file.ts`
- **Changes:** {exactly what to modify — no placeholders, no "update as needed"}
- **Reuses:** {existing function/pattern from codebase, with file path}
- **Test first:**
- File: `path/to/test.ts` *(existing | new)*
- Verifies: {what the test checks}
- Pattern: `path/to/existing-test.ts` *(follow this style)*
- **Verify:** `{exact command}` → expected: `{output}`
- **On failure:** {revert | retry | skip | escalate} — {specific instructions}
- **Checkpoint:** `git commit -m "{conventional commit message}"`
- **Manifest:**
```yaml
manifest:
expected_paths:
- path/to/file.ts
min_file_count: 1
commit_message_pattern: "^feat\\(scope\\):"
bash_syntax_check: []
forbidden_paths: []
must_contain: []
```
### Step 2: {description}
- **Files:** `path/to/file.ts`
- **Changes:** {exactly what to modify}
- **Reuses:** {existing function/pattern}
- **Test first:**
- File: `path/to/test.ts` *(existing | new)*
- Verifies: {what the test checks}
- Pattern: `path/to/existing-test.ts`
- **Verify:** `{exact command}` → expected: `{output}`
- **On failure:** {revert | retry | skip | escalate} — {specific instructions}
- **Checkpoint:** `git commit -m "{conventional commit message}"`
- **Manifest:**
```yaml
manifest:
expected_paths:
- path/to/file.ts
min_file_count: 1
commit_message_pattern: "^feat\\(scope\\):"
bash_syntax_check: []
forbidden_paths: []
must_contain:
- path: path/to/file.ts
pattern: "expected content marker"
```
*For projects without tests: omit "Test first" and keep "Verify" with a
concrete command (e.g., run the app, check output, curl an endpoint).*
### Manifest — objective completion predicate
Every step MUST have a Manifest block. This is the machine-checkable contract
that trekexecute verifies after the Verify command passes. A step is
not considered complete until its manifest verifies — regardless of Verify
command exit code.
- **expected_paths** — files that must exist after this step. Existing files
must be present in repo; new files must be marked `(new file)` in prose.
- **min_file_count** — minimum number of expected_paths that must exist.
Typically equal to `len(expected_paths)`.
- **commit_message_pattern** — regex that MUST match the HEAD commit message
after Checkpoint runs. Use escaped regex syntax (e.g., `\\(scope\\)`).
- **bash_syntax_check** — list of `.sh` files that must pass `bash -n`.
Auto-include any `.sh` in expected_paths.
- **forbidden_paths** — files this step must NOT modify (defense-in-depth
beyond Scope Fence).
- **must_contain** — optional grep assertions: `path` + `pattern` pairs that
must match in created/modified files.
### Failure recovery rules
- **On failure: revert** — undo this step's changes (`git checkout -- {files}`), do NOT proceed
- **On failure: retry** — attempt once more with the alternative approach described, then revert if still failing
- **On failure: skip** — this step is non-critical; continue to next step and note the skip
- **On failure: escalate** — stop execution entirely; the issue requires human judgment
- **Checkpoint** — after each step succeeds, commit changes so subsequent failures cannot corrupt completed work
## Alternatives Considered
| Approach | Pros | Cons | Why rejected |
|----------|------|------|--------------|
| {name} | ... | ... | ... |
## Test Strategy
- **Framework:** {test framework and runner}
- **Existing patterns:** {how tests are structured in this codebase}
- **New tests in this plan:** {N} tests across {N} steps
### Tests to write
| Type | File | Verifies | Model test |
|------|------|----------|------------|
| Unit | `path/to/test` | {what it tests} | `path/to/existing-test` |
*For projects without tests: describe manual verification approach instead.*
## Risks and Mitigations
| Priority | Risk | Location | Impact | Mitigation |
|----------|------|----------|--------|------------|
| {Critical/High/Medium/Low} | {description} | `file:line` | {what happens} | {how to handle} |
## Assumptions
*Things the planner could not verify from codebase or research. Each assumption
is a risk — review before executing.*
| # | Assumption | Why unverifiable | Impact if wrong |
|---|-----------|-----------------|-----------------|
| 1 | {what we assumed} | {why we couldn't check} | {what breaks} |
*If this list has 3+ items, the plan may need additional investigation
before execution.*
## Verification
*Per-step manifest verification runs automatically during execution (every
step's Manifest block is objectively checked by trekexecute before the
step is marked passed). This section is for end-to-end integration checks
that cross step boundaries — complete workflows, system-level behavior.*
- [ ] `{exact command}` → expected: `{exact output or behavior}`
- [ ] `{exact command}` → expected: `{exact output or behavior}`
## Estimated Scope
- **Files to modify:** {N}
- **Files to create:** {N}
- **Complexity:** {low | medium | high}
## Execution Strategy
*Include this section when the plan has more than 5 implementation steps.
Omit for small plans (≤ 5 steps) — trekexecute will run them sequentially
in a single session.*
*The execution strategy groups steps into sessions and organizes sessions
into waves. Sessions in the same wave can run in parallel. Sessions in
later waves depend on earlier waves completing first.*
### Session 1: {title}
- **Steps:** {step numbers, e.g., 1, 2, 3}
- **Wave:** {wave number}
- **Depends on:** {session numbers, or "none"}
- **Scope fence:**
- Touch: {files this session may modify}
- Never touch: {files reserved for other sessions}
### Session 2: {title}
- **Steps:** {step numbers}
- **Wave:** {wave number}
- **Depends on:** {session numbers, or "none"}
- **Scope fence:**
- Touch: {files}
- Never touch: {files}
### Execution Order
- **Wave 1:** {session list} (parallel)
- **Wave 2:** {session list} (after Wave 1)
### Grouping rules applied
- Steps sharing files → same session
- Steps in independent modules → separate sessions (parallelizable)
- 35 steps per session (target)
- Sessions ordered by dependency, waves by independence
## Plan Quality Score
| Dimension | Weight | Score | Notes |
|-----------|--------|-------|-------|
| Structural integrity | 0.15 | {0100} | {step ordering, dependencies} |
| Step quality | 0.20 | {0100} | {granularity, specificity, TDD} |
| Coverage completeness | 0.20 | {0100} | {spec → steps, no gaps} |
| Specification quality | 0.15 | {0100} | {no placeholders, clear criteria} |
| Risk & pre-mortem | 0.15 | {0100} | {failure modes addressed} |
| Headless readiness | 0.10 | {0100} | {On failure + Checkpoint per step} |
| Manifest quality | 0.05 | {0100} | {all steps have valid, checkable manifests} |
| **Weighted total** | **1.00** | **{score}** | **Grade: {A/B/C/D}** |
**Adversarial review:**
- **Plan critic:** {verdict — findings count by severity, key issues}
- **Scope guardian:** {verdict — ALIGNED / CREEP / GAP / MIXED}
## Revisions
*Added by adversarial review. Omit if no revisions were needed.*
| # | Finding | Severity | Resolution |
|---|---------|----------|------------|
| 1 | {what was wrong} | {blocker/major/minor} | {how it was fixed} |

View file

@ -0,0 +1,122 @@
---
type: trekresearch-brief
created: {YYYY-MM-DD}
question: "{research question}"
confidence: {0.0-1.0}
dimensions: {N}
mcp_servers_used: [{list}]
local_agents_used: [{list}]
external_agents_used: [{list}]
---
# {Research Question Title}
> Generated by trekresearch v{version} on {YYYY-MM-DD}
## Research Question
{The full research question as clarified during interview.}
## Executive Summary
{3 sentences maximum. The answer, the confidence level, and the key caveat.}
## Dimensions
*Each dimension represents one facet of the research question, explored by both
local and external agents. Confidence is rated per dimension.*
### {Dimension Name} -- Confidence: {high | medium | low | contradictory}
**Local findings:**
- {Finding with source citation (file path or agent name)}
**External findings:**
- {Finding with source citation (URL)}
**Contradictions:**
- {If local and external disagree, explain both sides with evidence.
Omit this sub-section if no contradictions exist for this dimension.}
*Repeat for each dimension.*
## Local Context
*Findings from codebase analysis agents. Omit sub-sections where no relevant
findings exist.*
### Architecture
{Architecture patterns, tech stack, relevant components from architecture-mapper}
### Dependencies
{Import chains, data flow, external integrations from dependency-tracer}
### Conventions
{Coding patterns, naming, test conventions from convention-scanner}
### History
{Recent changes, code ownership, hot files from git-historian}
## External Knowledge
*Findings from external research agents. Omit sub-sections where no relevant
findings exist.*
### Best Practice
{Official documentation, recommended patterns from docs-researcher}
### Alternatives
{Other approaches, competing solutions from community-researcher + contrarian-researcher}
### Security
{CVEs, audit history, supply chain risks from security-researcher}
### Known Issues
{Common pitfalls, gotchas, real-world problems from community-researcher}
## Gemini Second Opinion
*Independent research result from Gemini Deep Research. Provides a second
perspective for triangulation. Omit this section if gemini-bridge was not used
or was unavailable.*
{Gemini findings reformatted into key findings, sources cited, and areas of
agreement/disagreement with other agents.}
## Synthesis
*Cross-cutting insights that emerge from combining local and external knowledge.
This is NOT a summary of the sections above. It is NEW insight from triangulation
-- things that only become visible when local context meets external knowledge.*
{Example: "The codebase uses pattern X (local), but best practice has shifted to
pattern Y (external). However, our dependency on Z (local) makes a direct migration
impractical -- a hybrid approach using Y for new code while maintaining X for
existing modules is the pragmatic path."}
## Open Questions
*Things that remain unresolved after research. Each is a candidate for follow-up
research or an assumption to carry forward.*
- {Question 1 -- why it remains open}
- {Question 2 -- why it remains open}
## Recommendation
*If the research was decision-relevant, provide a concrete recommendation with
reasoning. If the research was exploratory (understanding, not deciding), omit
this section entirely.*
{Recommendation with rationale, citing specific findings from above.}
## Sources
| # | Source | Type | Quality | Used in |
|---|--------|------|---------|---------|
| 1 | {URL or codebase path} | {official / community / codebase / gemini} | {high / medium / low} | {dimension name} |
*Quality assessment:*
- **high** — official documentation, verified codebase analysis, peer-reviewed
- **medium** — reputable community source, well-maintained blog, established project
- **low** — unverified, outdated (>1 year), single-source claim, opinion piece

View file

@ -0,0 +1,155 @@
# Session {N}: {title}
> From master plan: {plan file path}
> Session {N} of {total sessions}
## Context
{Why this session exists. What it accomplishes within the larger plan.
Include enough background that an executor with no prior context can understand
the purpose and make judgment calls.}
## Dependencies
- **Depends on:** {Session M | "none — can run in parallel"}
- **Blocks:** {Session P | "none"}
- **Entry condition:** {what must be true before this session starts — e.g., "Session 2 committed and tests pass"}
## Scope Fence
- **Touch:** {explicit list of files this session may create or modify}
- **Never touch:** {files that belong to other sessions — hard boundary}
## Session Manifest
Machine-readable aggregate of all step manifests in this session. Used by
trekexecute for independent Phase 7.5 audit.
```yaml
session_manifest:
plan_version: "1.7"
legacy_synthesis: false # true if decomposer synthesized manifests from v1.6 plan
expected_paths: # union across all steps (deduplicated)
- {path from step N}
- {path from step M}
commit_count: {N} # number of implementation steps (excludes Step 0)
commit_message_patterns: # in step order; Step 0 omitted
- "^feat\\(scope\\):"
- "^fix\\(scope\\):"
bash_syntax_check: [] # union of step bash_syntax_check
scope_touch: [] # from Scope Fence Touch
scope_forbidden: [] # Never touch + union of step forbidden_paths
```
## Steps
### Step 0: Sandbox pre-flight (auto-generated — do not modify)
- **Files:** none (read-only test)
- **Changes:** verify git push permissions are available in this sandbox
- **Verify:**
```
git push --dry-run origin HEAD 2>&1 | tee /tmp/push-dryrun-$$.log; grep -qE "(rejected|error|denied|forbidden|permission)" /tmp/push-dryrun-$$.log && exit 77 || true
```
→ expected: non-77 exit code
- **On failure:** `escalate` — exit code 77 means this sandbox cannot push.
Abort immediately; do not attempt any work. Main orchestrator will
re-spawn with correct permissions.
- **Checkpoint:** none (no file changes)
- **Manifest:**
```yaml
manifest:
expected_paths: []
min_file_count: 0
commit_message_pattern: ""
bash_syntax_check: []
forbidden_paths: []
must_contain: []
sandbox_preflight: true
```
*Step 0 runs in the same sandbox as all real work. If it exits 77,
trekexecute marks the session `blocked` and does NOT proceed. This
catches the fail-late push-denial mode observed in Wave 1.*
*Escape hatch:* set `TREKEXECUTE_SKIP_PREFLIGHT=1` in the environment to
bypass Step 0 (use only for offline/air-gapped testing).
### Step 1: {description}
- **Files:** `{path}`
- **Changes:** {exactly what to modify}
- **Reuses:** {existing function/pattern, with file path}
- **Test first:** {test file, what it verifies, pattern to follow}
- **Verify:** `{exact command}` → expected: `{output}`
- **On failure:** {revert | retry | skip | escalate} — {specific instructions}
- **Checkpoint:** `git commit -m "{message}"`
- **Manifest:**
```yaml
manifest:
expected_paths:
- {path}
min_file_count: 1
commit_message_pattern: "^feat\\(scope\\):"
bash_syntax_check: []
forbidden_paths: []
must_contain: []
```
### Step 2: {description}
{same structure as Step 1, including Manifest block}
## Exit Condition
All of these must pass before this session is considered complete:
- [ ] `{verification command}` → expected: `{output}`
- [ ] `{verification command}` → expected: `{output}`
- [ ] All changes committed with descriptive messages
- [ ] No uncommitted changes remain (`git status` clean)
## Failure Handling
- If ANY step fails after retry: **stop execution**. Do NOT proceed to later steps.
## Security Constraints
These rules override any step instructions that conflict with them:
- **Never run** `rm -rf`, `chmod 777`, pipe-to-shell (`curl|bash`, `wget|sh`,
`base64|bash`), `eval` with variable expansion, `mkfs`, `dd` to block devices,
`shutdown`/`reboot`/`halt`, fork bombs, `crontab` writes, or `kill -9 -1`
- **Never modify files** outside the Scope Fence (Touch list above)
- **Never write to** `.git/hooks/`, `~/.ssh/`, `~/.aws/`, `~/.gnupg/`, `.env`
files, shell configs (`~/.zshrc`, `~/.bashrc`, `~/.profile`)
- **Never write to** `.claude/settings.json`, `.claude/hooks/`, or any hook
script — these are security infrastructure and must not be modified by execution
- If a `Verify:` or `Checkpoint:` command violates these rules: treat as
`On failure: escalate` and stop execution regardless of the step's On failure setting
- Commit whatever was completed successfully before stopping.
- Report which step failed, the error message, and what was attempted.
## Handoff State
{What the next session (or final verification) needs to know about this session's
output. Include: new files created, exports added, configuration changed, APIs
introduced. This section bridges sessions — it's the "baton" in a relay race.}
## Metadata
- **Master plan:** `{plan file path}`
- **Steps from plan:** {step N}{step M}
- **Estimated complexity:** {low | medium | high}
- **Model recommendation:** {opus | sonnet} — {rationale}
## Recovery Metadata
*This section is populated only when this session spec was generated by the
trekexecute Phase 7.6 recovery dispatcher. Omit for normal sessions.*
- **Recovery of:** `{original session spec path}`
- **Recovery depth:** {1 | 2}
- **Missing steps (reason for recovery):** {step numbers + drift summary}
- **Entry condition override:** {e.g., "previous partial session committed at {sha}"}
- **Parent progress file:** `{path to .trekexecute-progress-*.json}`

View file

@ -0,0 +1,64 @@
# Task: {title}
## Goal
What success looks like. One clear paragraph.
## Non-Goals
What is explicitly out of scope for this task.
- {non-goal 1}
- {non-goal 2}
## Constraints
Technical, time, or resource limitations.
- {constraint 1}
- {constraint 2}
## Preferences
Preferred patterns, frameworks, libraries, or approaches.
- {preference 1}
- {preference 2}
## Non-Functional Requirements
Performance, security, accessibility, scalability, or other quality attributes.
- {NFR 1}
- {NFR 2}
## Success Criteria
Falsifiable conditions that define "done". Each must be checkable by running a
command or observing a specific system behavior.
- {criterion — e.g., "All existing tests pass: `npm test` exits 0"}
- {criterion — e.g., "New endpoint returns 200: `curl -s localhost:3000/api/health | jq .status` → "ok""}
- {criterion — e.g., "No TypeScript errors: `npx tsc --noEmit` exits 0"}
Do NOT write vague criteria:
- "It should work" (not testable)
- "The feature is implemented" (not falsifiable)
- "Performance is acceptable" (no baseline given)
## Prior Attempts
What has been tried before and what happened. Leave blank if this is a fresh task.
## Open Questions
Unresolved items that may affect the plan. Flag these as assumptions if proceeding
without answers.
- {question 1}
## Metadata
- **Created:** {YYYY-MM-DD}
- **Mode:** {interview | manual}
- **Source:** {trekplan interview | user-provided}

View file

@ -0,0 +1,157 @@
---
type: trekbrief
brief_version: 2.0
created: {YYYY-MM-DD}
task: "{one-line task description}"
slug: {slug}
project_dir: .claude/projects/{YYYY-MM-DD}-{slug}/
research_topics: {N}
research_status: pending # pending | in_progress | complete | skipped
auto_research: false # true if user opted into Claude-managed research
interview_turns: {N}
source: {interview | manual}
---
# Task: {title}
> Generated by `/trekbrief` on {YYYY-MM-DD}.
> This brief is the contract between requirements and planning. `/trekplan`
> reads it to produce the implementation plan. Every decision in the plan must
> trace back to content in this brief.
## Intent
*Why are we doing this? What is the motivation, user need, or strategic context?
3-5 sentences. Load-bearing for the plan — every implementation decision must
trace back to this intent.*
{Intent paragraph. Answers "why bother?".}
## Goal
*What does success look like concretely? What state will the system be in when
this is done? 1 paragraph. Specific enough to disagree with.*
{Goal paragraph.}
## Non-Goals
*What is explicitly out of scope? Prevents plan-critic and scope-guardian from
flagging gaps for things we deliberately do not do.*
- {non-goal 1}
- {non-goal 2}
## Constraints
*Technical, time, or resource limitations. Hard boundaries the plan must respect.*
- {constraint 1}
- {constraint 2}
## Preferences
*Preferred patterns, frameworks, libraries, or approaches. Soft constraints
(the plan may deviate with justification).*
- {preference 1}
- {preference 2}
## Non-Functional Requirements
*Performance, security, accessibility, scalability, or other quality attributes.
Quantified where possible.*
- {NFR 1 — e.g., "p95 response time < 200ms"}
- {NFR 2 — e.g., "Zero new npm dependencies"}
## Success Criteria
*Falsifiable, command-checkable conditions that define "done". Each must be
verifiable by running a specific command or observing a specific system behavior.*
- {criterion — e.g., "All existing tests pass: `npm test` exits 0"}
- {criterion — e.g., "New endpoint returns 200: `curl -s localhost:3000/api/health | jq .status``"ok"`"}
- {criterion — e.g., "No TypeScript errors: `npx tsc --noEmit` exits 0"}
Do NOT write vague criteria:
- "It should work" (not testable)
- "The feature is implemented" (not falsifiable)
- "Performance is acceptable" (no baseline given)
## Research Plan
*Explicit research topics that must be answered before `/trekplan` can
produce a high-confidence plan. Each topic is phrased as a research question ready
to feed into `/trekresearch`. Topics may be empty (N=0) for trivial tasks
where the codebase alone is sufficient context.*
{If research_topics = 0, write a single line: "No external research needed —
the codebase and this brief contain sufficient context for planning."}
### Topic 1: {Short title}
- **Why this matters:** {How the plan depends on this answer. Which steps or
decisions cannot be made confidently without it.}
- **Research question:** "{Exact question to feed to /trekresearch.
One sentence, ends in `?`.}"
- **Suggested invocation:** `/trekresearch --project {project_dir} --external "{question}"`
- **Required for plan steps:** {which kinds of steps will consume this — e.g.,
"migration strategy", "library selection", "threat model"}
- **Confidence needed:** {high | medium | low}
- **Estimated cost:** {quick — inline research | standard — agent swarm | deep — with contrarian + gemini}
- **Scope hint:** {local | external | both}
### Topic 2: {Short title}
- **Why this matters:** ...
- **Research question:** "..."
- **Suggested invocation:** `/trekresearch --project {project_dir} ...`
- **Required for plan steps:** ...
- **Confidence needed:** ...
- **Estimated cost:** ...
- **Scope hint:** ...
## Open Questions / Assumptions
*Things still uncertain after the interview. These are carried as `[ASSUMPTION]`
entries into the plan and flagged to the user for review.*
- {question or assumption 1}
- {question or assumption 2}
## Prior Attempts
*What has been tried before and what happened. Leave blank for fresh tasks.
Prior attempts are load-bearing — they prevent the plan from repeating known
failures.*
{Prior attempts narrative, or "None — fresh task."}
## Metadata
- **Created:** {YYYY-MM-DD}
- **Interview turns:** {N}
- **Auto-research opted in:** {yes | no}
- **Source:** {trekbrief interview | manual}
---
## How to continue
Manual (default):
```bash
# Run each research topic (order does not matter):
/trekresearch --project {project_dir} --external "{Topic 1 question}"
/trekresearch --project {project_dir} --external "{Topic 2 question}"
# Then plan:
/trekplan --project {project_dir}
# Then execute:
/trekexecute --project {project_dir}
```
Auto (opt-in during `/trekbrief`): research and planning run
automatically; only execution is manual.

View file

@ -0,0 +1,138 @@
---
type: trekreview
review_version: "1.0"
created: {YYYY-MM-DD}
task: "{Task description from brief.md}"
slug: {project-slug}
project_dir: .claude/projects/{YYYY-MM-DD}-{slug}/
brief_path: .claude/projects/{YYYY-MM-DD}-{slug}/brief.md
scope_sha_start: {sha-from-progress.json/session_start_sha-OR-null-if-mtime-fallback}
scope_sha_end: {sha-of-HEAD-at-review-time}
reviewed_files_count: {N}
findings:
- 0123456789abcdef0123456789abcdef01234567
- fedcba9876543210fedcba9876543210fedcba98
---
# Review: {Task description}
## Executive Summary
Two-to-four sentences: how was the brief honored, what is the verdict
(BLOCK / WARN / ALLOW), and what is the most important finding the user
should look at first.
## Coverage
| File | Treatment | Reason |
|------|-----------|--------|
| lib/foo.mjs | deep-review | matched deep-review pattern |
| lib/bar.mjs | summary-only | low-risk, no test patterns matched |
| dist/bundle.js | skip | matches generated-file pattern |
| commands/baz.md `[uncommitted]` | deep-review | working-tree change since session_start_sha |
> **`[uncommitted]` annotation** appears in the treatment column for files
> in the working tree (uncommitted at review time). This is a brief-level
> contract — see `brief.md` Assumptions section.
## Findings (BLOCKER)
### {finding-id-1-40-char-hex}
- file: lib/foo.mjs
- line: 42
- rule_key: BROKEN_SUCCESS_CRITERION
- brief_ref: SC3 — "review.md is parseable as input to /trekplan"
- title: Plan-validator rejects review.md when source_findings is flow-style
- detail: The validator at lib/validators/plan-validator.mjs:N reads
`source_findings` via parseDocument(), which does not support flow-style
YAML arrays. The fixture review-run-A.md uses flow-style — Handover 6
is broken end-to-end.
- recommended_action: Update template to use block-style YAML, regenerate
fixtures, add explicit test in tests/lib/source-findings.test.mjs.
## Findings (MAJOR)
### {finding-id-2-40-char-hex}
- file: agents/code-correctness-reviewer.md
- line: 34
- rule_key: MISSING_BRIEF_REF
- brief_ref: SC1 — "Every BLOCKER/MAJOR finding has rationale_anchor"
- title: Agent prompt does not require brief_ref in output JSON
- detail: The trailing JSON block in the agent prompt does not list
brief_ref as a required field. Findings emitted by this agent will fail
review-validator strict mode.
- recommended_action: Add `brief_ref` to the required-fields list in the
prompt's JSON template.
## Findings (MINOR)
### {finding-id-3-40-char-hex}
- file: lib/parsers/finding-id.mjs
- line: 18
- rule_key: MISSING_ERROR_HANDLING
- brief_ref: NFR — "Token budget honesty"
- title: TypeError thrown without surrounding context
- detail: When called with bad input, throws bare TypeError. Caller has no
way to know which field was malformed — error message is informative but
the error itself has no `cause` chain.
- recommended_action: Optional improvement: wrap error.cause with the
composite input that caused the throw.
## Findings (SUGGESTION)
### {finding-id-4-40-char-hex}
- file: README.md
- line: 24
- rule_key: PLACEHOLDER_IN_CODE
- brief_ref: Constraint — "Path-guard respect"
- title: TODO comment about cookie path
- detail: README mentions a TODO about cookie regeneration. Not a code
bug but worth noting for v1.1 cleanup.
- recommended_action: Track in TODO.md if not already.
## Remediation Summary
- 1 BLOCKER → must address before next plan iteration
- 1 MAJOR → should address before next plan iteration
- 1 MINOR → nice-to-have for v1.1
- 1 SUGGESTION → log and move on
If running `/trekplan --brief review.md`, the planner will consume
the BLOCKER + MAJOR findings as plan goals (their `recommended_action`
becomes the step intent). MINOR + SUGGESTION are skipped for v1.0
plan-input.
```json
{
"verdict": "BLOCK",
"counts": { "BLOCKER": 1, "MAJOR": 1, "MINOR": 1, "SUGGESTION": 1 },
"findings": [
{
"id": "0123456789abcdef0123456789abcdef01234567",
"severity": "BLOCKER",
"rule_key": "BROKEN_SUCCESS_CRITERION",
"file": "lib/foo.mjs",
"line": 42,
"brief_ref": "SC3",
"title": "Plan-validator rejects review.md when source_findings is flow-style",
"detail": "The validator ...",
"recommended_action": "Update template to use block-style YAML ..."
},
{
"id": "fedcba9876543210fedcba9876543210fedcba98",
"severity": "MAJOR",
"rule_key": "MISSING_BRIEF_REF",
"file": "agents/code-correctness-reviewer.md",
"line": 34,
"brief_ref": "SC1",
"title": "Agent prompt does not require brief_ref in output JSON",
"detail": "The trailing JSON block ...",
"recommended_action": "Add brief_ref to the required-fields list ..."
}
]
}
```