# Subagent Delegation Audit — Main-Context Pressure Analysis
**Status:** Exploratory brief — findings + options, not a decision
**Date:** 2026-04-19
**Scope:** trekplan v2.3.2, all six user-facing commands
## Problem
Main context fills up quickly during trekplan runs. The plugin's
design principle is Context Engineering — the main context should
**orchestrate**, subagents should **execute**. In practice, the exploration
phases do delegate aggressively, but the **synthesis and writing phases
remain inline**, which is where the bulk of heavy reading and reasoning
actually happens.
## Verified findings
### 1. Exploration is already well-delegated
Agent-spawn density per command (nominal):
| Command | Agents spawned |
|--------------------------|-------------------------------------------------------------------|
| trekresearch | ~9-14 (5 local + 4 external + 1 bridge + up to 2 follow-ups) |
| trekplan | ~10 (6 initial + conditional research-scout + up to 3 deep-dives) |
| trekbrief | 1-3 (brief-reviewer per iteration, max 3) |
| trekexecute | 0 (explicit no-agent rule) |
| voyage-skill-author-local | 3 (concept-extractor → skill-drafter → ip-hygiene-checker) |

This part is healthy.
### 2. Synthesis and writing are inline
The main context does the heavy cognitive work after swarm completion:
- **`commands/trekplan.md:483-498` (Phase 7 Synthesis):**
"Read all agent results carefully" + "Build a mental model of the codebase
architecture" + "Catalog reusable code" + "Integrate research findings".
This forces 6-10 agent outputs to remain resident in main context simultaneously.
- **`commands/trekplan.md:499-548` (Phase 8 Deep Planning):**
Main context writes the entire plan.md from scratch, including all required
sections, quality standards, and file-path validation.
- **`commands/trekresearch.md:302-323` (Phase 6 Triangulation):**
Explicitly labelled "the KEY phase that makes trekresearch more than
aggregation". Dimension-by-dimension comparison of local vs external
findings, contradiction flagging, confidence rating — all inline.
- **`commands/trekresearch.md:325-341` (Phase 7 Synthesis):**
Writes the research brief inline using the template.
### 3. Root cause — v2.4.0 foreground migration
Each command carries a `> **Why foreground?**` block
(`trekplan.md:330`, `trekresearch.md:192`) documenting that the
background orchestrators were removed because agents spawned from background
orchestrators silently degraded. The swarm-spawn logic was lifted into the
main context — but so was the synthesis logic the orchestrators used to
carry. The "summarizer" link is missing.
## Candidate interventions
Presented as options, ordered by estimated main-context savings. Numbers
are rough estimates based on the size of the phase bodies — not measured.
| # | Intervention | Target phase | Rough saving |
|---|---------------------------------------------------------------------|-------------------------------------|--------------|
| 1 | `synthesis-agent`: digests all exploration outputs into findings + reuse catalog + gaps | trekplan Phase 7 | 40-50% |
| 2 | `plan-writer-agent`: writes plan.md from synthesis + template | trekplan Phase 8 | part of #1 |
| 3 | `triangulation-synthesizer`: per-dimension local vs external diff + confidence rating | trekresearch Phase 6 | 25-30% |
| 4 | `research-brief-writer`: writes research brief from triangulation output | trekresearch Phase 7 | part of #3 |
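
The "rough saving" column can be read as a simple accounting exercise. The sketch below is a minimal model, not a measurement. It assumes that exploration reports reach the synthesis agent without passing through main context (for example via files on disk), which this brief does not specify. `PhaseModel`, the function names, and every token count are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhaseModel:
    """Illustrative sizes in tokens; none of these are measured values."""
    baseline_tokens: int    # command prompt, brief, conversation so far
    agent_outputs: int      # number of exploration reports returned
    tokens_per_output: int  # assumed average report size
    digest_tokens: int      # assumed size of the synthesis digest


def inline_footprint(m: PhaseModel) -> int:
    """Main context holds every agent report while it synthesizes."""
    return m.baseline_tokens + m.agent_outputs * m.tokens_per_output


def delegated_footprint(m: PhaseModel) -> int:
    """Main context holds only the digest returned by the synthesis agent."""
    return m.baseline_tokens + m.digest_tokens


def saving_fraction(m: PhaseModel) -> float:
    """Share of the inline footprint that delegation would remove."""
    inline = inline_footprint(m)
    return (inline - delegated_footprint(m)) / inline


if __name__ == "__main__":
    model = PhaseModel(baseline_tokens=20_000, agent_outputs=8,
                       tokens_per_output=3_000, digest_tokens=4_000)
    print(f"inline:    {inline_footprint(model)} tokens")
    print(f"delegated: {delegated_footprint(model)} tokens")
    print(f"saving:    {saving_fraction(model):.0%}")
```

The model makes one dependency explicit: the saving shrinks as the baseline grows, so the percentage depends on how much is already resident when synthesis starts.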
## Tradeoffs (important)
- **Iteration friction.** A synthesis- or writer-agent does not see the
live conversation. If the user wants to push back on the plan ("split
step 3 in two", "re-phrase the risks"), refinement still has to happen
in main context. Delegation works best for the first pass; the revision
loop is harder to delegate.
- **Adversarial review still needs main.** `plan-critic` and
`scope-guardian` already return findings to main context — which then
has to act on them. If the plan was written by an agent, main must
either re-invoke the writer agent with critic feedback, or absorb the
plan back in to revise it. Neither is free.
- **Artifact quality gates.** The current inline phases enforce
quality rules (e.g., "every file path must exist in the codebase").
A writer-agent needs the same codebase context the exploration agents
had — re-delivering that context to the writer burns tokens the
delegation was meant to save.
- **Debuggability.** Inline synthesis is inspectable in the transcript.
Agent-synthesis hides the reasoning inside the agent's return message —
fine when it works, harder to diagnose when it doesn't.
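
The "re-invoke the writer agent with critic feedback" option from the adversarial-review bullet can be sketched as a bounded loop. This is a minimal sketch, not the plugin's behavior: `Writer`, `Critic`, and `revise_with_writer` are hypothetical names, agents are modelled as plain Python callables, and the default cap of three rounds is an assumption.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-ins for agent invocations. How the plugin actually
# spawns agents is not modelled here.
Writer = Callable[[str, List[str]], str]  # (synthesis, feedback) -> plan text
Critic = Callable[[str], List[str]]       # plan text -> outstanding findings


def revise_with_writer(synthesis: str, writer: Writer, critic: Critic,
                       max_rounds: int = 3) -> Tuple[str, int, List[str]]:
    """Re-invoke the writer with critic findings until none remain.

    Returns (plan, writer invocations used, findings still outstanding).
    """
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    feedback: List[str] = []
    plan = ""
    for round_no in range(1, max_rounds + 1):
        plan = writer(synthesis, feedback)
        feedback = critic(plan)
        if not feedback:
            return plan, round_no, []
    # Cap reached: the caller still has to deal with what is left.
    return plan, max_rounds, feedback


def toy_writer(synthesis: str, feedback: List[str]) -> str:
    """Toy writer: appends one line per finding it was asked to address."""
    lines = [f"PLAN from: {synthesis}"]
    lines += [f"- addressed: {item}" for item in feedback]
    return "\n".join(lines)


def toy_critic(plan: str) -> List[str]:
    """Toy critic: insists on a rollback step until the plan mentions one."""
    if "addressed: missing rollback step" in plan:
        return []
    return ["missing rollback step"]


if __name__ == "__main__":
    plan, rounds, open_findings = revise_with_writer(
        "compact findings", toy_writer, toy_critic)
    print(rounds, open_findings)
    print(plan)
```

Each round costs a full writer invocation, and findings left over when the cap is reached fall back to main context, which is the "neither is free" point above. The plan is a return value here only to keep the sketch runnable; in a real run it would presumably live on disk.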
## Recommendation (tentative)
If only one change is made, **intervention #1 (synthesis-agent for
trekplan Phase 7)** has the largest ROI. It isolates the heaviest read
(all 6-10 agent outputs) behind a summarizer, and its output, a compact
findings document, is small enough to keep resident for Phase 8 planning
and Phase 9 review.

Intervention #3 is a smaller-scope, lower-risk proof of concept
that could validate the pattern before touching the main planner.
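
One possible shape for the "compact findings document" is sketched below, using the three parts named in the interventions table (findings, reuse catalog, gaps). The class name, field names, and Markdown layout are assumptions for illustration; the brief does not define a schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SynthesisArtifact:
    """Hypothetical digest a synthesis agent could hand back to main context."""
    findings: List[str] = field(default_factory=list)
    reuse_catalog: List[str] = field(default_factory=list)
    gaps: List[str] = field(default_factory=list)

    def to_markdown(self) -> str:
        """Render the digest as a small Markdown document."""
        sections = [
            ("Findings", self.findings),
            ("Reuse catalog", self.reuse_catalog),
            ("Gaps", self.gaps),
        ]
        lines = ["# Synthesis"]
        for title, items in sections:
            lines.append(f"## {title}")
            if items:
                lines.extend(f"- {item}" for item in items)
            else:
                lines.append("- (none)")
        return "\n".join(lines) + "\n"


if __name__ == "__main__":
    artifact = SynthesisArtifact(
        findings=["example finding"],
        gaps=["example gap"],
    )
    print(artifact.to_markdown())
```

Rendering empty sections explicitly as "(none)" is a deliberate choice in this sketch: a reviewer can tell "nothing found" apart from "section dropped".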
## Open questions
1. Should the synthesis-agent write to disk (`synthesis.md` alongside
`plan.md`) for inspectability, or return in-memory?
2. Does the adversarial review phase (plan-critic + scope-guardian) need
access to the full exploration outputs, or is the synthesis artifact
enough?
3. Is there a way to measure current main-context usage per phase so the
savings estimates above can be replaced with real numbers before
committing to changes?
4. Does this interact with `REMEMBER.md`'s note that "trekplan schema-drift
on 4.7 produces Phase-plans instead of v1.7 step-schema"? A writer-agent
might either help (isolated, more controllable) or hurt (another layer
where drift can happen) the schema-drift problem.
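
For question 3, a first approximation could be computed offline from a transcript. The sketch below assumes a hypothetical transcript already split into `(phase, text)` pairs and uses the common rule of thumb of about four characters per token for English text. It is not a real tokenizer, and the names are illustrative.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

CHARS_PER_TOKEN = 4  # rough heuristic for English text, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate token count by character length, rounded up."""
    return (len(text) + CHARS_PER_TOKEN - 1) // CHARS_PER_TOKEN


def tokens_per_phase(entries: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Sum estimated tokens for each phase label in a transcript."""
    totals: Dict[str, int] = defaultdict(int)
    for phase, text in entries:
        totals[phase] += estimate_tokens(text)
    return dict(totals)


if __name__ == "__main__":
    # Hypothetical transcript: the phase labels and contents are made up.
    transcript = [
        ("phase-7-synthesis", "agent report " * 500),
        ("phase-7-synthesis", "another agent report " * 300),
        ("phase-8-planning", "plan draft " * 200),
    ]
    for phase, total in sorted(tokens_per_phase(transcript).items()):
        print(f"{phase}: ~{total} tokens")
```

Estimates like this are only good for comparing phases against each other. Replacing the heuristic with the model's actual tokenizer would be needed before quoting absolute numbers.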
## Out of scope for this brief
- Implementation details of the new agents
- Changes to trekexecute (no-agent by design)
- Changes to trekbrief Phase 3 interview (must be inline to drive
user dialogue)