Kjell Tore Guttormsen 14ecda886c feat(voyage)!: bulk content rewrite ultra -> voyage/trek prose [skip-docs]

Sed pipeline (16 patterns, longest-match-first) sweeps residual ultra* hits
in prose, command narrative, agent prompts, hook comments, doc prose.

Pipeline extensions from the V4 prompt:
- BSD syntax [[:<:]]ultra[[:>:]] instead of \bultra\b (BSD sed lacks \b)
- 6 compound patterns for ultraplan/ultraexecute/ultraresearch/ultrabrief/
  ultrareview/ultracontinue without the -local suffix
- ultra*-stats glob -> trek*-stats glob
- Line exclusion reduced to ultra-cc-architect (Q8); the session-state
  exclusion was over-protective
- File exclusion extended to settings.json, package.json, plugin.json,
  the entire .claude/ tree (gitignored + V5 territory)

Q8 exception held: architecture-discovery.mjs + project-discovery.mjs untouched.
Filename convention held: .session-state.local.json + *.local.* preserved.

Manual narrative fix: tests/lib/agent-frontmatter.test.mjs line 10
mangled "/ultra*-local" to "/voyage*-local" (no such command exists);
corrected to "/trek*".

Residuals out of scope (handled by V5): package.json + .claude-plugin/
plugin.json (Step 12-14 version bump). .claude/* is gitignored
spec history with intentional BEFORE/AFTER narrative.

Part of voyage-rebrand session 3 (Wave 4 / Step 10).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

2026-05-05 15:08:20 +02:00

6.3 KiB


Subagent Delegation Audit — Main-Context Pressure Analysis

Status: Exploratory brief (findings + options, not a decision)
Date: 2026-04-19
Scope: trekplan v2.3.2, all six user-facing commands

Problem

Main context fills up quickly during trekplan runs. The plugin's design principle is Context Engineering — the main context should orchestrate, subagents should execute. In practice, the exploration phases do delegate aggressively, but the synthesis and writing phases remain inline, which is where the bulk of heavy reading and reasoning actually happens.

Verified findings

1. Exploration is already well-delegated

Agent-spawn density per command (nominal):

| Command | Agents spawned |
|---|---|
| trekresearch | ~9–14 (5 local + 4 external + 1 bridge + up to 2 follow-ups) |
| trekplan | ~10 (6 initial + conditional research-scout + up to 3 deep-dives) |
| trekbrief | 1–3 (brief-reviewer per iteration, max 3) |
| trekexecute | 0 (explicit no-agent rule) |
| voyage-skill-author-local | 3 (concept-extractor → skill-drafter → ip-hygiene-checker) |

This part is healthy.

2. Synthesis and writing are inline

The main context does the heavy cognitive work after swarm completion:

commands/trekplan.md:483–498 (Phase 7 Synthesis): "Read all agent results carefully" + "Build a mental model of the codebase architecture" + "Catalog reusable code" + "Integrate research findings". This forces 6–10 agent outputs to remain resident in main context simultaneously.
commands/trekplan.md:499–548 (Phase 8 Deep Planning): Main context writes the entire plan.md from scratch, including all required sections, quality standards, and file-path validation.
commands/trekresearch.md:302–323 (Phase 6 Triangulation): Explicitly labelled "the KEY phase that makes trekresearch more than aggregation". Dimension-by-dimension comparison of local vs external findings, contradiction flagging, confidence rating — all inline.
commands/trekresearch.md:325–341 (Phase 7 Synthesis): Writes the research brief inline using the template.

3. Root cause — v2.4.0 foreground migration

Each command carries a `> **Why foreground?**` block (trekplan.md:330, trekresearch.md:192) documenting that the background orchestrators were removed because agents spawned from them silently degraded. The swarm-spawn logic was lifted into the main context, but so was the synthesis logic the orchestrators used to carry, so the "summarizer" link is missing.

Candidate interventions

Presented as options, ordered by estimated main-context savings. Numbers are rough estimates based on the size of the phase bodies — not measured.

| # | Intervention | Target phase | Rough saving |
|---|---|---|---|
| 1 | `synthesis-agent`: digests all exploration outputs into findings + reuse catalog + gaps | trekplan Phase 7 | 40–50% |
| 2 | `plan-writer-agent`: writes plan.md from synthesis + template | trekplan Phase 8 | part of #1 |
| 3 | `triangulation-synthesizer`: per-dimension local vs external diff + confidence rating | trekresearch Phase 6 | 25–30% |
| 4 | `research-brief-writer`: writes research brief from triangulation output | trekresearch Phase 7 | part of #3 |
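
As a rough illustration of intervention #1, the sketch below models the delegation boundary in Python. Everything in it is hypothetical: `Synthesis`, `run_synthesis_agent`, and the `summarize` callable stand in for an agent invocation and are not part of the plugin. It only shows that main context hands over the raw exploration outputs and keeps the compact return value resident.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Synthesis:
    """Compact artifact returned to main context (hypothetical shape)."""
    findings: list[str] = field(default_factory=list)
    reuse_catalog: list[str] = field(default_factory=list)
    gaps: list[str] = field(default_factory=list)

    def resident_chars(self) -> int:
        # Size of what main context would have to keep after delegation.
        return sum(len(s) for s in self.findings + self.reuse_catalog + self.gaps)


def run_synthesis_agent(
    agent_outputs: dict[str, str],
    summarize: Callable[[dict[str, str]], Synthesis],
) -> Synthesis:
    """Hand all exploration outputs to a summarizer and return only its digest.

    `summarize` is a placeholder for the real agent call (an assumption).
    """
    return summarize(agent_outputs)


def toy_summarizer(agent_outputs: dict[str, str]) -> Synthesis:
    # Stand-in for an LLM agent: keeps the first line of each non-empty output.
    findings = [
        f"{name}: {text.splitlines()[0]}"
        for name, text in agent_outputs.items()
        if text
    ]
    return Synthesis(findings=findings)


if __name__ == "__main__":
    outputs = {f"agent-{i}": "headline\n" + "detail " * 500 for i in range(6)}
    raw = sum(len(t) for t in outputs.values())
    digest = run_synthesis_agent(outputs, toy_summarizer)
    print(f"raw={raw} chars, resident after delegation={digest.resident_chars()} chars")
```

Character counts are only a proxy for context pressure; the actual saving depends on how much of each agent output the real summarizer needs to preserve.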

Tradeoffs (important)

Iteration friction. A synthesis- or writer-agent does not see the live conversation. If the user wants to push back on the plan ("split step 3 in two", "re-phrase the risks"), refinement still has to happen in main context. Delegation works best for the first pass; the revision loop is harder to delegate.
Adversarial review still needs main. plan-critic and scope-guardian already return findings to main context — which then has to act on them. If the plan was written by an agent, main must either re-invoke the writer agent with critic feedback, or absorb the plan back in to revise it. Neither is free.
Artifact quality gates. The current inline phases enforce quality rules (e.g., "every file path must exist in the codebase"). A writer-agent needs the same codebase context the exploration agents had — re-delivering that context to the writer burns tokens the delegation was meant to save.
Debuggability. Inline synthesis is inspectable in the transcript. Agent-synthesis hides the reasoning inside the agent's return message — fine when it works, harder to diagnose when it doesn't.
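
The "re-invoke the writer agent with critic feedback" path from the second tradeoff can be sketched as a small loop. The names `revise_until_clean`, `writer`, and `critics` are illustrative assumptions, and the callables stand in for agent invocations; the sketch only shows where the extra round trips come from.

```python
from typing import Callable, Sequence

Writer = Callable[[str, Sequence[str]], str]  # (synthesis, feedback) -> plan text
Critic = Callable[[str], list[str]]           # plan text -> list of findings


def revise_until_clean(
    synthesis: str,
    writer: Writer,
    critics: Sequence[Critic],
    max_rounds: int = 3,
) -> tuple[str, list[str], int]:
    """Re-invoke a writer agent with critic findings until none remain.

    Returns (plan, unresolved_findings, writer_invocations). In this sketch
    main context only relays findings between critics and writer.
    """
    feedback: list[str] = []
    plan = ""
    for round_no in range(1, max_rounds + 1):
        plan = writer(synthesis, feedback)
        # Collect findings from every critic against the latest draft.
        feedback = [finding for critic in critics for finding in critic(plan)]
        if not feedback:
            return plan, [], round_no
    return plan, feedback, max_rounds
```

Each round costs one writer invocation plus one invocation per critic, which is the price the tradeoff above refers to. A cap such as `max_rounds` keeps a writer that cannot satisfy its critics from looping indefinitely.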

Recommendation (tentative)

If only one change is made, intervention #1 (synthesis-agent for trekplan Phase 7) has the largest estimated payoff. It isolates the heaviest read (all 6–10 agent outputs) behind a summarizer, and its output, a compact findings document, is small enough to keep resident for Phase 8 planning and Phase 9 review.

Intervention #3 is a smaller-scope, lower-risk proof of concept that could validate the pattern before touching the main planner.

Open questions

Should the synthesis-agent write to disk (synthesis.md alongside plan.md) for inspectability, or return in-memory?
Does the adversarial review phase (plan-critic + scope-guardian) need access to the full exploration outputs, or is the synthesis artifact enough?
Is there a way to measure current main-context usage per phase so the savings estimates above can be replaced with real numbers before committing to changes?
Does this interact with REMEMBER.md's note that "trekplan schema-drift on 4.7 produces Phase-plans instead of v1.7 step-schema"? A writer-agent might either help (isolated, more controllable) or hurt (another layer where drift can happen) the schema-drift problem.
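
On the measurement question, one low-cost starting point might be to estimate each phase's share from saved transcripts. The sketch below assumes a hypothetical export in which every main-context message is tagged with its phase, and it uses a crude four-characters-per-token heuristic instead of a real tokenizer, so the output is useful as relative proportions only.

```python
from collections import defaultdict

CHARS_PER_TOKEN = 4  # crude heuristic (assumption), not a real tokenizer


def phase_token_shares(messages: list[tuple[str, str]]) -> dict[str, tuple[int, float]]:
    """Estimate tokens and percentage share of main context per phase.

    `messages` is a list of (phase_label, text) pairs from a hypothetical
    transcript export; the labels are whatever that export uses.
    """
    tokens: dict[str, int] = defaultdict(int)
    for phase, text in messages:
        tokens[phase] += len(text) // CHARS_PER_TOKEN
    total = sum(tokens.values())
    return {
        phase: (count, round(100 * count / total, 1) if total else 0.0)
        for phase, count in tokens.items()
    }
```

Running this over a handful of real trekplan sessions would show whether Phase 7 and Phase 8 dominate as the estimates above suggest, before any agent is added.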

Out of scope for this brief

Implementation details of the new agents
Changes to trekexecute (no-agent by design)
Changes to trekbrief Phase 3 interview (must be inline to drive user dialogue)
