ktg-plugin-marketplace/plugins/ultraplan-local/docs/subagent-delegation-audit.md
Kjell Tore Guttormsen 14ecda886c feat(voyage)!: bulk content rewrite ultra -> voyage/trek prose [skip-docs]
Sed pipeline (16 patterns, longest-match-first) sweeps residual ultra* matches
in prose, command narrative, agent prompts, hook comments, and doc prose.

Pipeline extensions from the V4 prompt:
- BSD syntax [[:<:]]ultra[[:>:]] instead of \bultra\b (BSD sed lacks \b)
- 6 compound patterns for ultraplan/ultraexecute/ultraresearch/ultrabrief/
  ultrareview/ultracontinue without the -local suffix
- ultra*-stats glob -> trek*-stats glob
- Line exclusion reduced to ultra-cc-architect (Q8); the session-state
  exclusion was over-protective
- File exclusion extended to settings.json, package.json, plugin.json, and
  the entire .claude/ tree (gitignored + V5 territory)

Q8 exception held: architecture-discovery.mjs + project-discovery.mjs untouched.
Filename convention held: .session-state.local.json + *.local.* preserved.

Manual narrative fix: tests/lib/agent-frontmatter.test.mjs line 10 was
mangled from "/ultra*-local" to "/voyage*-local" (no such command exists);
corrected to "/trek*".

Residuals out of scope (V5 handles them): package.json + .claude-plugin/
plugin.json (Step 12-14 version bump). .claude/* is gitignored
spec history with intentional BEFORE/AFTER narrative.

Part of voyage-rebrand session 3 (Wave 4 / Step 10).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-05 15:08:20 +02:00


Subagent Delegation Audit — Main-Context Pressure Analysis

Status: Exploratory brief (findings + options, not a decision)
Date: 2026-04-19
Scope: trekplan v2.3.2, all six user-facing commands

Problem

Main context fills up quickly during trekplan runs. The plugin's design principle is Context Engineering — the main context should orchestrate, subagents should execute. In practice, the exploration phases do delegate aggressively, but the synthesis and writing phases remain inline, which is where the bulk of heavy reading and reasoning actually happens.

Verified findings

1. Exploration is already well-delegated

Agent-spawn density per command (nominal):

| Command | Agents spawned |
|---|---|
| trekresearch | ~9-14 (5 local + 4 external + 1 bridge + up to 2 follow-ups) |
| trekplan | ~10 (6 initial + conditional research-scout + up to 3 deep-dives) |
| trekbrief | 1-3 (brief-reviewer per iteration, max 3) |
| trekexecute | 0 (explicit no-agent rule) |
| voyage-skill-author-local | 3 (concept-extractor → skill-drafter → ip-hygiene-checker) |

This part is healthy.

2. Synthesis and writing is inline

The main context does the heavy cognitive work after swarm completion:

  • commands/trekplan.md:483-498 (Phase 7 Synthesis): "Read all agent results carefully" + "Build a mental model of the codebase architecture" + "Catalog reusable code" + "Integrate research findings". This forces 6-10 agent outputs to remain resident in main context simultaneously.

  • commands/trekplan.md:499-548 (Phase 8 Deep Planning): Main context writes the entire plan.md from scratch, including all required sections, quality standards, and file-path validation.

  • commands/trekresearch.md:302-323 (Phase 6 Triangulation): Explicitly labelled "the KEY phase that makes trekresearch more than aggregation". Dimension-by-dimension comparison of local vs external findings, contradiction flagging, and confidence rating are all done inline.

  • commands/trekresearch.md:325-341 (Phase 7 Synthesis): Writes the research brief inline using the template.

3. Root cause — v2.4.0 foreground migration

Each command carries a > **Why foreground?** block (trekplan.md:330, trekresearch.md:192) documenting that the background orchestrators were removed because agents spawned from them silently degraded. The swarm-spawn logic was lifted into the main context, but so was the synthesis logic the orchestrators used to carry. The "summarizer" link is missing: nothing now condenses the agent outputs before they land in main context.

Candidate interventions

Presented as options, ordered by estimated main-context savings. Numbers are rough estimates based on the size of the phase bodies — not measured.

| # | Intervention | Target phase | Rough saving |
|---|---|---|---|
| 1 | synthesis-agent: digests all exploration outputs into findings + reuse catalog + gaps | trekplan Phase 7 | 40-50% |
| 2 | plan-writer-agent: writes plan.md from synthesis + template | trekplan Phase 8 | part of #1 |
| 3 | triangulation-synthesizer: per-dimension local vs external diff + confidence rating | trekresearch Phase 6 | 25-30% |
| 4 | research-brief-writer: writes research brief from triangulation output | trekresearch Phase 7 | part of #3 |
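The "rough saving" column can be sanity-checked with simple arithmetic: how much of the resident main context disappears when raw agent outputs are replaced by one compact summary. The sketch below is a back-of-envelope model, not a measurement. The function name and every token count in it are hypothetical placeholders chosen for illustration.

```python
def estimated_saving(agent_output_tokens, summary_tokens, other_resident_tokens):
    """Fraction of main-context tokens freed when raw agent outputs are
    replaced by a single summary. All inputs are estimates, not measurements."""
    raw = sum(agent_output_tokens)
    before = raw + other_resident_tokens             # inline synthesis
    after = summary_tokens + other_resident_tokens   # delegated synthesis
    if before == 0:
        return 0.0
    return (before - after) / before


# Hypothetical numbers: 8 agent outputs of 5,000 tokens each, a 4,000-token
# summary, and 40,000 tokens of other resident context (command text, chat).
saving = estimated_saving([5_000] * 8, 4_000, 40_000)
print(f"{saving:.0%}")  # prints 45%
```

The model makes one point visible: the saving depends as much on how large the rest of the resident context is as on how well the summary compresses the agent outputs.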

Tradeoffs (important)

  • Iteration friction. A synthesis- or writer-agent does not see the live conversation. If the user wants to push back on the plan ("split step 3 in two", "re-phrase the risks"), refinement still has to happen in main context. Delegation works best for the first pass; the revision loop is harder to delegate.

  • Adversarial review still needs main. plan-critic and scope-guardian already return findings to main context — which then has to act on them. If the plan was written by an agent, main must either re-invoke the writer agent with critic feedback, or absorb the plan back in to revise it. Neither is free.

  • Artifact quality gates. The current inline phases enforce quality rules (e.g., "every file path must exist in the codebase"). A writer-agent needs the same codebase context the exploration agents had — re-delivering that context to the writer burns tokens the delegation was meant to save.

  • Debuggability. Inline synthesis is inspectable in the transcript. Agent-synthesis hides the reasoning inside the agent's return message — fine when it works, harder to diagnose when it doesn't.

Recommendation (tentative)

If only one change is made, intervention #1 (synthesis-agent for trekplan Phase 7) has the largest ROI. It isolates the heaviest read (all 6-10 agent outputs) behind a summarizer, and its output, a compact findings document, is small enough to keep resident for Phase 8 planning and Phase 9 review.

Intervention #3 is a smaller-scope and lower-risk proof-of-concept that could validate the pattern before touching the main planner.
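As an illustration of what a "compact findings document" might look like, here is a minimal sketch of the artifact a synthesis-agent could hand back. The three sections mirror intervention #1 (findings, reuse catalog, gaps). The class name, field names, and Markdown layout are assumptions made for this example, not an existing trekplan schema.

```python
from dataclasses import dataclass, field


@dataclass
class SynthesisArtifact:
    """Hypothetical shape of what a synthesis-agent could return to main."""
    findings: list[str] = field(default_factory=list)
    reuse_catalog: list[str] = field(default_factory=list)
    gaps: list[str] = field(default_factory=list)

    def to_markdown(self) -> str:
        sections = [
            ("Findings", self.findings),
            ("Reuse catalog", self.reuse_catalog),
            ("Gaps", self.gaps),
        ]
        lines = []
        for title, items in sections:
            lines.append(f"## {title}")
            # Render an explicit placeholder so empty sections stay visible.
            lines.extend(f"- {item}" for item in (items or ["(none)"]))
            lines.append("")
        return "\n".join(lines).rstrip() + "\n"


artifact = SynthesisArtifact(findings=["Example finding (placeholder text)"])
print(artifact.to_markdown())
```

The same rendered text could be returned in-memory or written to disk; the sketch does not take a position on that choice.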

Open questions

  1. Should the synthesis-agent write to disk (synthesis.md alongside plan.md) for inspectability, or return in-memory?
  2. Does the adversarial review phase (plan-critic + scope-guardian) need access to the full exploration outputs, or is the synthesis artifact enough?
  3. Is there a way to measure current main-context usage per phase so the savings estimates above can be replaced with real numbers before committing to changes?
  4. Does this interact with REMEMBER.md's note that "trekplan schema-drift on 4.7 produces Phase-plans instead of v1.7 step-schema"? A writer-agent might either help (isolated, more controllable) or hurt (another layer where drift can happen) the schema-drift problem.
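For question 3, one possible starting point is a small log that accumulates a rough token estimate per phase and reports each phase's share of the total. This is a sketch under stated assumptions: the class and phase labels are hypothetical, and the 4-characters-per-token ratio is a crude heuristic rather than the model's tokenizer, so the output is only useful for relative comparison. How the text for each phase would be captured from a real session is not addressed here.

```python
from collections import defaultdict


class PhaseUsageLog:
    """Accumulates rough token estimates per phase.

    Assumption: ~4 characters per token (heuristic, not a real tokenizer).
    """

    CHARS_PER_TOKEN = 4

    def __init__(self):
        self._tokens = defaultdict(int)

    def record(self, phase: str, text: str) -> None:
        # Ceiling division so non-empty short texts count as at least one token.
        self._tokens[phase] += -(-len(text) // self.CHARS_PER_TOKEN)

    def shares(self) -> dict[str, float]:
        total = sum(self._tokens.values())
        if total == 0:
            return {}
        return {phase: n / total for phase, n in self._tokens.items()}


log = PhaseUsageLog()
log.record("phase-7-synthesis", "x" * 1200)  # placeholder text, ~300 tokens
log.record("phase-8-planning", "x" * 400)    # placeholder text, ~100 tokens
print(log.shares())
```

With the placeholder inputs above, synthesis accounts for three quarters of the recorded total. Real inputs would replace the savings estimates with measured proportions.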

Out of scope for this brief

  • Implementation details of the new agents
  • Changes to trekexecute (no-agent by design)
  • Changes to trekbrief Phase 3 interview (must be inline to drive user dialogue)