Step 10 of v3.4.1 plan.
commands/ultracontinue-local.md:
- New Phase 0.5 between Phase 0 and Phase 1 — terminal cleanup mode
triggered by parsed flags['--cleanup'] === true. Requires explicit
positional[0] (no "clean all"), no template placeholders in the Bash
invocation. Passes through to cleanupProject via inline ESM. Cleanup
never falls through to Phase 1/2/3/4.
- Phase 0 usage block updated to document --cleanup and --cleanup
--confirm forms alongside the legacy <project-dir> form.
tests/commands/ultracontinue.test.mjs:
- Test (Bug 4 prose) — Phase 0.5 header present, references
cleanupProject and flags['--cleanup'], appears between Phase 0 and
Phase 1 in document order, usage mentions --cleanup --confirm.
- Test (f-1) dry-run on completed project lists candidates without
deleting; both files still on disk.
- Test (f-2 + f-3) confirm-mode deletes both files; subsequent
invocation on the already-cleaned dir signals CLEANUP_NO_STATE_FILE
(deterministic terminal state, idempotent for operators).
Tests 355 -> 358 (+3).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Step 8 of v3.4.1 plan.
commands/ultracontinue-local.md:
- New Phase 1.5 between Phase 1 and Phase 2 — runs the
next-session-prompt-validator in --consistency mode when both candidates
exist (plugin-root + project-dir). Refuses on producer mismatch with
fresh candidates, downgrades stale candidate to a warning, downgrades
>24h wall-clock drift to a soft warning.
- Anti-substitution rule applies — paths emitted as concrete tokens, not
template placeholders.
lib/validators/next-session-prompt-validator.mjs:
- Sharpen NEXT_SESSION_PROMPT_PRODUCER_MISMATCH error message to include
the literal "produced_by" field name so consumers (and operators) can
trace the disagreement back to the YAML key.
tests/commands/ultracontinue.test.mjs:
- Test (Bug 3 prose) — Phase 1.5 header present, references validator,
appears between Phase 1 and Phase 2 in document order.
- Test (Bug 3 e) — tmp project dir with state file + two prompt files
with mismatched producers, both fresh relative to state.updated_at;
CLI consistency mode exits non-zero, JSON stdout surfaces
NEXT_SESSION_PROMPT_PRODUCER_MISMATCH with both paths and the
"produced_by" token in the message.
Tests 346 -> 348 (+2).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Step 7 of v3.4.1 plan.
ultraplan-end-session-local Phase 3:
- Replace require()-of-ESM-module shim with node --input-type=module + import.
- Convert Phase 1 project enumeration to ESM as well so the file is uniformly
ESM (grep -c 'require(' commands/ultraplan-end-session-local.md → 0).
- Combined ESM block writes both .session-state.local.json (atomicWriteJson)
and sibling NEXT-SESSION-PROMPT.local.md (writeFileSync) so producers
succeed or fail together.
- Sibling markdown gets frontmatter: produced_by, produced_at, project.
ultraexecute-local Phases 8 / 2.55 / 4:
- Each phase that writes .session-state.local.json now also writes a sibling
NEXT-SESSION-PROMPT.local.md with frontmatter (produced_by:
ultraexecute-local, produced_at: ISO-8601, status). Phase 8 includes the
full ESM block; 2.55 / 4 reference the combined pattern.
- This is the producer side of the Bug 3 contract; consumer-side wire-up
(Phase 1.5 consistency check in /ultracontinue) lands in Step 8.
Tests: 346 green (no new tests this step — coverage comes via Step 8
integration test).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Step 5 of v3.4.1 hot-fix plan. Phase 2 of
commands/ultracontinue-local.md is rewritten to remove every curly-
brace template placeholder. The {state-file-path} substitution failure
caused the path-guard hook to crash on unresolved templates.
New Phase 2 structure:
2.a — Read the file with the Read tool (no Bash). Deterministic and
not subject to shell-substitution errors.
2.b — Schema-validate via the existing CLI shim, with the resolved
absolute path emitted as a literal string token by the model
at the time of the Bash call. Anti-substitution invariant:
STOP if about to emit any unresolved placeholder.
2.c — Interpret validator result (preserved verbatim from the
previous Phase 2 — three-way branch on valid + status).
Verification: grep -c "{state-file-path}" returns 0; full Phase 2
section contains no {lowercase-template} curly-brace placeholders.
Suite 322 -> 335 passing (+13: 7 from Step 1, 4 from Step 2, 2 from
Step 4).
Step 3 of v3.4.1 hot-fix plan. Three fixes in
commands/ultracontinue-local.md:
- Phase 0: replace "$ARGUMENTS contains --help or -h" with parsed-arg
dispatch via parseArgs(...,'ultracontinue'). Usage block fires only
when flags['--help'] === true OR positional[0] === '-h'. Empty,
whitespace, and project-dir args fall through to Phase 1
(auto-discovery), which is the operator-default invocation.
- Phase 1.a: NEW — reject .md positional arg with SC-2 diagnostic
("expected <project-dir>" + "did you mean to paste"). Operators
pasting a NEXT-SESSION-PROMPT.local.md path see a clear error
instead of a confusing fallthrough.
- Phase 1.b: auto-discovery node -e now emits {path, updated_at}
JSON per candidate; Phase 1 sorts numerically via
Date.parse(updated_at) DESC instead of lexicographic compare.
Newest in_progress wins, including across year-boundary timestamps.
All 4 Step 2 regression tests now green; full suite 322 → 333 passing.
Wire the main-merge-gate lifecycle event into commands/ultraexecute-local.md
Phase 8. Three event variants emitted via lib/stats/event-emit.mjs (S8):
- main-merge-gate fired at the gate boundary
- main-merge-approved fired on operator confirm
- main-merge-declined fired on operator decline (run recorded as partial)
The gate ALWAYS pauses regardless of gates_mode — it is the one always-on
boundary that --gates does not toggle. On decline, --resume re-enters at
the gate, and the wave session branches survive on the remote thanks to
Hard Rule 19's push-before-cleanup. Recovery surface is documented inline.
Pin in tests/lib/main-merge-gate.test.mjs locks the always-on prose, the
event names, and the recovery-surface contract.
Single autonomy-control surface (--gates) added to ultrabrief, ultraresearch,
ultraplan, and ultraexecute. When present, sets gates_mode = true and
re-enables approval pauses at every phase boundary + every wave for
high-stakes runs. When absent (default in auto), the chain runs continuously
to the main-merge gate (which always pauses regardless of --gates — that
boundary is the one always-on safety stop).
ultrabrief: pause after auto-mode confirmation; emit brief-approved event
ultraresearch: pause after each topic completes
ultraplan: pause after Phases 5, 7, 9
ultraexecute: pause after each wave's worktrees finish, before merge-back,
AND before the main-merge gate (MAIN_MERGE_GATE)
All four commands invoke the autonomy-gate state machine via the CLI shim
node lib/util/autonomy-gate.mjs (built in S8). Test pin in
tests/lib/gates-flag-coverage.test.mjs locks the contract.
Also wires the brief-approved stats emission into ultrabrief Phase 5 auto
path (was the SC4 wiring requirement from plan-v2 Step 11).
Strengthen single-message reinforcement for plan-critic + scope-guardian
parallel dispatch in commands/ultraplan-local.md Phase 9 and mirror in
agents/planning-orchestrator.md Phase 6. Reviewers now write structured JSON
to /tmp/{plan-critic,scope-guardian}-out.json which is merged via the
lib/review/plan-review-dedup.mjs CLI shim from S8.
The merged set lets us revise the plan once for duplicate findings instead
of twice. Source: research/05 R1 + R2.
Pin in tests/lib/doc-consistency.test.mjs locks both files against
single-message + dedup-helper regressions.
Inline STEP_HEADING_REGEX, FORBIDDEN_HEADING_REGEX, the canonical step+manifest
example, and the post-write plan-validator self-check directly into Phase 8 of
commands/ultraplan-local.md. This eliminates the dependency on Opus 4.7
implicitly loading agents/planning-orchestrator.md — the format contract now
travels with the command file itself.
Source: research/04 D5 + plan-v2 Step 7. Pin in tests/lib/doc-consistency.test.mjs
locks the substrings so future edits cannot silently regress the seal.
Tiny helper command for ad-hoc multi-session flows that don't run through
/ultraexecute-local. Writes .session-state.local.json so /ultracontinue
can resume in a fresh chat. Required args (next-brief-path, next-label) —
no inline prompt, headless-safe. Validates via session-state-validator
and prints the same 3-line narration that /ultracontinue Phase 3 uses
(SC-8 cross-project consistency).
Step 9 of /ultracontinue v3.3.0. README/CLAUDE updates land in Step 11.
Three insertions in commands/ultraexecute-local.md so every session-end
path produces or refreshes .session-state.local.json (Handover 7):
- Phase 2.55 (Check 1, line ~376): write status=stopped on dirty-tree
pre-flight stop before parallel session-spawn
- Phase 4 (line ~773): write status=stopped when entry condition fails
- Phase 8 (line ~1151): canonical convergence — every completed/failed/
stopped/partial run refreshes the state file using atomicWriteJson +
validator verification
Phase 2.3 (validate exit) and Phase 5 (dry-run) intentionally skip the
write — neither path is resumable. Validator errors warn but never block
the run; progress.json remains authoritative.
[skip-docs] rationale: README + CLAUDE.md updates land in Step 11.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Reads .claude/projects/<project>/.session-state.local.json (Handover 7),
narrates a 3-line summary, and immediately begins executing the next
session — no interactive confirmation, headless-safe.
Phases:
- 0: --help (self-documenting per brief NFR)
- 1: resolve project dir (auto-discover via node -e enumeration)
- 2: validate via session-state-validator
- 3: narrate (project / next_session_label / brief path)
- 4: read brief and begin
- 5: stats
[skip-docs] rationale: README + CLAUDE.md updates land in Step 11 (Session
2b) per plan structure. Step 8 (docs:) updates HANDOVER-CONTRACTS.md and
the doc-consistency test pin in the same session.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Step 1 of v2.0 plan. Hard cut from commands/ to skills/ per Anthropic
recommendation for new plugins. Frontmatter sets disable-model-invocation:
true and pins model: claude-sonnet-4-6. Docs (README, CLAUDE.md, root
README) deferred to Step 9 per plan.
Extract `/ultra-cc-architect-local` and `/ultra-skill-author-local` plus all 7
supporting agents, the `cc-architect-catalog` skill (13 files), the
`ngram-overlap.mjs` IP-hygiene script, and the skill-factory test fixtures
from `ultraplan-local` v2.4.0 into a new `ultra-cc-architect` plugin v0.1.0.
Why: ultraplan-local had drifted into containing two distinct domains — a
universal planning pipeline (brief → research → plan → execute) and a
Claude-Code-specific architecture phase. Keeping them together forced users
to inherit an unfinished CC-feature catalog (~11 seeds) when they only
wanted the planning pipeline, and locked the catalog and the pipeline into
the same release cadence. The architect was already optional and decoupled
at the code level — only one filesystem touchpoint remained
(auto-discovery of `architecture/overview.md`), which already handles
absence gracefully.
Plugin manifests:
- ultraplan-local: 2.4.0 → 3.0.0 (description + keywords updated)
- ultra-cc-architect: new at 0.1.0 (pre-release; catalog is thin, Fase 2/3
of skill-factory unbuilt, decision-layer empty, fallback list still
needed)
What stays in ultraplan-local: brief/research/plan/execute commands, all
19 planning agents, security hooks, plan auto-discovery of
`architecture/overview.md` (filesystem-level contract, not code-level).
What moved (28 files via git mv, R100 — full history preserved):
- 2 commands, 8 agents, 1 skill catalog (13 files), 2 scripts, 8 fixtures
Documentation updates: plugin CLAUDE.md and README.md for both plugins,
root README.md (added ultra-cc-architect section, updated ultraplan-local
section), root CLAUDE.md (added ultra-cc-architect to repo-struktur),
marketplace.json (registered ultra-cc-architect), ultraplan-local
CHANGELOG.md (v3.0.0 entry with migration guidance).
Test verification: ngram-overlap.test.mjs passes 23/23 from new location.
Memory updated: feedback_no_architect_until_v3.md now points at the new
plugin and reframes the threshold around catalog maturity rather than an
ultraplan-local milestone.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds the profile recommendation step to /ultrabrief-local Phase 4. The
brief stays universal (same questions, same template); the new step is
purely a processing-decision layer that records which profile downstream
commands should apply.
What lands:
- agents/profile-recommender.md — new sonnet agent that scores available
profiles against the finalized brief (keyword + NFR-signal matching,
axis bumps, hallucination gate that forbids inventing profile names).
Emits a fenced JSON block with ranked entries.
- templates/ultrabrief-template.md — frontmatter gains
recommended_profile, profile_match, profile_rationale (default values
applied when only `default` is available — true at M1).
- commands/ultrabrief-local.md — Phase 4 gains Step 4h with explicit
branches: short-circuit when only `default` exists; AskUserQuestion
confirmation when top score ≥ 0.7; explicit fallback message when below
threshold; manual selection sub-question on user override. Persists the
three frontmatter fields to brief.md after user confirmation. JSON
parser failure falls back to `default` with `profile_match: fallback`
rather than blocking — silent fallback is the worst outcome, but a
*visible* fallback is acceptable.
- scripts/profile-loader.mjs — adds selectRecommendation(ranked, opts) +
RECOMMENDATION_THRESHOLD=0.7 export. Single source of truth for the
threshold logic so the command spec and the helper agree.
- scripts/profile-loader.test.mjs — 10 new tests for selectRecommendation
(default-only, empty/malformed input, above/below threshold, custom
threshold, max-by-score, missing fields). Total now 36/36.
- README.md / CLAUDE.md / marketplace landing — docs reflect M0 + M1
shipped, M2 + M3 still pending.
In practice nothing changes for users at M1 because only `default` is
available — Step 4h takes the short-circuit path and writes
`profile_match: default-only`. M2 ships the additional profiles that
make the recommender meaningful.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Introduces a profile-loader infrastructure for runtime-instantiable
ultraplan variants (depth × domain × goal axes). M0 ships only the
`default` profile, which mirrors the current hardcoded Phase 5/9 agent
set — so existing flows are unaffected.
What lands:
- profiles/default.yaml — schema v1, lists current 8 exploration agents
+ 2 review agents, captures today's adversarial regime
- scripts/profile-loader.mjs — null-deps Node loader with limited-subset
YAML parser, listProfiles(), loadProfile(), validateProfile() that
cross-checks every referenced agent exists in agents/
- scripts/profile-loader.test.mjs — 26 node:test cases (parser, validation,
loader, integration with built-in default.yaml)
- commands/ultraplan-local.md — Phase 1 gains a "Resolve the profile"
step (--profile flag → brief.recommended_profile → default fallback)
and prints profile + source in the mode report. Phase 5/9 unchanged.
- README.md, CLAUDE.md, marketplace README — documentation of the M0
foundation, the universal-brief design principle, and the M1/M2/M3
milestones to come.
M1 (next) wires profile recommendation into ultrabrief Phase 4. M2
ships the additional built-in profiles (quick, bugfix, feature, refactor,
security-deep, research-heavy) and replaces the hardcoded Phase 5 agent
table with profile-driven selection. M3 adds user-extensible profiles.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Remove background-transition phases from /ultraresearch-local,
/ultraplan-local, /ultra-cc-architect-local, /ultrabrief-local.
All four commands now run their full pipelines inline in main
context; --fg is retained as a no-op alias for backwards
compatibility.
The Claude Code harness does not expose the Agent tool to
sub-agents, so orchestrators launched with run_in_background:true
cannot spawn their documented swarms (docs-researcher,
community-researcher, architecture-mapper, plan-critic, etc.) and
silently degrade to single-context reasoning. Foreground execution
keeps the swarms intact.
Source: github.com/anthropics/claude-code/issues/19077
Confirmed empirically 2026-04-19.
BREAKING CHANGE: Default execution is foreground — the session
blocks until the brief/plan is ready. Use `claude -p` in a
separate terminal for long-running headless work.
Replace hardcoded Q1-Q8 in /ultrabrief-local with a section-driven
completeness loop (Phase 3) and a draft/review/revise loop with
brief-reviewer as stop-gate (Phase 4). Quality drives the interview,
not a question counter.
brief-reviewer now emits a machine-readable JSON block with per-dimension
scores (1-5) and detail arrays alongside the existing prose report;
planning-orchestrator continues to consume the prose verdict unchanged.
Phase 4 gate: all dimensions >= 4 AND research_plan = 5. On fail, a
targeted follow-up is generated from the weakest dimension's detail
field and the draft is re-reviewed. Max 3 review iterations bound cost;
exhaustion writes brief.md with brief_quality: partial and an explicit
Brief Quality section. Force-stop surfaces per-dimension findings before
the user chooses continue or partial.
Not breaking. /ultrabrief-local [--quick] <task> interface unchanged.
--quick now means compact start with escalation, not a max-N cap.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Opus 4.7 reads agent instructions more literally than 4.6. The v1.7
planning-orchestrator described the Step+Manifest schema via prose +
procedural rules, which 4.6 inferred correctly but 4.7 sometimes
rendered as narrative "Fase N" prose — producing plans ultraexecute
Phase 2 rejected. First observed 2026-04-17 during llm-security v6.2.0
planning.
v1.8.0 closes the gap:
- planning-orchestrator Phase 5 embeds a literal copyable Step+Manifest
example (JWT middleware) replacing "read the template" prose
- Explicit forbidden-format clause: ## Fase N, ### Phase N, ### Stage N,
and any non-"### Step N:" heading are denied
- Phase 5.5 schema self-check: grep-verify canonical Step count matches
Manifest count and narrative heading count is zero, before handing to
plan-critic
- ultraexecute-local --validate mode: schema-only check that parses
steps + manifests, reports READY/FAIL with actionable error hints,
no security scan, no execution. Fast sanity check between
/ultraplan-local and full execution.
Static verification: 17/17 PASS.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Wave 1 of a 6-session parallel build revealed three failure modes:
(1) hallucinated completion (status=completed after 2/5 steps, last
tool call was an arbitrary file review), (2) fail-late bash (3/6
sessions had push blocked inside sub-agent sandbox after all work
was done), (3) no objective verification (plans were prose).
v1.7 closes all three by making the plan an executable contract.
Per-step YAML manifest (expected_paths, commit_message_pattern,
bash_syntax_check, forbidden_paths, must_contain) is the objective
completion predicate. Plan-critic dimension 10 (Manifest quality)
is a hard gate. Session decomposer propagates manifests verbatim
and emits an obligatory Step 0 pre-flight (git push --dry-run,
exit 77 sentinel) in every session spec.
ultraexecute-local gets Phase 7.5 (independent manifest audit from
git log + filesystem, ignoring agent bookkeeping) and Phase 7.6
(bounded recovery dispatch, recovery_depth ≤ 2). Hard Rule 17
forbids marking a step passed without manifest verification. Hard
Rule 18 forbids ending on an arbitrary tool call before reporting.
Division of labor is made explicit:
- /ultraresearch-local gathers context (no build decisions)
- /ultraplan-local produces an executable contract (manifests,
plan-critic gate)
- /ultraexecute-local executes disciplined (does NOT compensate
for weak plans — escalates)
Code complete. Docs partial (Arbeidsdeling table + manifest section
added to plugin + marketplace READMEs). Verification tests
(10-sequence) pending — see REMEMBER.md.
Backward compat: v1.6 plans without plan_version marker get
legacy mode with synthesized manifests and legacy_plan: true in
progress file. Plan-critic emits advisory, not block.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Four-layer security model for ultraexecute-local and headless sessions:
Layer 1 — Plugin hooks: pre-bash-executor.mjs (13 BLOCK + 8 WARN rules
with bash evasion normalization) and pre-write-executor.mjs (8 path guard
rules blocking .git/hooks, .claude/settings, shell configs, .env, SSH/AWS).
Layer 2 — Prompt-level security rules: denylist in ultraexecute-local.md
Sub-step D and session-spec-template.md Security Constraints section.
These are the only rules that work in headless child sessions.
Layer 3 — Pre-execution plan validation: new Phase 2.4 scans all Verify
and Checkpoint commands against denylist before execution begins.
Layer 4 — Replace --dangerously-skip-permissions with scoped
--allowedTools "Read,Write,Edit,Bash,Glob,Grep" --permission-mode
bypassPermissions in ultraexecute-local.md, headless-launch-template.md,
and session-decomposer.md. Blocks Agent, MCP, WebSearch in child sessions.
Also adds Hard Rules 14-16: verify command security check, no writing
outside repository root, no writing to security-sensitive paths.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add /ultraresearch-local for structured research combining local codebase
analysis with external knowledge via parallel agent swarms. Produces research
briefs with triangulation, confidence ratings, and source quality assessment.
New command: /ultraresearch-local with modes --quick, --local, --external, --fg.
New agents: research-orchestrator (opus), docs-researcher, community-researcher,
security-researcher, contrarian-researcher, gemini-bridge (all sonnet).
New template: research-brief-template.md.
Integration: --research flag in /ultraplan-local accepts pre-built research
briefs (up to 3), enriches the interview and exploration phases. Planning
orchestrator cross-references brief findings during synthesis.
Design principle: Context Engineering — right information to right agent at
right time. Research briefs are structured artifacts in the pipeline:
ultraresearch → brief → ultraplan --research → plan → ultraexecute.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Phase 2.6 previously launched parallel claude -p sessions in the same
working directory, causing git race conditions and repository corruption.
Changes:
- Add Phase 2.55 (pre-flight safety checks): clean tree, plan file
tracking, scope fence overlap validation, stale worktree cleanup
- Rewrite Phase 2.6 with git worktree isolation: each parallel session
gets its own worktree and branch, merged back sequentially
- Add merge conflict detection and abort (no silent data loss)
- Add unconditional worktree cleanup (even on failure)
- Add hard rules 11-13 (worktree mandatory, cleanup, sequential merge)
- Session-scoped progress file naming for --session mode
- Update headless launch template with worktree support and cleanup trap
- Bump version to 1.5.0
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>