feat: initial open marketplace with llm-security, config-audit, ultraplan-local

This commit is contained in:
Kjell Tore Guttormsen 2026-04-06 18:47:49 +02:00
commit f93d6abdae
380 changed files with 65935 additions and 0 deletions

View file

@ -0,0 +1,647 @@
---
name: ultraexecute-local
description: Disciplined plan executor — single-session or multi-session with parallel orchestration, failure recovery, and headless support
argument-hint: "[--fg | --resume | --dry-run | --step N | --session N] <plan.md>"
model: opus
allowed-tools: Read, Write, Edit, Bash, Glob, Grep, AskUserQuestion
---
# Ultraexecute Local
Disciplined executor for ultraplan plans. Reads a plan file, detects if it has
an Execution Strategy (multi-session), and either executes directly or
orchestrates parallel headless sessions — all to realize one plan.
Designed to work identically in interactive and headless (`claude -p`) mode.
## Phase 1 — Parse mode and validate input
Parse `$ARGUMENTS` for mode flags:
1. If arguments contain `--fg`: extract the file path. Set **mode = foreground**.
2. If arguments contain `--resume`: extract the file path. Set **mode = resume**.
3. If arguments contain `--dry-run`: extract the file path. Set **mode = dry-run**.
4. If arguments contain `--step N` (N is a positive integer): extract N and the file path.
Set **mode = step**, **target-step = N**.
5. If arguments contain `--session N` (N is a positive integer): extract N and the file path.
Set **mode = session**, **target-session = N**.
6. Otherwise: the entire argument string is the file path. Set **mode = execute**.
If no path is provided, output usage and stop:
```
Usage: /ultraexecute-local <plan.md>
/ultraexecute-local --fg <plan.md>
/ultraexecute-local --resume <plan.md>
/ultraexecute-local --dry-run <plan.md>
/ultraexecute-local --step N <plan.md>
/ultraexecute-local --session N <plan.md>
Modes:
(default) Auto — multi-session if plan has Execution Strategy, else foreground
--fg Force foreground — all steps sequentially, ignore Execution Strategy
--resume Resume from last progress checkpoint
--dry-run Validate plan and show execution strategy without running
--step N Execute only step N (foreground)
--session N Execute only session N from the plan's Execution Strategy
Examples:
/ultraexecute-local .claude/plans/ultraplan-2026-04-06-auth-refactor.md
/ultraexecute-local --fg .claude/plans/ultraplan-2026-04-06-auth-refactor.md
/ultraexecute-local --session 2 .claude/plans/ultraplan-2026-04-06-auth-refactor.md
/ultraexecute-local --dry-run .claude/plans/ultraplan-2026-04-06-auth-refactor.md
```
If the file does not exist, report and stop:
```
Error: file not found: {path}
```
Report detected mode:
```
Mode: {execute | resume | dry-run | step N}
File: {path}
```
## Phase 2 — Detect file type and parse structure
Read the file. Determine whether it is an **ultraplan** or a **session spec**:
- **Session spec**: contains `## Dependencies` with `Entry condition:` AND `## Scope Fence`
AND `## Exit Condition` sections.
- **Ultraplan**: contains `## Implementation Plan` with numbered `### Step N:` headings
but no `## Scope Fence`.
If neither structure is detected, report and stop:
```
Error: unrecognized file format. Expected an ultraplan or session spec.
```
### Parse steps
Extract every `### Step N: {description}` heading (in order). For each step, extract:
- **Files** — file paths to create or modify
- **Changes** — what to modify
- **Reuses** — existing code to leverage (informational)
- **Test first** — test to run before implementation (optional)
- **Verify** — command to run after implementation
- **On failure** — recovery action (revert/retry/skip/escalate)
- **Checkpoint** — git commit command after success
If a step is missing `On failure`, default to `escalate` and record a parse warning.
If a step is missing `Verify`, record a parse warning.
### Parse session spec fields (if applicable)
- **Entry condition** from `## Dependencies`
- **Touch list** and **Never-touch list** from `## Scope Fence`
- **Exit condition** checklist from `## Exit Condition`
### Parse Execution Strategy (if present)
If the plan contains an `## Execution Strategy` section, extract:
- Each `### Session N: {title}` with its Steps, Wave, Depends on, and Scope fence
- The `### Execution Order` with wave definitions
Set **has_execution_strategy = true**.
Report:
```
Type: {plan | session-spec}
Steps: {N}
{if has_execution_strategy}: Execution Strategy: {S} sessions across {W} waves
{if session spec}: Entry condition: {text}
{if session spec}: Scope fence: {N} touch, {N} never-touch
{if warnings}: Warnings: {list}
```
## Phase 2.5 — Execution strategy decision
Determine how to execute this plan:
**Run as single session (foreground)** when ANY of these are true:
- `--fg` flag is set
- `--step N` mode
- `--resume` mode
- `--session N` mode (runs only that session's steps, foreground)
- Plan has no `## Execution Strategy` section
- Plan has Execution Strategy with only 1 session
**Run as multi-session (parallel orchestration)** when ALL of these are true:
- mode = `execute` (default, no --fg)
- Plan has `## Execution Strategy` with 2+ sessions
- At least one wave has 2+ sessions (parallelism possible)
**Run as multi-session (sequential orchestration)** when:
- mode = `execute` (default, no --fg)
- Plan has `## Execution Strategy` with 2+ sessions
- All sessions are in different waves (no parallelism, but still separate sessions)
For single-session: continue to Phase 3.
For multi-session: jump to Phase 2.6.
Report:
```
Strategy: {single session | N sessions (M parallel, K sequential)}
```
## Phase 2.6 — Multi-session orchestration
**Only runs for multi-session execution.** This phase launches headless child
sessions and collects results. After this phase, jump directly to Phase 8
(final report).
### Step 0 — Billing safety check (MANDATORY)
Before launching ANY `claude -p` process, check the environment:
```bash
echo "${ANTHROPIC_API_KEY:+SET}"
```
If the result is `SET`, **STOP** and warn the user. `claude -p` sessions with
`ANTHROPIC_API_KEY` in the environment bill the **API account** (pay-per-token),
not the user's Claude subscription (Max/Pro). Parallel Opus sessions can cost
$50100+ per run.
Use AskUserQuestion with these options:
**Question:** "ANTHROPIC_API_KEY is set in your environment. Parallel `claude -p`
sessions will bill your API account, not your Claude subscription. How do you
want to proceed?"
| Option | Description |
|--------|-------------|
| **Use --fg instead (Recommended)** | Run all steps sequentially in this session using your subscription. No extra cost. |
| **Continue with API billing** | Launch parallel sessions. Each session bills your API account at token rates. |
| **Stop** | Cancel execution. Unset ANTHROPIC_API_KEY first, then re-run. |
If the user chooses `--fg`: restart execution with mode = foreground (jump back
to Phase 3, single-session).
If the user chooses `Continue`: proceed with Phase 2.6 Step 1.
If the user chooses `Stop`: report "Execution cancelled — billing safety check"
and stop.
If `ANTHROPIC_API_KEY` is NOT set: proceed silently to Step 1.
### Step 1 — Create session log directory
```bash
mkdir -p .claude/ultraplan-sessions/{slug}/logs
```
### Step 2 — Execute waves
For each wave (in order):
**Launch sessions in this wave:**
For each session in the wave, launch a headless `claude -p` process:
```bash
claude -p "/ultraexecute-local --session {N} {plan-path}" \
> .claude/ultraplan-sessions/{slug}/logs/session-{N}.log 2>&1 &
```
If the wave has only 1 session, run it without `&` (no background needed).
Track PIDs for parallel sessions.
**Wait for wave completion:**
```bash
wait {PID1} {PID2} ...
```
**Check results after each wave:**
For each session in the wave, read its log file and grep for
`"ultraexecute_summary"`. Parse the JSON to determine:
- Did the session complete? (`result: "completed"`)
- Did it fail? (`result: "failed"` or `"stopped"`)
If ANY session in the wave failed:
```
Wave {W} FAILED: Session {N} failed at step {S}.
Stopping — later waves depend on this wave.
See log: .claude/ultraplan-sessions/{slug}/logs/session-{N}.log
```
Do NOT start later waves. Jump to Phase 8 with partial results.
If all sessions in the wave passed: continue to the next wave.
### Step 3 — Run master verification
After all waves complete successfully, run the plan's `## Verification` section
commands to verify the integrated result.
### Step 4 — Aggregate results
Collect all session summaries into an aggregated report. Jump to Phase 8.
### --session N mode
When mode = `session N`:
1. Find session N in the Execution Strategy
2. Extract its step numbers (e.g., Steps: 4, 5, 6)
3. Extract its scope fence (Touch / Never touch lists)
4. Execute ONLY those steps, in order, using the single-session protocol (Phase 3→7)
5. Enforce the session's scope fence as if it were a session spec's scope fence
6. Report results for those steps only
This mode is used internally by Phase 2.6 when launching child sessions.
It can also be used manually to re-run a specific session.
## Phase 3 — Progress file setup
The progress file lives at `{plan-dir}/.ultraexecute-progress-{slug}.json` where
`{slug}` is the plan filename without extension.
### Progress file schema
```json
{
"schema_version": "1",
"plan": "{path}",
"plan_type": "{plan | session-spec}",
"started_at": "{ISO-8601}",
"updated_at": "{ISO-8601}",
"mode": "{execute | resume | step}",
"total_steps": 0,
"current_step": 0,
"status": "{in-progress | completed | failed | stopped}",
"steps": {
"1": { "status": "pending", "attempts": 0, "error": null, "completed_at": null, "commit": null }
},
"entry_condition_checked": false,
"exit_condition_checked": false,
"summary": null
}
```
### Mode-specific behavior
**mode = execute (fresh):**
- If a progress file exists with status `in-progress` or `failed`: warn that
`--resume` is available, then wait 3 seconds (`sleep 3`) and start fresh.
This allows headless runs to proceed without blocking.
- Otherwise: create the progress file with all steps in `pending` status.
**mode = resume:**
- If no progress file exists: start from step 1 (same as fresh execute).
- If progress file exists: find the first step with status != `passed`.
```
Resuming from step {N}. {M}/{total} steps already completed.
```
**mode = dry-run:**
- Do NOT create or modify the progress file.
**mode = step N:**
- Create the progress file if it does not exist.
- Only step N will be executed.
## Phase 4 — Entry condition check (session specs only)
**Skip for ultraplans.** Skip in dry-run mode (report what would be checked instead).
Read the entry condition. Evaluate it:
- `"none"` or similar → pass immediately
- References git state (e.g., "git status clean") → run `git status --porcelain`
- References passing tests → run the specified command
- References a previous session → check `git log --oneline` for commit pattern
If the entry condition **fails**:
```
Entry condition FAILED: {condition text}
Reason: {what was checked, what was found}
Complete the prerequisite first, then re-run.
```
Update progress file with `status: "stopped"`. Stop execution.
If the entry condition **passes**:
```
Entry condition: PASS
```
Update `entry_condition_checked: true` in the progress file.
## Phase 5 — Dry-run report (dry-run mode only)
**Only runs when mode = dry-run.** Produces a validation report, then stops.
```
## Dry Run Report: {filename}
**Type:** {plan | session-spec}
**Steps:** {N}
### Step Validation
| Step | Description | Verify | On failure | Checkpoint | Issues |
|------|-------------|--------|------------|------------|--------|
| 1 | {desc} | {cmd} | {action} | {msg} | {none / missing X} |
### File References
{For each file in Files: fields, check existence with Glob}
- {path}: EXISTS | NOT FOUND {(marked as new file) | (unexpected — may be missing)}
### Entry / Exit Conditions (session specs)
{What would be checked}
### Execution Preview (only when plan has Execution Strategy)
If `has_execution_strategy = true`, show a preview of multi-session orchestration:
```
**Sessions:** {S} across {W} waves
| Wave | Session | Steps | Depends on | Command |
|------|---------|-------|------------|---------|
| 1 | Session 1: {title} | {nums} | none | `claude -p "/ultraexecute-local --session 1 {path}"` |
| 1 | Session 2: {title} | {nums} | none | `claude -p "/ultraexecute-local --session 2 {path}"` |
| 2 | Session 3: {title} | {nums} | S1, S2 | `claude -p "/ultraexecute-local --session 3 {path}"` |
```
Check billing status via `echo "${ANTHROPIC_API_KEY:+SET}"` and report:
```
Billing: ANTHROPIC_API_KEY is {SET — parallel sessions will bill API account | NOT SET — sessions will use subscription}
```
### Verdict
{READY | NEEDS ATTENTION — N issues found}
```
Stop after the dry-run report. Do not execute anything.
## Phase 6 — Step execution loop
The core execution phase. Runs for modes: `execute`, `resume`, `step`.
### Determine starting step
- **execute**: step 1
- **resume**: first step where status != `passed`
- **step N**: step N only
### For each step
Update progress: `steps.{N}.status = "running"`, `current_step = N`, `updated_at = now`.
```
--- Step {N}/{total}: {description} ---
```
#### Sub-step A — Scope fence check (session specs only)
Before touching any file, verify that every file in the step's `Files:` field is
in the session spec's Touch list (or is a new file to create). If ANY file is in
the Never-touch list:
```
SCOPE VIOLATION: Step {N} requires {file} which is in the never-touch list.
Escalating — this step cannot be executed within this session's scope.
```
Treat this as an automatic `escalate`. Jump to the stop-and-report logic.
#### Sub-step B — Test first (if present)
If the step has a `Test first:` field:
1. If test file is marked `(new)`: note it will be created during implementation.
2. If test file exists: run it. Expect failure (RED state).
3. If test unexpectedly passes: warn but continue — step may already be done.
Do not block on test-first failures — they are expected.
#### Sub-step C — Implement changes
Read the step's `Files:` and `Changes:` fields. Implement exactly as described.
**Rules:**
- Follow `Changes:` exactly — do not improvise, add scope, or optimize
- Use Edit for modifications, Write for new files
- If `Reuses:` references existing code, read that code first for context
- Only touch files listed in `Files:` — nothing else
#### Sub-step D — Verification
Run the `Verify:` command exactly as written, via Bash.
**Rules:**
- Always a fresh run — never trust prior results
- Exit code is the authoritative truth:
- Exit 0 + expected output (if specified) = **PASS**
- Exit non-zero = **FAIL** regardless of output text
- Exit 0 but wrong output = **FAIL**
```
Verify: {command}
Result: {PASS | FAIL} (exit code {N})
{if FAIL}: Output (first 10 lines): {output}
```
If **PASS**: proceed to Sub-step F (checkpoint).
#### Sub-step E — On failure handling
If **FAIL**, read the `On failure:` clause. Apply the retry cap: **maximum 2 retries**
(3 total attempts). Track attempts in `steps.{N}.attempts`.
**`On failure: revert`**
- If attempts < 3: analyze the failure, re-implement with adjustments, re-verify.
```
Attempt {A}/3 failed. Retrying...
```
- If attempts == 3: revert this step's changes:
```bash
git checkout -- {files from Files: field}
```
Record failure. **Do NOT proceed to next step.** Jump to Phase 7.
**`On failure: retry`**
- If attempts < 3: use the alternative approach described in the On failure clause.
- If attempts == 3: revert and stop. Jump to Phase 7.
**`On failure: skip`**
- Mark step as skipped regardless of attempt count. Continue to next step.
```
Step {N}: SKIPPED (non-critical per plan)
```
Update `steps.{N}.status = "skipped"`.
**`On failure: escalate`**
- Stop immediately regardless of attempt count.
```
Step {N}: ESCALATED — requires human judgment
```
Commit all completed work before stopping. Stage ONLY files from steps with
`status: "passed"` in the progress file — collect their `Files:` fields. Never
use `git add -A` (risks staging secrets, binaries, or unrelated work).
```bash
git add {files from passed steps' Files: fields} && git commit -m "wip: ultraexecute-local stopped at step {N} — escalation needed"
```
Jump to Phase 7.
#### Sub-step F — Checkpoint
Run the `Checkpoint:` git commit command exactly as written in the plan.
If the commit fails (nothing to commit, etc.): warn but do NOT fail the step.
The step's verification already passed — the commit is bookkeeping.
```
Step {N}: PASS (committed: {hash})
```
Update progress: `steps.{N}.status = "passed"`, `steps.{N}.commit = {hash}`,
`steps.{N}.completed_at = now`.
### Step mode exit
If mode = `step N`: after completing step N (pass or fail), skip remaining steps
and jump to Phase 8 (final report).
## Phase 7 — Exit condition check (session specs only)
**Skip for ultraplans.** Run only when all steps passed (not on early stop).
Run each exit condition command from the `## Exit Condition` checklist:
```
Exit condition check:
- [ ] {command} → {PASS | FAIL}
- [ ] {command} → {PASS | FAIL}
```
If all pass: `exit_condition_checked: true` in progress file.
If any fail: record which failed. Include in final report.
## Phase 8 — Final report
Always produce a final report.
Update progress file: `status` to `completed`/`failed`/`stopped`, `updated_at`, `summary`.
```
## Ultraexecute Local Complete
**Plan:** {path}
**Type:** {plan | session-spec}
**Mode:** {execute | resume | step N}
**Result:** {COMPLETED | FAILED at step N | STOPPED (escalation) | PARTIAL (N/total passed)}
### Step Results
| Step | Description | Result | Attempts | Commit |
|------|-------------|--------|----------|--------|
| 1 | {desc} | PASS | 1 | abc1234 |
| 2 | {desc} | FAIL | 3 | — |
| 3 | {desc} | — | 0 | — |
### Summary
- Passed: {N}/{total}
- Skipped: {N}
- Failed: {N}
- Not reached: {N}
{if all passed + exit condition passed}:
All steps completed. Exit condition: PASS.
{if failed/stopped}:
### Failure Details
Step {N}: {description}
On failure: {action}
Error: {error output, first 20 lines}
Attempts: {N}
### What Remains
{Numbered list of unexecuted steps}
To resume: /ultraexecute-local --resume {path}
```
**JSON summary block** (always at the end, machine-parseable):
```json
{
"ultraexecute_summary": {
"plan": "{path}",
"plan_type": "{plan | session-spec}",
"result": "{completed | failed | stopped | partial}",
"steps_total": 0,
"steps_passed": 0,
"steps_failed": 0,
"steps_skipped": 0,
"steps_not_reached": 0,
"failed_at_step": null,
"exit_condition": "{pass | fail | skipped | n/a}",
"progress_file": "{path}"
}
}
```
The `ultraexecute_summary` key makes it grep-able in log files from headless runs.
## Phase 9 — Stats tracking
Append one record to `${CLAUDE_PLUGIN_DATA}/ultraexecute-stats.jsonl`:
```json
{
"ts": "{ISO-8601}",
"plan": "{filename only}",
"plan_type": "{plan | session-spec}",
"mode": "{execute | resume | dry-run | step}",
"result": "{completed | failed | stopped | partial}",
"steps_total": 0,
"steps_passed": 0,
"steps_failed": 0,
"steps_skipped": 0,
"failed_at_step": null
}
```
If `${CLAUDE_PLUGIN_DATA}` is not set or not writable, skip silently.
Never let stats failures block the workflow.
## Hard rules
1. **No AskUserQuestion for execution decisions.** All execution decisions come
from the plan's On failure clauses. If the plan says escalate, stop and
report — never ask. **Exception:** the billing safety check in Phase 2.6
Step 0 MUST ask before spending money on the user's API account.
2. **No scope creep.** Only touch files listed in the step's `Files:` field.
If a file outside the list seems to need changing, record it as a finding
in the final report — do not touch it.
3. **Exit code is truth.** The Verify command's exit code is authoritative.
Non-zero = FAIL regardless of output. Zero with wrong output = FAIL.
4. **Fresh verification.** Re-run the Verify command from scratch every time.
Never trust cached or prior results.
5. **Retry cap = 3 attempts.** Initial + 2 retries, then stop. Never loop forever.
6. **Never corrupt completed work.** Only revert files from the failing step.
Never touch files from earlier passed steps.
7. **Checkpoint discipline.** Run the Checkpoint commit exactly as written.
Do not combine, reorder, or skip checkpoints on passed steps.
8. **Scope fence enforcement.** For session specs: never modify files in the
Never-touch list, regardless of what the Changes field says.
9. **Progress file is ground truth.** Resume uses the progress file, not git log.
10. **No sub-agents.** The executor reads and implements directly.
No Agent tool, no TeamCreate, no delegation.

View file

@ -0,0 +1,685 @@
---
name: ultraplan-local
description: Deep implementation planning with interview, parallel specialized agents, external research, and optional background execution
argument-hint: "[--spec spec.md | --fg] <task description>"
model: opus
allowed-tools: Agent, Read, Glob, Grep, Write, Edit, Bash, AskUserQuestion, TaskCreate, TaskUpdate, TeamCreate, TeamDelete
---
# Ultraplan Local v1.0
Deep, multi-phase implementation planning. Uses an interview to gather requirements,
adaptive specialized agent swarms for exploration, external research for unfamiliar
technologies, and adversarial review to stress-test the plan.
## Phase 1 — Parse mode and validate input
Parse `$ARGUMENTS` for mode flags:
1. If arguments start with `--spec `: extract the file path after `--spec`.
Set **mode = spec-driven**. Read the spec file. If it does not exist, report
the error and stop.
2. If arguments start with `--fg `: extract the task description after `--fg`.
Set **mode = foreground**.
3. If arguments start with `--quick `: extract the task description after `--quick`.
Set **mode = quick**.
4. If arguments start with `--export `: extract the remainder as `{format} {plan-path}`.
Split on the first space: format is the first token, plan path is the rest.
Valid formats: `pr`, `issue`, `markdown`, `headless`.
Set **mode = export**.
If the format is not one of pr/issue/markdown/headless, report and stop:
```
Error: unknown export format '{format}'. Valid: pr, issue, markdown, headless
```
If the plan file does not exist, report and stop:
```
Error: plan file not found: {path}
```
5. If arguments start with `--decompose `: extract the plan file path after `--decompose`.
Set **mode = decompose**.
If the plan file does not exist, report and stop:
```
Error: plan file not found: {path}
```
6. Otherwise: the entire argument string is the task description.
Set **mode = default**.
If no task description and no spec file, output usage and stop:
```
Usage: /ultraplan-local <task description>
/ultraplan-local --spec <path-to-spec.md>
/ultraplan-local --fg <task description>
/ultraplan-local --quick <task description>
/ultraplan-local --export <pr|issue|markdown|headless> <plan-path>
/ultraplan-local --decompose <plan-path>
Modes:
default Interview (interactive) → background planning → notify when done
--spec Skip interview, use provided spec → background planning
--fg All phases in foreground (blocks session)
--quick Interview → plan directly (no agent swarm) → adversarial review
--export Generate shareable output from an existing plan (no new planning)
--decompose Split an existing plan into self-contained headless sessions
Examples:
/ultraplan-local Add user authentication with JWT tokens
/ultraplan-local --spec .claude/ultraplan-spec-2026-04-05-jwt-auth.md
/ultraplan-local --fg Refactor the database layer to use connection pooling
/ultraplan-local --quick Add rate limiting to the API
/ultraplan-local --export pr .claude/plans/ultraplan-2026-04-06-rate-limiting.md
/ultraplan-local --export headless .claude/plans/ultraplan-2026-04-06-rate-limiting.md
/ultraplan-local --decompose .claude/plans/ultraplan-2026-04-06-rate-limiting.md
```
Do not continue past this step if no task was provided.
Report the detected mode to the user:
```
Mode: {default | spec-driven | foreground}
Task: {task description or "from spec: {path}"}
```
## Phase 1.5 — Export (runs only when mode = export)
**Skip this phase entirely unless mode = export.**
Read the plan file. Extract these sections from the plan content:
- Task description (from Context section)
- Implementation steps (from Implementation Plan section)
- Risks (from Risks and Mitigations section)
- Test strategy (from Test Strategy section, if present)
- Scope estimate (from Estimated Scope section)
### Format: `pr`
Output a markdown block formatted as a PR description:
```
## Summary
{23 sentence summary of what this change does and why}
## Changes
{Bulleted list of implementation steps, one line each}
## Test plan
{Bulleted checklist from test strategy, formatted as - [ ] items}
## Risks
{Risks from plan, abbreviated to 1 line each}
---
*Generated by ultraplan-local from {plan filename}*
```
### Format: `issue`
Output a markdown block formatted as an issue comment:
```
## Implementation plan summary
**Task:** {task description}
**Plan file:** {plan path}
**Scope:** {N files, complexity}
### Proposed approach
{35 bullet points from key implementation steps}
### Open questions / risks
{Top 23 risks from plan}
---
*Generated by ultraplan-local*
```
### Format: `markdown`
Output the plan content with internal metadata stripped:
- Remove the "Revisions" section
- Remove plan-critic and scope-guardian scores/verdicts
- Remove `[ASSUMPTION]` markers (but keep the surrounding sentence)
- Keep everything else verbatim
### Format: `headless`
This is a shortcut for `--decompose`. It runs the full session decomposition
pipeline and is equivalent to `--decompose {plan-path}`. Proceed to
Phase 1.6 (Decompose) below.
---
After outputting the formatted block (for pr/issue/markdown), say:
```
Export complete ({format}). Copy the block above.
```
Then **stop**. Do not continue to Phase 2 or any subsequent phase.
## Phase 1.6 — Decompose (runs only when mode = decompose or export headless)
**Skip this phase entirely unless mode = decompose or export format = headless.**
Read the plan file. Verify it contains an Implementation Plan section with
numbered steps. If no steps are found, report and stop:
```
Error: plan has no implementation steps. Run /ultraplan-local first to generate a plan.
```
Determine the output directory from the plan slug:
- Extract the slug from the plan filename (e.g., `ultraplan-2026-04-06-auth-refactor``auth-refactor`)
- Output directory: `.claude/ultraplan-sessions/{slug}/`
Launch the **session-decomposer** agent:
```
Plan file: {plan path}
Plugin root: ${CLAUDE_PLUGIN_ROOT}
Output directory: .claude/ultraplan-sessions/{slug}/
```
The session-decomposer will:
1. Parse the plan's steps and their file dependencies
2. Build a dependency graph between steps
3. Group steps into sessions of 35 steps each
4. Identify which sessions can run in parallel (waves)
5. Generate one session spec file per session
6. Generate a dependency diagram (mermaid)
7. Generate a launch script (`launch.sh`)
When the session-decomposer completes, present the summary to the user:
```
## Decomposition Complete
**Master plan:** {plan path}
**Sessions:** {N} across {W} waves
**Output:** .claude/ultraplan-sessions/{slug}/
### Sessions
| # | Title | Steps | Wave | Parallel |
|---|-------|-------|------|----------|
{session table from decomposer}
### Files generated
- Session specs: .claude/ultraplan-sessions/{slug}/session-*.md
- Dependency graph: .claude/ultraplan-sessions/{slug}/dependency-graph.md
- Launch script: .claude/ultraplan-sessions/{slug}/launch.sh
You can:
- Review individual session specs before running
- Run all sessions: `bash .claude/ultraplan-sessions/{slug}/launch.sh`
- Run a single session: `claude -p "$(cat .claude/ultraplan-sessions/{slug}/session-1-*.md)"`
- Say **"launch"** to start headless execution from here
```
If the user says **"launch"**: run the launch script via Bash.
Then **stop**. Do not continue to Phase 2 or any subsequent phase.
## Phase 2 — Requirements gathering (interview)
**Skip this phase entirely if mode = spec-driven.** Proceed to Phase 3.
Use `AskUserQuestion` to interview the user about the task. Ask **one question at
a time** — never dump all questions at once. Follow up based on answers.
### Interview flow
**Start with the most important question:**
> What is the goal of this task? What does success look like?
**Then ask follow-ups based on the answer. Choose from these topics:**
- What is explicitly NOT in scope? (non-goals)
- Are there technical constraints? (specific versions, compatibility, no new dependencies)
- Do you have preferences? (library X over Y, specific patterns, architectural style)
- Are there non-functional requirements? (performance targets, security needs, accessibility)
- Has anything been tried before? What worked or failed?
**Rules:**
- Ask 35 questions for typical tasks. Maximum 8 for complex tasks.
- If the user says "skip", "proceed", "just plan it", or similar — stop interviewing
immediately. Write a minimal spec from the task description alone.
- Adapt your questions to what the user tells you. If they give a detailed task
description, skip obvious questions.
- Never ask about things you can discover from the codebase.
### Adaptive depth
After each answer, assess the response length and vocabulary:
- **Detailed answer** (2+ sentences, technical terminology, specific examples):
- Treat the user as senior — they know the codebase
- Skip obvious follow-ups they already answered
- Ask more targeted questions: constraints, edge cases, specific technical choices
- Reduce question count: aim for 34 total instead of 5
- **Short or uncertain answer** (1 sentence or less, "I don't know", "not sure", vague):
- Treat the user as unfamiliar with the problem space
- Simplify follow-up questions — avoid open-ended technical questions
- Offer alternatives instead of asking open questions:
> "Should this be synchronous or asynchronous? (synchronous is simpler; async handles more concurrent users)"
- For bugs: focus on reproduction before requirements:
> "What do you see? What did you expect to see?"
- Allow "I don't know" as a valid answer — record it as an open assumption in the spec
Never change your question count based on impatience. Only change depth based
on answer quality.
### Write the spec file
After gathering requirements, read the spec template:
@${CLAUDE_PLUGIN_ROOT}/templates/spec-template.md
Generate a slug from the task (first 3-4 meaningful words, lowercase, hyphens).
Write the spec to: `.claude/ultraplan-spec-{YYYY-MM-DD}-{slug}.md`
Create the `.claude/` directory if it does not exist.
Fill in all sections based on interview answers. Mark unanswered sections with
"Not discussed — no constraints assumed."
Tell the user:
```
Spec saved: .claude/ultraplan-spec-{date}-{slug}.md
```
## Phase 3 — Background transition
**If mode = foreground or quick:** Skip this phase. Continue to Phase 4 inline.
**If mode = default or spec-driven:**
Launch the **planning-orchestrator** agent with this prompt:
```
Spec file: {spec path}
Task: {task description}
Mode: {default | spec | quick}
Plan destination: .claude/plans/ultraplan-{YYYY-MM-DD}-{slug}.md
Plugin root: ${CLAUDE_PLUGIN_ROOT}
Read the spec file and execute your full planning workflow.
Write the plan to the destination path.
```
Launch the planning-orchestrator via the Agent tool with `run_in_background: true`.
The agent runs autonomously while you continue working — you will be notified
when the plan is ready.
Then output to the user and **stop your response**:
```
Background planning started via planning-orchestrator.
Spec: .claude/ultraplan-spec-{date}-{slug}.md
Plan: .claude/plans/ultraplan-{date}-{slug}.md
You will be notified when the plan is ready.
You can continue working on other tasks in the meantime.
```
Do not wait for the orchestrator. Do not continue to Phase 4.
The planning-orchestrator handles Phases 4 through 10 autonomously.
---
**Everything below this line runs either in foreground mode or inside the
background agent. The instructions are identical regardless of context.**
---
## Phase 4 — Codebase sizing
Determine codebase scale to calibrate agent turns (not agent count).
Run via Bash:
```
find . -type f \( -name "*.ts" -o -name "*.tsx" -o -name "*.js" -o -name "*.jsx" -o -name "*.py" -o -name "*.go" -o -name "*.rs" -o -name "*.java" -o -name "*.rb" -o -name "*.c" -o -name "*.cpp" -o -name "*.h" -o -name "*.cs" -o -name "*.swift" -o -name "*.kt" -o -name "*.sh" -o -name "*.md" \) -not -path "*/node_modules/*" -not -path "*/.git/*" -not -path "*/vendor/*" -not -path "*/dist/*" -not -path "*/build/*" | wc -l
```
Classify:
- **Small** (< 50 files)
- **Medium** (50500 files)
- **Large** (> 500 files)
Report:
```
Codebase: {N} source files ({scale}). Deploying exploration agents.
```
## Phase 4b — Spec review
Launch the **spec-reviewer** agent:
Prompt: "Review this spec for quality: {spec path}. Check completeness, consistency,
testability, and scope clarity."
Handle the verdict:
- **PROCEED** — continue to Phase 5.
- **PROCEED_WITH_RISKS** — continue, carry flagged risks as `[ASSUMPTION]` in the plan.
- **REVISE** — in foreground mode, present findings and ask the user for clarification.
In background mode, carry all findings as `[ASSUMPTION]` entries.
## Phase 5 — Parallel exploration (specialized agents + research)
**If mode = quick:** Do NOT launch any exploration agents. Instead, run a
lightweight file check:
- `Glob` for files matching key terms from the task description (up to 3 patterns)
- `Grep` for function/type definitions matching key terms (up to 3 patterns)
Report findings as:
```
Quick scan: {N} potentially relevant files found via Glob/Grep.
No agent swarm — proceeding directly to planning.
```
Then skip Phase 6 (deep-dives) and proceed to Phase 7 (Synthesis) with only
the quick-scan results.
---
**All other modes:** Launch exploration agents **in parallel** (all in a single
message). Use the specialized agents from the `agents/` directory.
**All agents run for all codebase sizes.** Scale `maxTurns` by size (small: halved,
medium: default, large: default) instead of dropping agents.
| Agent | Small | Medium | Large | Purpose |
|-------|-------|--------|-------|---------|
| `architecture-mapper` | Yes | Yes | Yes | Codebase structure, patterns, anti-patterns |
| `dependency-tracer` | Yes | Yes | Yes | Module connections, data flow, side effects |
| `risk-assessor` | Yes | Yes | Yes | Risks, edge cases, failure modes |
| `task-finder` | Yes | Yes | Yes | Task-relevant files, functions, types, reuse candidates |
| `test-strategist` | Yes | Yes | Yes | Test patterns, coverage gaps, strategy |
| `git-historian` | Yes | Yes | Yes | Recent changes, ownership, hot files, active branches |
| `research-scout` | Conditional | Conditional | Conditional | External docs (only when unfamiliar tech detected) |
| `convention-scanner` | No | Yes | Yes | Coding conventions, naming, style, test patterns |
### Always launch (all codebase sizes):
**architecture-mapper** — full codebase structure, tech stack, patterns, anti-patterns.
Prompt: "Analyze the architecture of this codebase. The task being planned is: {task}"
**dependency-tracer** — module connections, data flow, side effects for task-relevant code.
Prompt: "Trace dependencies and data flow relevant to this task: {task}. Focus on modules
that will be affected by the implementation."
**risk-assessor** — risks, edge cases, failure modes, technical debt near task area.
Prompt: "Assess risks and failure modes for implementing this task: {task}. Check for
complexity hotspots, security boundaries, and technical debt in the relevant code."
**task-finder** — all files, functions, types, and interfaces directly related to the task.
Prompt: "Find all code relevant to this task: {task}. Include existing implementations
that solve similar problems, API boundaries, database models, configuration files.
Report file paths and line numbers for every finding."
**test-strategist** — existing test patterns, coverage gaps, test strategy.
Prompt: "Analyze the test infrastructure and design a test strategy for this task: {task}.
Discover existing patterns and identify coverage gaps."
**git-historian** — recent changes, code ownership, hot files, active branches.
Prompt: "Analyze git history relevant to this task: {task}. Report recent changes,
ownership, hot files, and active branches that may affect planning."
### Launch for medium+ codebases (50+ files):
**Convention Scanner** — use the `convention-scanner` plugin agent (model: "sonnet")
for medium+ codebases only.
Provide concrete examples from the codebase, not generic advice."
### Conditional: External research
After reading the task description and spec (if available), determine if the task
involves technologies, APIs, or libraries that are:
- Not clearly present in the codebase
- Being upgraded to a new major version
- Being used in an unfamiliar way
If yes: launch **research-scout** in parallel with the other agents.
Prompt: "Research the following technologies for this task: {task}.
Specific questions: {list specific questions about the technology}.
Technologies to research: {list}."
If no external technology is involved: skip research-scout and note:
"No external research needed — all technologies are well-represented in the codebase."
## Phase 6 — Targeted deep-dives
After all Phase 5 agents complete, review their results and identify **knowledge gaps**
— areas where exploration was too shallow to plan confidently.
Common reasons for deep-dives:
- A critical function was found but its implementation details are unclear
- A dependency chain needs tracing to understand side effects
- A test pattern was identified but the test infrastructure needs more detail
- A risk was flagged but the actual impact needs verification
For each significant gap, spawn a targeted deep-dive agent (model: "sonnet",
subagent_type: "Explore") with a narrow, specific brief.
Launch up to 3 deep-dive agents in parallel. If no gaps exist, skip this phase
and note: "Initial exploration was sufficient — no deep-dives needed."
## Phase 7 — Synthesis
After all agents complete (initial + deep-dives + research), synthesize:
1. Read all agent results carefully
2. Identify overlaps and contradictions between agents
3. Build a mental model of the codebase architecture
4. Catalog reusable code: existing functions, utilities, patterns
5. Integrate research findings with codebase analysis
6. Note remaining gaps — things you cannot determine from code or research
(these become assumptions in the plan, marked explicitly)
7. For each finding, track whether it came from **codebase analysis** or
**external research** — the plan must distinguish these sources
Do NOT write this synthesis to disk. It is internal working context only.
## Phase 8 — Deep planning
Read the spec file (from Phase 2 or provided via --spec).
Read the plan template: @${CLAUDE_PLUGIN_ROOT}/templates/plan-template.md
Write the plan following the template structure. The plan MUST include:
### Required sections
1. **Context** — Why this change is needed. Reference the spec's goal and constraints.
2. **Codebase Analysis** — Tech stack, patterns, relevant files, reusable code,
external tech researched. Every file path must be real (verified during exploration).
3. **Research Sources** — If research-scout was used: table of technologies, sources,
findings, and confidence levels. Omit if no research was conducted.
4. **Implementation Plan** — Ordered steps. Each step specifies:
- Exact files to modify or create (with paths)
- What changes to make and why
- Which existing code to reuse
- Dependencies on other steps
- Whether the step is based on codebase analysis or external research
- **On failure:** — recovery action (revert/retry/skip/escalate)
- **Checkpoint:** — git commit command after success
10. **Execution Strategy** — For plans with > 5 steps: group steps into sessions
(35 steps each), organize sessions into waves (parallel where independent),
specify scope fences per session. Omit for plans with ≤ 5 steps.
5. **Alternatives Considered** — At least one alternative approach with
pros/cons and reason for rejection.
6. **Risks and Mitigations** — From the risk-assessor findings. What could go
wrong and how to handle it.
7. **Test Strategy** — From the test-strategist findings (if available).
What tests to write and which patterns to follow.
8. **Verification** — Testable criteria. Not "check that it works" but
specific commands to run and expected outputs.
9. **Estimated Scope** — File counts and complexity rating.
### Quality standards
- Every file path in the plan must exist in the codebase (or be explicitly
marked as "new file to create")
- Every "reuses" reference must point to a real function/pattern found during
exploration
- Steps must be ordered by dependency (not by file path or importance)
- Verification criteria must be concrete and executable
- The plan must be implementable by someone who has not seen the exploration
results — it must stand on its own
- Research-based decisions must cite their source
### Write the plan
Generate the slug from the task description (or reuse the spec slug).
Write the plan to: `.claude/plans/ultraplan-{YYYY-MM-DD}-{slug}.md`
Create the `.claude/plans/` directory if it does not exist.
## Phase 9 — Adversarial review
Launch two review agents **in parallel**:
**plan-critic** — adversarial review of the plan.
Prompt: "Review this implementation plan for the task: {task}.
Plan file: {plan path}. Read it and find every problem — missing steps,
wrong ordering, fragile assumptions, missing error handling, scope creep,
underspecified steps. Rate each finding as blocker, major, or minor."
**scope-guardian** — scope alignment check.
Prompt: "Check this implementation plan against the requirements.
Task: {task}. Spec file: {spec path}. Plan file: {plan path}.
Find scope creep (plan does more than asked) and scope gaps (plan misses
requirements). Check that referenced files and functions exist."
After both complete:
- If **blockers** are found: revise the plan to address them. Add a "Revisions"
note at the bottom of the plan listing what changed and why.
- If only **major** issues: revise to address them. Add revisions note.
- If only **minor** issues or clean: proceed without changes. Note the
review result in the plan.
## Phase 10 — Present and refine
Present a summary to the user:
```
## Ultraplan Complete
**Task:** {task description}
**Mode:** {default | spec-driven | foreground}
**Spec:** {spec file path, or "none (foreground mode)"}
**Plan:** .claude/plans/ultraplan-{date}-{slug}.md
**Exploration:** {N} agents deployed ({N} specialized + {N} deep-dives + {research status})
**Scope:** {N} files to modify, {N} to create — {complexity}
### Key decisions
- {Decision 1 and rationale}
- {Decision 2 and rationale}
### Implementation steps ({N} total)
1. {Step 1 summary}
2. {Step 2 summary}
...
### Research findings
{Summary of external research, or "No external research conducted."}
### Adversarial review
**Plan critic:** {Summary — blockers/majors/minors found, how addressed}
**Scope guardian:** {Summary — creep/gaps found, how addressed}
You can:
- Ask questions or request changes to refine the plan
- Say **"execute"** to start implementing
- Say **"execute with team"** to implement with parallel Agent Team (if eligible)
- Say **"save"** to keep the plan for later
```
If the user asks questions or requests changes:
- Update the plan file in-place
- Show what changed
- Re-present the summary
## Phase 11 — Handoff
### "save" / "later" / "done"
Confirm the plan and spec file locations and exit.
### "execute" / "go" / "start"
Begin implementing the plan step by step in this session. Follow the plan exactly.
Mark each step complete as you go.
### "execute with team" / "team"
Before creating a team, verify eligibility:
1. Count implementation steps that are **independent** (no dependency on each other)
AND touch **different files/modules**
2. If fewer than 3 independent steps: inform the user and fall back to sequential
execution. "The plan has fewer than 3 independent steps — sequential execution
is more efficient."
If eligible:
1. Present the proposed team split: which steps go to which team member
2. Ask for confirmation: "Create Agent Team with {N} members? (yes/no)"
3. If confirmed: create the team with `TeamCreate`, assign step clusters to
each member. Use `isolation: "worktree"` on each team member agent so they
work in isolated git worktrees — this prevents file conflicts during parallel
implementation. Coordinate execution and clean up with `TeamDelete` when done.
4. If `TeamCreate` fails (tool not available): fall back to sequential execution
and notify the user
## Phase 12 — Session tracking
After the plan is presented (Phase 10) or after handoff (Phase 11), write a
session record to `${CLAUDE_PLUGIN_DATA}/ultraplan-stats.jsonl` (create the file
if it does not exist).
Record format (one JSON line):
```json
{
"ts": "{ISO-8601 timestamp}",
"task": "{task description (first 100 chars)}",
"mode": "{default|spec|fg}",
"slug": "{plan slug}",
"codebase_size": "{small|medium|large}",
"codebase_files": {N},
"agents_deployed": {N},
"deep_dives": {N},
"research": {true|false},
"critic_verdict": "{BLOCK|REVISE|PASS}",
"guardian_verdict": "{ALIGNED|CREEP|GAP|MIXED}",
"outcome": "{execute|execute_team|save|refine}"
}
```
If `${CLAUDE_PLUGIN_DATA}` is not set or not writable, skip tracking silently.
Never let tracking failures block the main workflow.
## Hard rules
- **Scope**: Only explore the current working directory and its subdirectories.
Never read files outside the repo (no ~/.env, no credentials, no other repos).
- **Cost**: Sonnet for all agents (exploration, deep-dives, research, critics).
Opus only runs in the main thread for synthesis and planning.
- **Privacy**: Never log, store, or repeat file contents that look like
secrets, tokens, or credentials. Never log prompt text.
- **No premature execution**: Do not modify any project files until the user
explicitly approves the plan.
- **Plan stands alone**: The plan file must be understandable without access
to the exploration results. Include all necessary context.
- **Honesty**: If exploration reveals the task is trivial (single file, obvious
change), say so. Do not inflate the plan to justify the process. Suggest
the user just implements it directly.
- **Adaptive**: Never spawn more agents than the codebase warrants. A 10-file
project does not need 7 exploration agents. Scale down.
- **Research transparency**: Always distinguish codebase-derived decisions from
research-derived decisions in the plan.