feat: initial open marketplace with llm-security, config-audit, ultraplan-local

2026-04-06 18:47:49 +02:00 · 2026-04-06 18:47:49 +02:00 · f93d6abdae
commit f93d6abdae
380 changed files with 65935 additions and 0 deletions
--- a/plugins/ultraplan-local/commands/ultraexecute-local.md
+++ b/plugins/ultraplan-local/commands/ultraexecute-local.md
@ -0,0 +1,647 @@
+---
+name: ultraexecute-local
+description: Disciplined plan executor — single-session or multi-session with parallel orchestration, failure recovery, and headless support
+argument-hint: "[--fg | --resume | --dry-run | --step N | --session N] <plan.md>"
+model: opus
+allowed-tools: Read, Write, Edit, Bash, Glob, Grep, AskUserQuestion
+---
+
+# Ultraexecute Local
+
+Disciplined executor for ultraplan plans. Reads a plan file, detects if it has
+an Execution Strategy (multi-session), and either executes directly or
+orchestrates parallel headless sessions — all to realize one plan.
+
+Designed to work identically in interactive and headless (`claude -p`) mode.
+
+## Phase 1 — Parse mode and validate input
+
+Parse `$ARGUMENTS` for mode flags:
+
+1. If arguments contain `--fg`: extract the file path. Set **mode = foreground**.
+2. If arguments contain `--resume`: extract the file path. Set **mode = resume**.
+3. If arguments contain `--dry-run`: extract the file path. Set **mode = dry-run**.
+4. If arguments contain `--step N` (N is a positive integer): extract N and the file path.
+   Set **mode = step**, **target-step = N**.
+5. If arguments contain `--session N` (N is a positive integer): extract N and the file path.
+   Set **mode = session**, **target-session = N**.
+6. Otherwise: the entire argument string is the file path. Set **mode = execute**.
+
+If no path is provided, output usage and stop:
+
+```
+Usage: /ultraexecute-local <plan.md>
+       /ultraexecute-local --fg <plan.md>
+       /ultraexecute-local --resume <plan.md>
+       /ultraexecute-local --dry-run <plan.md>
+       /ultraexecute-local --step N <plan.md>
+       /ultraexecute-local --session N <plan.md>
+
+Modes:
+  (default)      Auto — multi-session if plan has Execution Strategy, else foreground
+  --fg           Force foreground — all steps sequentially, ignore Execution Strategy
+  --resume       Resume from last progress checkpoint
+  --dry-run      Validate plan and show execution strategy without running
+  --step N       Execute only step N (foreground)
+  --session N    Execute only session N from the plan's Execution Strategy
+
+Examples:
+  /ultraexecute-local .claude/plans/ultraplan-2026-04-06-auth-refactor.md
+  /ultraexecute-local --fg .claude/plans/ultraplan-2026-04-06-auth-refactor.md
+  /ultraexecute-local --session 2 .claude/plans/ultraplan-2026-04-06-auth-refactor.md
+  /ultraexecute-local --dry-run .claude/plans/ultraplan-2026-04-06-auth-refactor.md
+```
+
+If the file does not exist, report and stop:
+```
+Error: file not found: {path}
+```
+
+Report detected mode:
+```
+Mode: {execute | resume | dry-run | step N}
+File: {path}
+```
+
+## Phase 2 — Detect file type and parse structure
+
+Read the file. Determine whether it is an **ultraplan** or a **session spec**:
+
+- **Session spec**: contains `## Dependencies` with `Entry condition:` AND `## Scope Fence`
+  AND `## Exit Condition` sections.
+- **Ultraplan**: contains `## Implementation Plan` with numbered `### Step N:` headings
+  but no `## Scope Fence`.
+
+If neither structure is detected, report and stop:
+```
+Error: unrecognized file format. Expected an ultraplan or session spec.
+```
+
+### Parse steps
+
+Extract every `### Step N: {description}` heading (in order). For each step, extract:
+- **Files** — file paths to create or modify
+- **Changes** — what to modify
+- **Reuses** — existing code to leverage (informational)
+- **Test first** — test to run before implementation (optional)
+- **Verify** — command to run after implementation
+- **On failure** — recovery action (revert/retry/skip/escalate)
+- **Checkpoint** — git commit command after success
+
+If a step is missing `On failure`, default to `escalate` and record a parse warning.
+If a step is missing `Verify`, record a parse warning.
+
+### Parse session spec fields (if applicable)
+
+- **Entry condition** from `## Dependencies`
+- **Touch list** and **Never-touch list** from `## Scope Fence`
+- **Exit condition** checklist from `## Exit Condition`
+
+### Parse Execution Strategy (if present)
+
+If the plan contains an `## Execution Strategy` section, extract:
+- Each `### Session N: {title}` with its Steps, Wave, Depends on, and Scope fence
+- The `### Execution Order` with wave definitions
+
+Set **has_execution_strategy = true**.
+
+Report:
+```
+Type: {plan | session-spec}
+Steps: {N}
+{if has_execution_strategy}: Execution Strategy: {S} sessions across {W} waves
+{if session spec}: Entry condition: {text}
+{if session spec}: Scope fence: {N} touch, {N} never-touch
+{if warnings}: Warnings: {list}
+```
+
+## Phase 2.5 — Execution strategy decision
+
+Determine how to execute this plan:
+
+**Run as single session (foreground)** when ANY of these are true:
+- `--fg` flag is set
+- `--step N` mode
+- `--resume` mode
+- `--session N` mode (runs only that session's steps, foreground)
+- Plan has no `## Execution Strategy` section
+- Plan has Execution Strategy with only 1 session
+
+**Run as multi-session (parallel orchestration)** when ALL of these are true:
+- mode = `execute` (default, no --fg)
+- Plan has `## Execution Strategy` with 2+ sessions
+- At least one wave has 2+ sessions (parallelism possible)
+
+**Run as multi-session (sequential orchestration)** when:
+- mode = `execute` (default, no --fg)
+- Plan has `## Execution Strategy` with 2+ sessions
+- All sessions are in different waves (no parallelism, but still separate sessions)
+
+For single-session: continue to Phase 3.
+For multi-session: jump to Phase 2.6.
+
+Report:
+```
+Strategy: {single session | N sessions (M parallel, K sequential)}
+```
+
+## Phase 2.6 — Multi-session orchestration
+
+**Only runs for multi-session execution.** This phase launches headless child
+sessions and collects results. After this phase, jump directly to Phase 8
+(final report).
+
+### Step 0 — Billing safety check (MANDATORY)
+
+Before launching ANY `claude -p` process, check the environment:
+
+```bash
+echo "${ANTHROPIC_API_KEY:+SET}"
+```
+
+If the result is `SET`, **STOP** and warn the user. `claude -p` sessions with
+`ANTHROPIC_API_KEY` in the environment bill the **API account** (pay-per-token),
+not the user's Claude subscription (Max/Pro). Parallel Opus sessions can cost
+$50–100+ per run.
+
+Use AskUserQuestion with these options:
+
+**Question:** "ANTHROPIC_API_KEY is set in your environment. Parallel `claude -p`
+sessions will bill your API account, not your Claude subscription. How do you
+want to proceed?"
+
+| Option | Description |
+|--------|-------------|
+| **Use --fg instead (Recommended)** | Run all steps sequentially in this session using your subscription. No extra cost. |
+| **Continue with API billing** | Launch parallel sessions. Each session bills your API account at token rates. |
+| **Stop** | Cancel execution. Unset ANTHROPIC_API_KEY first, then re-run. |
+
+If the user chooses `--fg`: restart execution with mode = foreground (jump back
+to Phase 3, single-session).
+
+If the user chooses `Continue`: proceed with Phase 2.6 Step 1.
+
+If the user chooses `Stop`: report "Execution cancelled — billing safety check"
+and stop.
+
+If `ANTHROPIC_API_KEY` is NOT set: proceed silently to Step 1.
+
+### Step 1 — Create session log directory
+
+```bash
+mkdir -p .claude/ultraplan-sessions/{slug}/logs
+```
+
+### Step 2 — Execute waves
+
+For each wave (in order):
+
+**Launch sessions in this wave:**
+
+For each session in the wave, launch a headless `claude -p` process:
+
+```bash
+claude -p "/ultraexecute-local --session {N} {plan-path}" \
+  > .claude/ultraplan-sessions/{slug}/logs/session-{N}.log 2>&1 &
+```
+
+If the wave has only 1 session, run it without `&` (no background needed).
+
+Track PIDs for parallel sessions.
+
+**Wait for wave completion:**
+
+```bash
+wait {PID1} {PID2} ...
+```
+
+**Check results after each wave:**
+
+For each session in the wave, read its log file and grep for
+`"ultraexecute_summary"`. Parse the JSON to determine:
+- Did the session complete? (`result: "completed"`)
+- Did it fail? (`result: "failed"` or `"stopped"`)
+
+If ANY session in the wave failed:
+```
+Wave {W} FAILED: Session {N} failed at step {S}.
+Stopping — later waves depend on this wave.
+See log: .claude/ultraplan-sessions/{slug}/logs/session-{N}.log
+```
+Do NOT start later waves. Jump to Phase 8 with partial results.
+
+If all sessions in the wave passed: continue to the next wave.
+
+### Step 3 — Run master verification
+
+After all waves complete successfully, run the plan's `## Verification` section
+commands to verify the integrated result.
+
+### Step 4 — Aggregate results
+
+Collect all session summaries into an aggregated report. Jump to Phase 8.
+
+### --session N mode
+
+When mode = `session N`:
+1. Find session N in the Execution Strategy
+2. Extract its step numbers (e.g., Steps: 4, 5, 6)
+3. Extract its scope fence (Touch / Never touch lists)
+4. Execute ONLY those steps, in order, using the single-session protocol (Phase 3→7)
+5. Enforce the session's scope fence as if it were a session spec's scope fence
+6. Report results for those steps only
+
+This mode is used internally by Phase 2.6 when launching child sessions.
+It can also be used manually to re-run a specific session.
+
+## Phase 3 — Progress file setup
+
+The progress file lives at `{plan-dir}/.ultraexecute-progress-{slug}.json` where
+`{slug}` is the plan filename without extension.
+
+### Progress file schema
+
+```json
+{
+  "schema_version": "1",
+  "plan": "{path}",
+  "plan_type": "{plan | session-spec}",
+  "started_at": "{ISO-8601}",
+  "updated_at": "{ISO-8601}",
+  "mode": "{execute | resume | step}",
+  "total_steps": 0,
+  "current_step": 0,
+  "status": "{in-progress | completed | failed | stopped}",
+  "steps": {
+    "1": { "status": "pending", "attempts": 0, "error": null, "completed_at": null, "commit": null }
+  },
+  "entry_condition_checked": false,
+  "exit_condition_checked": false,
+  "summary": null
+}
+```
+
+### Mode-specific behavior
+
+**mode = execute (fresh):**
+- If a progress file exists with status `in-progress` or `failed`: warn that
+  `--resume` is available, then wait 3 seconds (`sleep 3`) and start fresh.
+  This allows headless runs to proceed without blocking.
+- Otherwise: create the progress file with all steps in `pending` status.
+
+**mode = resume:**
+- If no progress file exists: start from step 1 (same as fresh execute).
+- If progress file exists: find the first step with status != `passed`.
+  ```
+  Resuming from step {N}. {M}/{total} steps already completed.
+  ```
+
+**mode = dry-run:**
+- Do NOT create or modify the progress file.
+
+**mode = step N:**
+- Create the progress file if it does not exist.
+- Only step N will be executed.
+
+## Phase 4 — Entry condition check (session specs only)
+
+**Skip for ultraplans.** Skip in dry-run mode (report what would be checked instead).
+
+Read the entry condition. Evaluate it:
+
+- `"none"` or similar → pass immediately
+- References git state (e.g., "git status clean") → run `git status --porcelain`
+- References passing tests → run the specified command
+- References a previous session → check `git log --oneline` for commit pattern
+
+If the entry condition **fails**:
+```
+Entry condition FAILED: {condition text}
+Reason: {what was checked, what was found}
+Complete the prerequisite first, then re-run.
+```
+Update progress file with `status: "stopped"`. Stop execution.
+
+If the entry condition **passes**:
+```
+Entry condition: PASS
+```
+Update `entry_condition_checked: true` in the progress file.
+
+## Phase 5 — Dry-run report (dry-run mode only)
+
+**Only runs when mode = dry-run.** Produces a validation report, then stops.
+
+```
+## Dry Run Report: {filename}
+
+**Type:** {plan | session-spec}
+**Steps:** {N}
+
+### Step Validation
+
+| Step | Description | Verify | On failure | Checkpoint | Issues |
+|------|-------------|--------|------------|------------|--------|
+| 1 | {desc} | {cmd} | {action} | {msg} | {none / missing X} |
+
+### File References
+
+{For each file in Files: fields, check existence with Glob}
+- {path}: EXISTS | NOT FOUND {(marked as new file) | (unexpected — may be missing)}
+
+### Entry / Exit Conditions (session specs)
+
+{What would be checked}
+
+### Execution Preview (only when plan has Execution Strategy)
+
+If `has_execution_strategy = true`, show a preview of multi-session orchestration:
+
+```
+**Sessions:** {S} across {W} waves
+
+| Wave | Session | Steps | Depends on | Command |
+|------|---------|-------|------------|---------|
+| 1 | Session 1: {title} | {nums} | none | `claude -p "/ultraexecute-local --session 1 {path}"` |
+| 1 | Session 2: {title} | {nums} | none | `claude -p "/ultraexecute-local --session 2 {path}"` |
+| 2 | Session 3: {title} | {nums} | S1, S2 | `claude -p "/ultraexecute-local --session 3 {path}"` |
+```
+
+Check billing status via `echo "${ANTHROPIC_API_KEY:+SET}"` and report:
+```
+Billing: ANTHROPIC_API_KEY is {SET — parallel sessions will bill API account | NOT SET — sessions will use subscription}
+```
+
+### Verdict
+
+{READY | NEEDS ATTENTION — N issues found}
+```
+
+Stop after the dry-run report. Do not execute anything.
+
+## Phase 6 — Step execution loop
+
+The core execution phase. Runs for modes: `execute`, `resume`, `step`.
+
+### Determine starting step
+
+- **execute**: step 1
+- **resume**: first step where status != `passed`
+- **step N**: step N only
+
+### For each step
+
+Update progress: `steps.{N}.status = "running"`, `current_step = N`, `updated_at = now`.
+
+```
+--- Step {N}/{total}: {description} ---
+```
+
+#### Sub-step A — Scope fence check (session specs only)
+
+Before touching any file, verify that every file in the step's `Files:` field is
+in the session spec's Touch list (or is a new file to create). If ANY file is in
+the Never-touch list:
+
+```
+SCOPE VIOLATION: Step {N} requires {file} which is in the never-touch list.
+Escalating — this step cannot be executed within this session's scope.
+```
+
+Treat this as an automatic `escalate`. Jump to the stop-and-report logic.
+
+#### Sub-step B — Test first (if present)
+
+If the step has a `Test first:` field:
+1. If test file is marked `(new)`: note it will be created during implementation.
+2. If test file exists: run it. Expect failure (RED state).
+3. If test unexpectedly passes: warn but continue — step may already be done.
+
+Do not block on test-first failures — they are expected.
+
+#### Sub-step C — Implement changes
+
+Read the step's `Files:` and `Changes:` fields. Implement exactly as described.
+
+**Rules:**
+- Follow `Changes:` exactly — do not improvise, add scope, or optimize
+- Use Edit for modifications, Write for new files
+- If `Reuses:` references existing code, read that code first for context
+- Only touch files listed in `Files:` — nothing else
+
+#### Sub-step D — Verification
+
+Run the `Verify:` command exactly as written, via Bash.
+
+**Rules:**
+- Always a fresh run — never trust prior results
+- Exit code is the authoritative truth:
+  - Exit 0 + expected output (if specified) = **PASS**
+  - Exit non-zero = **FAIL** regardless of output text
+  - Exit 0 but wrong output = **FAIL**
+
+```
+Verify: {command}
+Result: {PASS | FAIL} (exit code {N})
+{if FAIL}: Output (first 10 lines): {output}
+```
+
+If **PASS**: proceed to Sub-step F (checkpoint).
+
+#### Sub-step E — On failure handling
+
+If **FAIL**, read the `On failure:` clause. Apply the retry cap: **maximum 2 retries**
+(3 total attempts). Track attempts in `steps.{N}.attempts`.
+
+**`On failure: revert`**
+- If attempts < 3: analyze the failure, re-implement with adjustments, re-verify.
+  ```
+  Attempt {A}/3 failed. Retrying...
+  ```
+- If attempts == 3: revert this step's changes:
+  ```bash
+  git checkout -- {files from Files: field}
+  ```
+  Record failure. **Do NOT proceed to next step.** Jump to Phase 7.
+
+**`On failure: retry`**
+- If attempts < 3: use the alternative approach described in the On failure clause.
+- If attempts == 3: revert and stop. Jump to Phase 7.
+
+**`On failure: skip`**
+- Mark step as skipped regardless of attempt count. Continue to next step.
+  ```
+  Step {N}: SKIPPED (non-critical per plan)
+  ```
+  Update `steps.{N}.status = "skipped"`.
+
+**`On failure: escalate`**
+- Stop immediately regardless of attempt count.
+  ```
+  Step {N}: ESCALATED — requires human judgment
+  ```
+  Commit all completed work before stopping. Stage ONLY files from steps with
+  `status: "passed"` in the progress file — collect their `Files:` fields. Never
+  use `git add -A` (risks staging secrets, binaries, or unrelated work).
+  ```bash
+  git add {files from passed steps' Files: fields} && git commit -m "wip: ultraexecute-local stopped at step {N} — escalation needed"
+  ```
+  Jump to Phase 7.
+
+#### Sub-step F — Checkpoint
+
+Run the `Checkpoint:` git commit command exactly as written in the plan.
+
+If the commit fails (nothing to commit, etc.): warn but do NOT fail the step.
+The step's verification already passed — the commit is bookkeeping.
+
+```
+Step {N}: PASS (committed: {hash})
+```
+
+Update progress: `steps.{N}.status = "passed"`, `steps.{N}.commit = {hash}`,
+`steps.{N}.completed_at = now`.
+
+### Step mode exit
+
+If mode = `step N`: after completing step N (pass or fail), skip remaining steps
+and jump to Phase 8 (final report).
+
+## Phase 7 — Exit condition check (session specs only)
+
+**Skip for ultraplans.** Run only when all steps passed (not on early stop).
+
+Run each exit condition command from the `## Exit Condition` checklist:
+
+```
+Exit condition check:
+- [ ] {command} → {PASS | FAIL}
+- [ ] {command} → {PASS | FAIL}
+```
+
+If all pass: `exit_condition_checked: true` in progress file.
+If any fail: record which failed. Include in final report.
+
+## Phase 8 — Final report
+
+Always produce a final report.
+
+Update progress file: `status` to `completed`/`failed`/`stopped`, `updated_at`, `summary`.
+
+```
+## Ultraexecute Local Complete
+
+**Plan:** {path}
+**Type:** {plan | session-spec}
+**Mode:** {execute | resume | step N}
+**Result:** {COMPLETED | FAILED at step N | STOPPED (escalation) | PARTIAL (N/total passed)}
+
+### Step Results
+
+| Step | Description | Result | Attempts | Commit |
+|------|-------------|--------|----------|--------|
+| 1 | {desc} | PASS | 1 | abc1234 |
+| 2 | {desc} | FAIL | 3 | — |
+| 3 | {desc} | — | 0 | — |
+
+### Summary
+
+- Passed: {N}/{total}
+- Skipped: {N}
+- Failed: {N}
+- Not reached: {N}
+
+{if all passed + exit condition passed}:
+All steps completed. Exit condition: PASS.
+
+{if failed/stopped}:
+### Failure Details
+
+Step {N}: {description}
+On failure: {action}
+Error: {error output, first 20 lines}
+Attempts: {N}
+
+### What Remains
+
+{Numbered list of unexecuted steps}
+
+To resume: /ultraexecute-local --resume {path}
+```
+
+**JSON summary block** (always at the end, machine-parseable):
+
+```json
+{
+  "ultraexecute_summary": {
+    "plan": "{path}",
+    "plan_type": "{plan | session-spec}",
+    "result": "{completed | failed | stopped | partial}",
+    "steps_total": 0,
+    "steps_passed": 0,
+    "steps_failed": 0,
+    "steps_skipped": 0,
+    "steps_not_reached": 0,
+    "failed_at_step": null,
+    "exit_condition": "{pass | fail | skipped | n/a}",
+    "progress_file": "{path}"
+  }
+}
+```
+
+The `ultraexecute_summary` key makes it grep-able in log files from headless runs.
+
+## Phase 9 — Stats tracking
+
+Append one record to `${CLAUDE_PLUGIN_DATA}/ultraexecute-stats.jsonl`:
+
+```json
+{
+  "ts": "{ISO-8601}",
+  "plan": "{filename only}",
+  "plan_type": "{plan | session-spec}",
+  "mode": "{execute | resume | dry-run | step}",
+  "result": "{completed | failed | stopped | partial}",
+  "steps_total": 0,
+  "steps_passed": 0,
+  "steps_failed": 0,
+  "steps_skipped": 0,
+  "failed_at_step": null
+}
+```
+
+If `${CLAUDE_PLUGIN_DATA}` is not set or not writable, skip silently.
+Never let stats failures block the workflow.
+
+## Hard rules
+
+1. **No AskUserQuestion for execution decisions.** All execution decisions come
+   from the plan's On failure clauses. If the plan says escalate, stop and
+   report — never ask. **Exception:** the billing safety check in Phase 2.6
+   Step 0 MUST ask before spending money on the user's API account.
+
+2. **No scope creep.** Only touch files listed in the step's `Files:` field.
+   If a file outside the list seems to need changing, record it as a finding
+   in the final report — do not touch it.
+
+3. **Exit code is truth.** The Verify command's exit code is authoritative.
+   Non-zero = FAIL regardless of output. Zero with wrong output = FAIL.
+
+4. **Fresh verification.** Re-run the Verify command from scratch every time.
+   Never trust cached or prior results.
+
+5. **Retry cap = 3 attempts.** Initial + 2 retries, then stop. Never loop forever.
+
+6. **Never corrupt completed work.** Only revert files from the failing step.
+   Never touch files from earlier passed steps.
+
+7. **Checkpoint discipline.** Run the Checkpoint commit exactly as written.
+   Do not combine, reorder, or skip checkpoints on passed steps.
+
+8. **Scope fence enforcement.** For session specs: never modify files in the
+   Never-touch list, regardless of what the Changes field says.
+
+9. **Progress file is ground truth.** Resume uses the progress file, not git log.
+
+10. **No sub-agents.** The executor reads and implements directly.
+    No Agent tool, no TeamCreate, no delegation.
--- a/plugins/ultraplan-local/commands/ultraplan-local.md
+++ b/plugins/ultraplan-local/commands/ultraplan-local.md
@ -0,0 +1,685 @@
+---
+name: ultraplan-local
+description: Deep implementation planning with interview, parallel specialized agents, external research, and optional background execution
+argument-hint: "[--spec spec.md | --fg] <task description>"
+model: opus
+allowed-tools: Agent, Read, Glob, Grep, Write, Edit, Bash, AskUserQuestion, TaskCreate, TaskUpdate, TeamCreate, TeamDelete
+---
+
+# Ultraplan Local v1.0
+
+Deep, multi-phase implementation planning. Uses an interview to gather requirements,
+adaptive specialized agent swarms for exploration, external research for unfamiliar
+technologies, and adversarial review to stress-test the plan.
+
+## Phase 1 — Parse mode and validate input
+
+Parse `$ARGUMENTS` for mode flags:
+
+1. If arguments start with `--spec `: extract the file path after `--spec`.
+   Set **mode = spec-driven**. Read the spec file. If it does not exist, report
+   the error and stop.
+
+2. If arguments start with `--fg `: extract the task description after `--fg`.
+   Set **mode = foreground**.
+
+3. If arguments start with `--quick `: extract the task description after `--quick`.
+   Set **mode = quick**.
+
+4. If arguments start with `--export `: extract the remainder as `{format} {plan-path}`.
+   Split on the first space: format is the first token, plan path is the rest.
+   Valid formats: `pr`, `issue`, `markdown`, `headless`.
+   Set **mode = export**.
+
+   If the format is not one of pr/issue/markdown/headless, report and stop:
+   ```
+   Error: unknown export format '{format}'. Valid: pr, issue, markdown, headless
+   ```
+
+   If the plan file does not exist, report and stop:
+   ```
+   Error: plan file not found: {path}
+   ```
+
+5. If arguments start with `--decompose `: extract the plan file path after `--decompose`.
+   Set **mode = decompose**.
+
+   If the plan file does not exist, report and stop:
+   ```
+   Error: plan file not found: {path}
+   ```
+
+6. Otherwise: the entire argument string is the task description.
+   Set **mode = default**.
+
+If no task description and no spec file, output usage and stop:
+
+```
+Usage: /ultraplan-local <task description>
+       /ultraplan-local --spec <path-to-spec.md>
+       /ultraplan-local --fg <task description>
+       /ultraplan-local --quick <task description>
+       /ultraplan-local --export <pr|issue|markdown|headless> <plan-path>
+       /ultraplan-local --decompose <plan-path>
+
+Modes:
+  default       Interview (interactive) → background planning → notify when done
+  --spec        Skip interview, use provided spec → background planning
+  --fg          All phases in foreground (blocks session)
+  --quick       Interview → plan directly (no agent swarm) → adversarial review
+  --export      Generate shareable output from an existing plan (no new planning)
+  --decompose   Split an existing plan into self-contained headless sessions
+
+Examples:
+  /ultraplan-local Add user authentication with JWT tokens
+  /ultraplan-local --spec .claude/ultraplan-spec-2026-04-05-jwt-auth.md
+  /ultraplan-local --fg Refactor the database layer to use connection pooling
+  /ultraplan-local --quick Add rate limiting to the API
+  /ultraplan-local --export pr .claude/plans/ultraplan-2026-04-06-rate-limiting.md
+  /ultraplan-local --export headless .claude/plans/ultraplan-2026-04-06-rate-limiting.md
+  /ultraplan-local --decompose .claude/plans/ultraplan-2026-04-06-rate-limiting.md
+```
+
+Do not continue past this step if no task was provided.
+
+Report the detected mode to the user:
+```
+Mode: {default | spec-driven | foreground}
+Task: {task description or "from spec: {path}"}
+```
+
+## Phase 1.5 — Export (runs only when mode = export)
+
+**Skip this phase entirely unless mode = export.**
+
+Read the plan file. Extract these sections from the plan content:
+- Task description (from Context section)
+- Implementation steps (from Implementation Plan section)
+- Risks (from Risks and Mitigations section)
+- Test strategy (from Test Strategy section, if present)
+- Scope estimate (from Estimated Scope section)
+
+### Format: `pr`
+
+Output a markdown block formatted as a PR description:
+
+```
+## Summary
+
+{2–3 sentence summary of what this change does and why}
+
+## Changes
+
+{Bulleted list of implementation steps, one line each}
+
+## Test plan
+
+{Bulleted checklist from test strategy, formatted as - [ ] items}
+
+## Risks
+
+{Risks from plan, abbreviated to 1 line each}
+
+---
+*Generated by ultraplan-local from {plan filename}*
+```
+
+### Format: `issue`
+
+Output a markdown block formatted as an issue comment:
+
+```
+## Implementation plan summary
+
+**Task:** {task description}
+**Plan file:** {plan path}
+**Scope:** {N files, complexity}
+
+### Proposed approach
+{3–5 bullet points from key implementation steps}
+
+### Open questions / risks
+{Top 2–3 risks from plan}
+
+---
+*Generated by ultraplan-local*
+```
+
+### Format: `markdown`
+
+Output the plan content with internal metadata stripped:
+- Remove the "Revisions" section
+- Remove plan-critic and scope-guardian scores/verdicts
+- Remove `[ASSUMPTION]` markers (but keep the surrounding sentence)
+- Keep everything else verbatim
+
+### Format: `headless`
+
+This is a shortcut for `--decompose`. It runs the full session decomposition
+pipeline and is equivalent to `--decompose {plan-path}`. Proceed to
+Phase 1.6 (Decompose) below.
+
+---
+
+After outputting the formatted block (for pr/issue/markdown), say:
+```
+Export complete ({format}). Copy the block above.
+```
+
+Then **stop**. Do not continue to Phase 2 or any subsequent phase.
+
+## Phase 1.6 — Decompose (runs only when mode = decompose or export headless)
+
+**Skip this phase entirely unless mode = decompose or export format = headless.**
+
+Read the plan file. Verify it contains an Implementation Plan section with
+numbered steps. If no steps are found, report and stop:
+```
+Error: plan has no implementation steps. Run /ultraplan-local first to generate a plan.
+```
+
+Determine the output directory from the plan slug:
+- Extract the slug from the plan filename (e.g., `ultraplan-2026-04-06-auth-refactor` → `auth-refactor`)
+- Output directory: `.claude/ultraplan-sessions/{slug}/`
+
+Launch the **session-decomposer** agent:
+
+```
+Plan file: {plan path}
+Plugin root: ${CLAUDE_PLUGIN_ROOT}
+Output directory: .claude/ultraplan-sessions/{slug}/
+```
+
+The session-decomposer will:
+1. Parse the plan's steps and their file dependencies
+2. Build a dependency graph between steps
+3. Group steps into sessions of 3–5 steps each
+4. Identify which sessions can run in parallel (waves)
+5. Generate one session spec file per session
+6. Generate a dependency diagram (mermaid)
+7. Generate a launch script (`launch.sh`)
+
+When the session-decomposer completes, present the summary to the user:
+
+```
+## Decomposition Complete
+
+**Master plan:** {plan path}
+**Sessions:** {N} across {W} waves
+**Output:** .claude/ultraplan-sessions/{slug}/
+
+### Sessions
+
+| # | Title | Steps | Wave | Parallel |
+|---|-------|-------|------|----------|
+{session table from decomposer}
+
+### Files generated
+
+- Session specs: .claude/ultraplan-sessions/{slug}/session-*.md
+- Dependency graph: .claude/ultraplan-sessions/{slug}/dependency-graph.md
+- Launch script: .claude/ultraplan-sessions/{slug}/launch.sh
+
+You can:
+- Review individual session specs before running
+- Run all sessions: `bash .claude/ultraplan-sessions/{slug}/launch.sh`
+- Run a single session: `claude -p "$(cat .claude/ultraplan-sessions/{slug}/session-1-*.md)"`
+- Say **"launch"** to start headless execution from here
+```
+
+If the user says **"launch"**: run the launch script via Bash.
+
+Then **stop**. Do not continue to Phase 2 or any subsequent phase.
+
+## Phase 2 — Requirements gathering (interview)
+
+**Skip this phase entirely if mode = spec-driven.** Proceed to Phase 3.
+
+Use `AskUserQuestion` to interview the user about the task. Ask **one question at
+a time** — never dump all questions at once. Follow up based on answers.
+
+### Interview flow
+
+**Start with the most important question:**
+> What is the goal of this task? What does success look like?
+
+**Then ask follow-ups based on the answer. Choose from these topics:**
+- What is explicitly NOT in scope? (non-goals)
+- Are there technical constraints? (specific versions, compatibility, no new dependencies)
+- Do you have preferences? (library X over Y, specific patterns, architectural style)
+- Are there non-functional requirements? (performance targets, security needs, accessibility)
+- Has anything been tried before? What worked or failed?
+
+**Rules:**
+- Ask 3–5 questions for typical tasks. Maximum 8 for complex tasks.
+- If the user says "skip", "proceed", "just plan it", or similar — stop interviewing
+  immediately. Write a minimal spec from the task description alone.
+- Adapt your questions to what the user tells you. If they give a detailed task
+  description, skip obvious questions.
+- Never ask about things you can discover from the codebase.
+
+### Adaptive depth
+
+After each answer, assess the response length and vocabulary:
+
+- **Detailed answer** (2+ sentences, technical terminology, specific examples):
+  - Treat the user as senior — they know the codebase
+  - Skip obvious follow-ups they already answered
+  - Ask more targeted questions: constraints, edge cases, specific technical choices
+  - Reduce question count: aim for 3–4 total instead of 5
+
+- **Short or uncertain answer** (1 sentence or less, "I don't know", "not sure", vague):
+  - Treat the user as unfamiliar with the problem space
+  - Simplify follow-up questions — avoid open-ended technical questions
+  - Offer alternatives instead of asking open questions:
+    > "Should this be synchronous or asynchronous? (synchronous is simpler; async handles more concurrent users)"
+  - For bugs: focus on reproduction before requirements:
+    > "What do you see? What did you expect to see?"
+  - Allow "I don't know" as a valid answer — record it as an open assumption in the spec
+
+Never change your question count based on impatience. Only change depth based
+on answer quality.
+
+### Write the spec file
+
+After gathering requirements, read the spec template:
+@${CLAUDE_PLUGIN_ROOT}/templates/spec-template.md
+
+Generate a slug from the task (first 3-4 meaningful words, lowercase, hyphens).
+Write the spec to: `.claude/ultraplan-spec-{YYYY-MM-DD}-{slug}.md`
+
+Create the `.claude/` directory if it does not exist.
+
+Fill in all sections based on interview answers. Mark unanswered sections with
+"Not discussed — no constraints assumed."
+
+Tell the user:
+```
+Spec saved: .claude/ultraplan-spec-{date}-{slug}.md
+```
+
+## Phase 3 — Background transition
+
+**If mode = foreground or quick:** Skip this phase. Continue to Phase 4 inline.
+
+**If mode = default or spec-driven:**
+
+Launch the **planning-orchestrator** agent with this prompt:
+
+```
+Spec file: {spec path}
+Task: {task description}
+Mode: {default | spec | quick}
+Plan destination: .claude/plans/ultraplan-{YYYY-MM-DD}-{slug}.md
+Plugin root: ${CLAUDE_PLUGIN_ROOT}
+
+Read the spec file and execute your full planning workflow.
+Write the plan to the destination path.
+```
+
+Launch the planning-orchestrator via the Agent tool with `run_in_background: true`.
+The agent runs autonomously while you continue working — you will be notified
+when the plan is ready.
+
+Then output to the user and **stop your response**:
+```
+Background planning started via planning-orchestrator.
+
+  Spec: .claude/ultraplan-spec-{date}-{slug}.md
+  Plan: .claude/plans/ultraplan-{date}-{slug}.md
+
+You will be notified when the plan is ready.
+You can continue working on other tasks in the meantime.
+```
+
+Do not wait for the orchestrator. Do not continue to Phase 4.
+The planning-orchestrator handles Phases 4 through 10 autonomously.
+
+---
+
+**Everything below this line runs either in foreground mode or inside the
+background agent. The instructions are identical regardless of context.**
+
+---
+
+## Phase 4 — Codebase sizing
+
+Determine codebase scale to calibrate agent turns (not agent count).
+
+Run via Bash:
+```
+find . -type f \( -name "*.ts" -o -name "*.tsx" -o -name "*.js" -o -name "*.jsx" -o -name "*.py" -o -name "*.go" -o -name "*.rs" -o -name "*.java" -o -name "*.rb" -o -name "*.c" -o -name "*.cpp" -o -name "*.h" -o -name "*.cs" -o -name "*.swift" -o -name "*.kt" -o -name "*.sh" -o -name "*.md" \) -not -path "*/node_modules/*" -not -path "*/.git/*" -not -path "*/vendor/*" -not -path "*/dist/*" -not -path "*/build/*" | wc -l
+```
+
+Classify:
+- **Small** (< 50 files)
+- **Medium** (50–500 files)
+- **Large** (> 500 files)
+
+Report:
+```
+Codebase: {N} source files ({scale}). Deploying exploration agents.
+```
+
+## Phase 4b — Spec review
+
+Launch the **spec-reviewer** agent:
+Prompt: "Review this spec for quality: {spec path}. Check completeness, consistency,
+testability, and scope clarity."
+
+Handle the verdict:
+- **PROCEED** — continue to Phase 5.
+- **PROCEED_WITH_RISKS** — continue, carry flagged risks as `[ASSUMPTION]` in the plan.
+- **REVISE** — in foreground mode, present findings and ask the user for clarification.
+  In background mode, carry all findings as `[ASSUMPTION]` entries.
+
+## Phase 5 — Parallel exploration (specialized agents + research)
+
+**If mode = quick:** Do NOT launch any exploration agents. Instead, run a
+lightweight file check:
+- `Glob` for files matching key terms from the task description (up to 3 patterns)
+- `Grep` for function/type definitions matching key terms (up to 3 patterns)
+
+Report findings as:
+```
+Quick scan: {N} potentially relevant files found via Glob/Grep.
+No agent swarm — proceeding directly to planning.
+```
+
+Then skip Phase 6 (deep-dives) and proceed to Phase 7 (Synthesis) with only
+the quick-scan results.
+
+---
+
+**All other modes:** Launch exploration agents **in parallel** (all in a single
+message). Use the specialized agents from the `agents/` directory.
+
+**All agents run for all codebase sizes.** Scale `maxTurns` by size (small: halved,
+medium: default, large: default) instead of dropping agents.
+
+| Agent | Small | Medium | Large | Purpose |
+|-------|-------|--------|-------|---------|
+| `architecture-mapper` | Yes | Yes | Yes | Codebase structure, patterns, anti-patterns |
+| `dependency-tracer` | Yes | Yes | Yes | Module connections, data flow, side effects |
+| `risk-assessor` | Yes | Yes | Yes | Risks, edge cases, failure modes |
+| `task-finder` | Yes | Yes | Yes | Task-relevant files, functions, types, reuse candidates |
+| `test-strategist` | Yes | Yes | Yes | Test patterns, coverage gaps, strategy |
+| `git-historian` | Yes | Yes | Yes | Recent changes, ownership, hot files, active branches |
+| `research-scout` | Conditional | Conditional | Conditional | External docs (only when unfamiliar tech detected) |
+| `convention-scanner` | No | Yes | Yes | Coding conventions, naming, style, test patterns |
+
+### Always launch (all codebase sizes):
+
+**architecture-mapper** — full codebase structure, tech stack, patterns, anti-patterns.
+Prompt: "Analyze the architecture of this codebase. The task being planned is: {task}"
+
+**dependency-tracer** — module connections, data flow, side effects for task-relevant code.
+Prompt: "Trace dependencies and data flow relevant to this task: {task}. Focus on modules
+that will be affected by the implementation."
+
+**risk-assessor** — risks, edge cases, failure modes, technical debt near task area.
+Prompt: "Assess risks and failure modes for implementing this task: {task}. Check for
+complexity hotspots, security boundaries, and technical debt in the relevant code."
+
+**task-finder** — all files, functions, types, and interfaces directly related to the task.
+Prompt: "Find all code relevant to this task: {task}. Include existing implementations
+that solve similar problems, API boundaries, database models, configuration files.
+Report file paths and line numbers for every finding."
+
+**test-strategist** — existing test patterns, coverage gaps, test strategy.
+Prompt: "Analyze the test infrastructure and design a test strategy for this task: {task}.
+Discover existing patterns and identify coverage gaps."
+
+**git-historian** — recent changes, code ownership, hot files, active branches.
+Prompt: "Analyze git history relevant to this task: {task}. Report recent changes,
+ownership, hot files, and active branches that may affect planning."
+
+### Launch for medium+ codebases (50+ files):
+
+**Convention Scanner** — use the `convention-scanner` plugin agent (model: "sonnet")
+for medium+ codebases only.
+Provide concrete examples from the codebase, not generic advice."
+
+### Conditional: External research
+
+After reading the task description and spec (if available), determine if the task
+involves technologies, APIs, or libraries that are:
+- Not clearly present in the codebase
+- Being upgraded to a new major version
+- Being used in an unfamiliar way
+
+If yes: launch **research-scout** in parallel with the other agents.
+Prompt: "Research the following technologies for this task: {task}.
+Specific questions: {list specific questions about the technology}.
+Technologies to research: {list}."
+
+If no external technology is involved: skip research-scout and note:
+"No external research needed — all technologies are well-represented in the codebase."
+
+## Phase 6 — Targeted deep-dives
+
+After all Phase 5 agents complete, review their results and identify **knowledge gaps**
+— areas where exploration was too shallow to plan confidently.
+
+Common reasons for deep-dives:
+- A critical function was found but its implementation details are unclear
+- A dependency chain needs tracing to understand side effects
+- A test pattern was identified but the test infrastructure needs more detail
+- A risk was flagged but the actual impact needs verification
+
+For each significant gap, spawn a targeted deep-dive agent (model: "sonnet",
+subagent_type: "Explore") with a narrow, specific brief.
+
+Launch up to 3 deep-dive agents in parallel. If no gaps exist, skip this phase
+and note: "Initial exploration was sufficient — no deep-dives needed."
+
+## Phase 7 — Synthesis
+
+After all agents complete (initial + deep-dives + research), synthesize:
+
+1. Read all agent results carefully
+2. Identify overlaps and contradictions between agents
+3. Build a mental model of the codebase architecture
+4. Catalog reusable code: existing functions, utilities, patterns
+5. Integrate research findings with codebase analysis
+6. Note remaining gaps — things you cannot determine from code or research
+   (these become assumptions in the plan, marked explicitly)
+7. For each finding, track whether it came from **codebase analysis** or
+   **external research** — the plan must distinguish these sources
+
+Do NOT write this synthesis to disk. It is internal working context only.
+
+## Phase 8 — Deep planning
+
+Read the spec file (from Phase 2 or provided via --spec).
+Read the plan template: @${CLAUDE_PLUGIN_ROOT}/templates/plan-template.md
+
+Write the plan following the template structure. The plan MUST include:
+
+### Required sections
+
+1. **Context** — Why this change is needed. Reference the spec's goal and constraints.
+2. **Codebase Analysis** — Tech stack, patterns, relevant files, reusable code,
+   external tech researched. Every file path must be real (verified during exploration).
+3. **Research Sources** — If research-scout was used: table of technologies, sources,
+   findings, and confidence levels. Omit if no research was conducted.
+4. **Implementation Plan** — Ordered steps. Each step specifies:
+   - Exact files to modify or create (with paths)
+   - What changes to make and why
+   - Which existing code to reuse
+   - Dependencies on other steps
+   - Whether the step is based on codebase analysis or external research
+   - **On failure:** — recovery action (revert/retry/skip/escalate)
+   - **Checkpoint:** — git commit command after success
+10. **Execution Strategy** — For plans with > 5 steps: group steps into sessions
+    (3–5 steps each), organize sessions into waves (parallel where independent),
+    specify scope fences per session. Omit for plans with ≤ 5 steps.
+5. **Alternatives Considered** — At least one alternative approach with
+   pros/cons and reason for rejection.
+6. **Risks and Mitigations** — From the risk-assessor findings. What could go
+   wrong and how to handle it.
+7. **Test Strategy** — From the test-strategist findings (if available).
+   What tests to write and which patterns to follow.
+8. **Verification** — Testable criteria. Not "check that it works" but
+   specific commands to run and expected outputs.
+9. **Estimated Scope** — File counts and complexity rating.
+
+### Quality standards
+
+- Every file path in the plan must exist in the codebase (or be explicitly
+  marked as "new file to create")
+- Every "reuses" reference must point to a real function/pattern found during
+  exploration
+- Steps must be ordered by dependency (not by file path or importance)
+- Verification criteria must be concrete and executable
+- The plan must be implementable by someone who has not seen the exploration
+  results — it must stand on its own
+- Research-based decisions must cite their source
+
+### Write the plan
+
+Generate the slug from the task description (or reuse the spec slug).
+Write the plan to: `.claude/plans/ultraplan-{YYYY-MM-DD}-{slug}.md`
+Create the `.claude/plans/` directory if it does not exist.
+
+## Phase 9 — Adversarial review
+
+Launch two review agents **in parallel**:
+
+**plan-critic** — adversarial review of the plan.
+Prompt: "Review this implementation plan for the task: {task}.
+Plan file: {plan path}. Read it and find every problem — missing steps,
+wrong ordering, fragile assumptions, missing error handling, scope creep,
+underspecified steps. Rate each finding as blocker, major, or minor."
+
+**scope-guardian** — scope alignment check.
+Prompt: "Check this implementation plan against the requirements.
+Task: {task}. Spec file: {spec path}. Plan file: {plan path}.
+Find scope creep (plan does more than asked) and scope gaps (plan misses
+requirements). Check that referenced files and functions exist."
+
+After both complete:
+- If **blockers** are found: revise the plan to address them. Add a "Revisions"
+  note at the bottom of the plan listing what changed and why.
+- If only **major** issues: revise to address them. Add revisions note.
+- If only **minor** issues or clean: proceed without changes. Note the
+  review result in the plan.
+
+## Phase 10 — Present and refine
+
+Present a summary to the user:
+
+```
+## Ultraplan Complete
+
+**Task:** {task description}
+**Mode:** {default | spec-driven | foreground}
+**Spec:** {spec file path, or "none (foreground mode)"}
+**Plan:** .claude/plans/ultraplan-{date}-{slug}.md
+**Exploration:** {N} agents deployed ({N} specialized + {N} deep-dives + {research status})
+**Scope:** {N} files to modify, {N} to create — {complexity}
+
+### Key decisions
+- {Decision 1 and rationale}
+- {Decision 2 and rationale}
+
+### Implementation steps ({N} total)
+1. {Step 1 summary}
+2. {Step 2 summary}
+...
+
+### Research findings
+{Summary of external research, or "No external research conducted."}
+
+### Adversarial review
+**Plan critic:** {Summary — blockers/majors/minors found, how addressed}
+**Scope guardian:** {Summary — creep/gaps found, how addressed}
+
+You can:
+- Ask questions or request changes to refine the plan
+- Say **"execute"** to start implementing
+- Say **"execute with team"** to implement with parallel Agent Team (if eligible)
+- Say **"save"** to keep the plan for later
+```
+
+If the user asks questions or requests changes:
+- Update the plan file in-place
+- Show what changed
+- Re-present the summary
+
+## Phase 11 — Handoff
+
+### "save" / "later" / "done"
+
+Confirm the plan and spec file locations and exit.
+
+### "execute" / "go" / "start"
+
+Begin implementing the plan step by step in this session. Follow the plan exactly.
+Mark each step complete as you go.
+
+### "execute with team" / "team"
+
+Before creating a team, verify eligibility:
+1. Count implementation steps that are **independent** (no dependency on each other)
+   AND touch **different files/modules**
+2. If fewer than 3 independent steps: inform the user and fall back to sequential
+   execution. "The plan has fewer than 3 independent steps — sequential execution
+   is more efficient."
+
+If eligible:
+1. Present the proposed team split: which steps go to which team member
+2. Ask for confirmation: "Create Agent Team with {N} members? (yes/no)"
+3. If confirmed: create the team with `TeamCreate`, assign step clusters to
+   each member. Use `isolation: "worktree"` on each team member agent so they
+   work in isolated git worktrees — this prevents file conflicts during parallel
+   implementation. Coordinate execution and clean up with `TeamDelete` when done.
+4. If `TeamCreate` fails (tool not available): fall back to sequential execution
+   and notify the user
+
+## Phase 12 — Session tracking
+
+After the plan is presented (Phase 10) or after handoff (Phase 11), write a
+session record to `${CLAUDE_PLUGIN_DATA}/ultraplan-stats.jsonl` (create the file
+if it does not exist).
+
+Record format (one JSON line):
+```json
+{
+  "ts": "{ISO-8601 timestamp}",
+  "task": "{task description (first 100 chars)}",
+  "mode": "{default|spec|fg}",
+  "slug": "{plan slug}",
+  "codebase_size": "{small|medium|large}",
+  "codebase_files": {N},
+  "agents_deployed": {N},
+  "deep_dives": {N},
+  "research": {true|false},
+  "critic_verdict": "{BLOCK|REVISE|PASS}",
+  "guardian_verdict": "{ALIGNED|CREEP|GAP|MIXED}",
+  "outcome": "{execute|execute_team|save|refine}"
+}
+```
+
+If `${CLAUDE_PLUGIN_DATA}` is not set or not writable, skip tracking silently.
+Never let tracking failures block the main workflow.
+
+## Hard rules
+
+- **Scope**: Only explore the current working directory and its subdirectories.
+  Never read files outside the repo (no ~/.env, no credentials, no other repos).
+- **Cost**: Sonnet for all agents (exploration, deep-dives, research, critics).
+  Opus only runs in the main thread for synthesis and planning.
+- **Privacy**: Never log, store, or repeat file contents that look like
+  secrets, tokens, or credentials. Never log prompt text.
+- **No premature execution**: Do not modify any project files until the user
+  explicitly approves the plan.
+- **Plan stands alone**: The plan file must be understandable without access
+  to the exploration results. Include all necessary context.
+- **Honesty**: If exploration reveals the task is trivial (single file, obvious
+  change), say so. Do not inflate the plan to justify the process. Suggest
+  the user just implements it directly.
+- **Adaptive**: Never spawn more agents than the codebase warrants. A 10-file
+  project does not need 7 exploration agents. Scale down.
+- **Research transparency**: Always distinguish codebase-derived decisions from
+  research-derived decisions in the plan.