feat: initial open marketplace with llm-security, config-audit, ultraplan-local
This commit is contained in:
commit
f93d6abdae
380 changed files with 65935 additions and 0 deletions
647
plugins/ultraplan-local/commands/ultraexecute-local.md
Normal file
647
plugins/ultraplan-local/commands/ultraexecute-local.md
Normal file
|
|
@ -0,0 +1,647 @@
|
|||
---
|
||||
name: ultraexecute-local
|
||||
description: Disciplined plan executor — single-session or multi-session with parallel orchestration, failure recovery, and headless support
|
||||
argument-hint: "[--fg | --resume | --dry-run | --step N | --session N] <plan.md>"
|
||||
model: opus
|
||||
allowed-tools: Read, Write, Edit, Bash, Glob, Grep, AskUserQuestion
|
||||
---
|
||||
|
||||
# Ultraexecute Local
|
||||
|
||||
Disciplined executor for ultraplan plans. Reads a plan file, detects if it has
|
||||
an Execution Strategy (multi-session), and either executes directly or
|
||||
orchestrates parallel headless sessions — all to realize one plan.
|
||||
|
||||
Designed to work identically in interactive and headless (`claude -p`) mode.
|
||||
|
||||
## Phase 1 — Parse mode and validate input
|
||||
|
||||
Parse `$ARGUMENTS` for mode flags:
|
||||
|
||||
1. If arguments contain `--fg`: extract the file path. Set **mode = foreground**.
|
||||
2. If arguments contain `--resume`: extract the file path. Set **mode = resume**.
|
||||
3. If arguments contain `--dry-run`: extract the file path. Set **mode = dry-run**.
|
||||
4. If arguments contain `--step N` (N is a positive integer): extract N and the file path.
|
||||
Set **mode = step**, **target-step = N**.
|
||||
5. If arguments contain `--session N` (N is a positive integer): extract N and the file path.
|
||||
Set **mode = session**, **target-session = N**.
|
||||
6. Otherwise: the entire argument string is the file path. Set **mode = execute**.
|
||||
|
||||
If no path is provided, output usage and stop:
|
||||
|
||||
```
|
||||
Usage: /ultraexecute-local <plan.md>
|
||||
/ultraexecute-local --fg <plan.md>
|
||||
/ultraexecute-local --resume <plan.md>
|
||||
/ultraexecute-local --dry-run <plan.md>
|
||||
/ultraexecute-local --step N <plan.md>
|
||||
/ultraexecute-local --session N <plan.md>
|
||||
|
||||
Modes:
|
||||
(default) Auto — multi-session if plan has Execution Strategy, else foreground
|
||||
--fg Force foreground — all steps sequentially, ignore Execution Strategy
|
||||
--resume Resume from last progress checkpoint
|
||||
--dry-run Validate plan and show execution strategy without running
|
||||
--step N Execute only step N (foreground)
|
||||
--session N Execute only session N from the plan's Execution Strategy
|
||||
|
||||
Examples:
|
||||
/ultraexecute-local .claude/plans/ultraplan-2026-04-06-auth-refactor.md
|
||||
/ultraexecute-local --fg .claude/plans/ultraplan-2026-04-06-auth-refactor.md
|
||||
/ultraexecute-local --session 2 .claude/plans/ultraplan-2026-04-06-auth-refactor.md
|
||||
/ultraexecute-local --dry-run .claude/plans/ultraplan-2026-04-06-auth-refactor.md
|
||||
```
|
||||
|
||||
If the file does not exist, report and stop:
|
||||
```
|
||||
Error: file not found: {path}
|
||||
```
|
||||
|
||||
Report detected mode:
|
||||
```
|
||||
Mode: {execute | resume | dry-run | step N}
|
||||
File: {path}
|
||||
```
|
||||
|
||||
## Phase 2 — Detect file type and parse structure
|
||||
|
||||
Read the file. Determine whether it is an **ultraplan** or a **session spec**:
|
||||
|
||||
- **Session spec**: contains `## Dependencies` with `Entry condition:` AND `## Scope Fence`
|
||||
AND `## Exit Condition` sections.
|
||||
- **Ultraplan**: contains `## Implementation Plan` with numbered `### Step N:` headings
|
||||
but no `## Scope Fence`.
|
||||
|
||||
If neither structure is detected, report and stop:
|
||||
```
|
||||
Error: unrecognized file format. Expected an ultraplan or session spec.
|
||||
```
|
||||
|
||||
### Parse steps
|
||||
|
||||
Extract every `### Step N: {description}` heading (in order). For each step, extract:
|
||||
- **Files** — file paths to create or modify
|
||||
- **Changes** — what to modify
|
||||
- **Reuses** — existing code to leverage (informational)
|
||||
- **Test first** — test to run before implementation (optional)
|
||||
- **Verify** — command to run after implementation
|
||||
- **On failure** — recovery action (revert/retry/skip/escalate)
|
||||
- **Checkpoint** — git commit command after success
|
||||
|
||||
If a step is missing `On failure`, default to `escalate` and record a parse warning.
|
||||
If a step is missing `Verify`, record a parse warning.
|
||||
|
||||
### Parse session spec fields (if applicable)
|
||||
|
||||
- **Entry condition** from `## Dependencies`
|
||||
- **Touch list** and **Never-touch list** from `## Scope Fence`
|
||||
- **Exit condition** checklist from `## Exit Condition`
|
||||
|
||||
### Parse Execution Strategy (if present)
|
||||
|
||||
If the plan contains an `## Execution Strategy` section, extract:
|
||||
- Each `### Session N: {title}` with its Steps, Wave, Depends on, and Scope fence
|
||||
- The `### Execution Order` with wave definitions
|
||||
|
||||
Set **has_execution_strategy = true**.
|
||||
|
||||
Report:
|
||||
```
|
||||
Type: {plan | session-spec}
|
||||
Steps: {N}
|
||||
{if has_execution_strategy}: Execution Strategy: {S} sessions across {W} waves
|
||||
{if session spec}: Entry condition: {text}
|
||||
{if session spec}: Scope fence: {N} touch, {N} never-touch
|
||||
{if warnings}: Warnings: {list}
|
||||
```
|
||||
|
||||
## Phase 2.5 — Execution strategy decision
|
||||
|
||||
Determine how to execute this plan:
|
||||
|
||||
**Run as single session (foreground)** when ANY of these are true:
|
||||
- `--fg` flag is set
|
||||
- `--step N` mode
|
||||
- `--resume` mode
|
||||
- `--session N` mode (runs only that session's steps, foreground)
|
||||
- Plan has no `## Execution Strategy` section
|
||||
- Plan has Execution Strategy with only 1 session
|
||||
|
||||
**Run as multi-session (parallel orchestration)** when ALL of these are true:
|
||||
- mode = `execute` (default, no --fg)
|
||||
- Plan has `## Execution Strategy` with 2+ sessions
|
||||
- At least one wave has 2+ sessions (parallelism possible)
|
||||
|
||||
**Run as multi-session (sequential orchestration)** when:
|
||||
- mode = `execute` (default, no --fg)
|
||||
- Plan has `## Execution Strategy` with 2+ sessions
|
||||
- All sessions are in different waves (no parallelism, but still separate sessions)
|
||||
|
||||
For single-session: continue to Phase 3.
|
||||
For multi-session: jump to Phase 2.6.
|
||||
|
||||
Report:
|
||||
```
|
||||
Strategy: {single session | N sessions (M parallel, K sequential)}
|
||||
```
|
||||
|
||||
## Phase 2.6 — Multi-session orchestration
|
||||
|
||||
**Only runs for multi-session execution.** This phase launches headless child
|
||||
sessions and collects results. After this phase, jump directly to Phase 8
|
||||
(final report).
|
||||
|
||||
### Step 0 — Billing safety check (MANDATORY)
|
||||
|
||||
Before launching ANY `claude -p` process, check the environment:
|
||||
|
||||
```bash
|
||||
echo "${ANTHROPIC_API_KEY:+SET}"
|
||||
```
|
||||
|
||||
If the result is `SET`, **STOP** and warn the user. `claude -p` sessions with
|
||||
`ANTHROPIC_API_KEY` in the environment bill the **API account** (pay-per-token),
|
||||
not the user's Claude subscription (Max/Pro). Parallel Opus sessions can cost
|
||||
$50–100+ per run.
|
||||
|
||||
Use AskUserQuestion with these options:
|
||||
|
||||
**Question:** "ANTHROPIC_API_KEY is set in your environment. Parallel `claude -p`
|
||||
sessions will bill your API account, not your Claude subscription. How do you
|
||||
want to proceed?"
|
||||
|
||||
| Option | Description |
|
||||
|--------|-------------|
|
||||
| **Use --fg instead (Recommended)** | Run all steps sequentially in this session using your subscription. No extra cost. |
|
||||
| **Continue with API billing** | Launch parallel sessions. Each session bills your API account at token rates. |
|
||||
| **Stop** | Cancel execution. Unset ANTHROPIC_API_KEY first, then re-run. |
|
||||
|
||||
If the user chooses `--fg`: restart execution with mode = foreground (jump back
|
||||
to Phase 3, single-session).
|
||||
|
||||
If the user chooses `Continue`: proceed with Phase 2.6 Step 1.
|
||||
|
||||
If the user chooses `Stop`: report "Execution cancelled — billing safety check"
|
||||
and stop.
|
||||
|
||||
If `ANTHROPIC_API_KEY` is NOT set: proceed silently to Step 1.
|
||||
|
||||
### Step 1 — Create session log directory
|
||||
|
||||
```bash
|
||||
mkdir -p .claude/ultraplan-sessions/{slug}/logs
|
||||
```
|
||||
|
||||
### Step 2 — Execute waves
|
||||
|
||||
For each wave (in order):
|
||||
|
||||
**Launch sessions in this wave:**
|
||||
|
||||
For each session in the wave, launch a headless `claude -p` process:
|
||||
|
||||
```bash
|
||||
claude -p "/ultraexecute-local --session {N} {plan-path}" \
|
||||
> .claude/ultraplan-sessions/{slug}/logs/session-{N}.log 2>&1 &
|
||||
```
|
||||
|
||||
If the wave has only 1 session, run it without `&` (no background needed).
|
||||
|
||||
Track PIDs for parallel sessions.
|
||||
|
||||
**Wait for wave completion:**
|
||||
|
||||
```bash
|
||||
wait {PID1} {PID2} ...
|
||||
```
|
||||
|
||||
**Check results after each wave:**
|
||||
|
||||
For each session in the wave, read its log file and grep for
|
||||
`"ultraexecute_summary"`. Parse the JSON to determine:
|
||||
- Did the session complete? (`result: "completed"`)
|
||||
- Did it fail? (`result: "failed"` or `"stopped"`)
|
||||
|
||||
If ANY session in the wave failed:
|
||||
```
|
||||
Wave {W} FAILED: Session {N} failed at step {S}.
|
||||
Stopping — later waves depend on this wave.
|
||||
See log: .claude/ultraplan-sessions/{slug}/logs/session-{N}.log
|
||||
```
|
||||
Do NOT start later waves. Jump to Phase 8 with partial results.
|
||||
|
||||
If all sessions in the wave passed: continue to the next wave.
|
||||
|
||||
### Step 3 — Run master verification
|
||||
|
||||
After all waves complete successfully, run the plan's `## Verification` section
|
||||
commands to verify the integrated result.
|
||||
|
||||
### Step 4 — Aggregate results
|
||||
|
||||
Collect all session summaries into an aggregated report. Jump to Phase 8.
|
||||
|
||||
### --session N mode
|
||||
|
||||
When mode = `session N`:
|
||||
1. Find session N in the Execution Strategy
|
||||
2. Extract its step numbers (e.g., Steps: 4, 5, 6)
|
||||
3. Extract its scope fence (Touch / Never touch lists)
|
||||
4. Execute ONLY those steps, in order, using the single-session protocol (Phase 3→7)
|
||||
5. Enforce the session's scope fence as if it were a session spec's scope fence
|
||||
6. Report results for those steps only
|
||||
|
||||
This mode is used internally by Phase 2.6 when launching child sessions.
|
||||
It can also be used manually to re-run a specific session.
|
||||
|
||||
## Phase 3 — Progress file setup
|
||||
|
||||
The progress file lives at `{plan-dir}/.ultraexecute-progress-{slug}.json` where
|
||||
`{slug}` is the plan filename without extension.
|
||||
|
||||
### Progress file schema
|
||||
|
||||
```json
|
||||
{
|
||||
"schema_version": "1",
|
||||
"plan": "{path}",
|
||||
"plan_type": "{plan | session-spec}",
|
||||
"started_at": "{ISO-8601}",
|
||||
"updated_at": "{ISO-8601}",
|
||||
"mode": "{execute | resume | step}",
|
||||
"total_steps": 0,
|
||||
"current_step": 0,
|
||||
"status": "{in-progress | completed | failed | stopped}",
|
||||
"steps": {
|
||||
"1": { "status": "pending", "attempts": 0, "error": null, "completed_at": null, "commit": null }
|
||||
},
|
||||
"entry_condition_checked": false,
|
||||
"exit_condition_checked": false,
|
||||
"summary": null
|
||||
}
|
||||
```
|
||||
|
||||
### Mode-specific behavior
|
||||
|
||||
**mode = execute (fresh):**
|
||||
- If a progress file exists with status `in-progress` or `failed`: warn that
|
||||
`--resume` is available, then wait 3 seconds (`sleep 3`) and start fresh.
|
||||
This allows headless runs to proceed without blocking.
|
||||
- Otherwise: create the progress file with all steps in `pending` status.
|
||||
|
||||
**mode = resume:**
|
||||
- If no progress file exists: start from step 1 (same as fresh execute).
|
||||
- If progress file exists: find the first step with status != `passed`.
|
||||
```
|
||||
Resuming from step {N}. {M}/{total} steps already completed.
|
||||
```
|
||||
|
||||
**mode = dry-run:**
|
||||
- Do NOT create or modify the progress file.
|
||||
|
||||
**mode = step N:**
|
||||
- Create the progress file if it does not exist.
|
||||
- Only step N will be executed.
|
||||
|
||||
## Phase 4 — Entry condition check (session specs only)
|
||||
|
||||
**Skip for ultraplans.** Skip in dry-run mode (report what would be checked instead).
|
||||
|
||||
Read the entry condition. Evaluate it:
|
||||
|
||||
- `"none"` or similar → pass immediately
|
||||
- References git state (e.g., "git status clean") → run `git status --porcelain`
|
||||
- References passing tests → run the specified command
|
||||
- References a previous session → check `git log --oneline` for commit pattern
|
||||
|
||||
If the entry condition **fails**:
|
||||
```
|
||||
Entry condition FAILED: {condition text}
|
||||
Reason: {what was checked, what was found}
|
||||
Complete the prerequisite first, then re-run.
|
||||
```
|
||||
Update progress file with `status: "stopped"`. Stop execution.
|
||||
|
||||
If the entry condition **passes**:
|
||||
```
|
||||
Entry condition: PASS
|
||||
```
|
||||
Update `entry_condition_checked: true` in the progress file.
|
||||
|
||||
## Phase 5 — Dry-run report (dry-run mode only)
|
||||
|
||||
**Only runs when mode = dry-run.** Produces a validation report, then stops.
|
||||
|
||||
```
|
||||
## Dry Run Report: {filename}
|
||||
|
||||
**Type:** {plan | session-spec}
|
||||
**Steps:** {N}
|
||||
|
||||
### Step Validation
|
||||
|
||||
| Step | Description | Verify | On failure | Checkpoint | Issues |
|
||||
|------|-------------|--------|------------|------------|--------|
|
||||
| 1 | {desc} | {cmd} | {action} | {msg} | {none / missing X} |
|
||||
|
||||
### File References
|
||||
|
||||
{For each file in Files: fields, check existence with Glob}
|
||||
- {path}: EXISTS | NOT FOUND {(marked as new file) | (unexpected — may be missing)}
|
||||
|
||||
### Entry / Exit Conditions (session specs)
|
||||
|
||||
{What would be checked}
|
||||
|
||||
### Execution Preview (only when plan has Execution Strategy)
|
||||
|
||||
If `has_execution_strategy = true`, show a preview of multi-session orchestration:
|
||||
|
||||
```
|
||||
**Sessions:** {S} across {W} waves
|
||||
|
||||
| Wave | Session | Steps | Depends on | Command |
|
||||
|------|---------|-------|------------|---------|
|
||||
| 1 | Session 1: {title} | {nums} | none | `claude -p "/ultraexecute-local --session 1 {path}"` |
|
||||
| 1 | Session 2: {title} | {nums} | none | `claude -p "/ultraexecute-local --session 2 {path}"` |
|
||||
| 2 | Session 3: {title} | {nums} | S1, S2 | `claude -p "/ultraexecute-local --session 3 {path}"` |
|
||||
```
|
||||
|
||||
Check billing status via `echo "${ANTHROPIC_API_KEY:+SET}"` and report:
|
||||
```
|
||||
Billing: ANTHROPIC_API_KEY is {SET — parallel sessions will bill API account | NOT SET — sessions will use subscription}
|
||||
```
|
||||
|
||||
### Verdict
|
||||
|
||||
{READY | NEEDS ATTENTION — N issues found}
|
||||
```
|
||||
|
||||
Stop after the dry-run report. Do not execute anything.
|
||||
|
||||
## Phase 6 — Step execution loop
|
||||
|
||||
The core execution phase. Runs for modes: `execute`, `resume`, `step`.
|
||||
|
||||
### Determine starting step
|
||||
|
||||
- **execute**: step 1
|
||||
- **resume**: first step where status != `passed`
|
||||
- **step N**: step N only
|
||||
|
||||
### For each step
|
||||
|
||||
Update progress: `steps.{N}.status = "running"`, `current_step = N`, `updated_at = now`.
|
||||
|
||||
```
|
||||
--- Step {N}/{total}: {description} ---
|
||||
```
|
||||
|
||||
#### Sub-step A — Scope fence check (session specs only)
|
||||
|
||||
Before touching any file, verify that every file in the step's `Files:` field is
|
||||
in the session spec's Touch list (or is a new file to create). If ANY file is in
|
||||
the Never-touch list:
|
||||
|
||||
```
|
||||
SCOPE VIOLATION: Step {N} requires {file} which is in the never-touch list.
|
||||
Escalating — this step cannot be executed within this session's scope.
|
||||
```
|
||||
|
||||
Treat this as an automatic `escalate`. Jump to the stop-and-report logic.
|
||||
|
||||
#### Sub-step B — Test first (if present)
|
||||
|
||||
If the step has a `Test first:` field:
|
||||
1. If test file is marked `(new)`: note it will be created during implementation.
|
||||
2. If test file exists: run it. Expect failure (RED state).
|
||||
3. If test unexpectedly passes: warn but continue — step may already be done.
|
||||
|
||||
Do not block on test-first failures — they are expected.
|
||||
|
||||
#### Sub-step C — Implement changes
|
||||
|
||||
Read the step's `Files:` and `Changes:` fields. Implement exactly as described.
|
||||
|
||||
**Rules:**
|
||||
- Follow `Changes:` exactly — do not improvise, add scope, or optimize
|
||||
- Use Edit for modifications, Write for new files
|
||||
- If `Reuses:` references existing code, read that code first for context
|
||||
- Only touch files listed in `Files:` — nothing else
|
||||
|
||||
#### Sub-step D — Verification
|
||||
|
||||
Run the `Verify:` command exactly as written, via Bash.
|
||||
|
||||
**Rules:**
|
||||
- Always a fresh run — never trust prior results
|
||||
- Exit code is the authoritative truth:
|
||||
- Exit 0 + expected output (if specified) = **PASS**
|
||||
- Exit non-zero = **FAIL** regardless of output text
|
||||
- Exit 0 but wrong output = **FAIL**
|
||||
|
||||
```
|
||||
Verify: {command}
|
||||
Result: {PASS | FAIL} (exit code {N})
|
||||
{if FAIL}: Output (first 10 lines): {output}
|
||||
```
|
||||
|
||||
If **PASS**: proceed to Sub-step F (checkpoint).
|
||||
|
||||
#### Sub-step E — On failure handling
|
||||
|
||||
If **FAIL**, read the `On failure:` clause. Apply the retry cap: **maximum 2 retries**
|
||||
(3 total attempts). Track attempts in `steps.{N}.attempts`.
|
||||
|
||||
**`On failure: revert`**
|
||||
- If attempts < 3: analyze the failure, re-implement with adjustments, re-verify.
|
||||
```
|
||||
Attempt {A}/3 failed. Retrying...
|
||||
```
|
||||
- If attempts == 3: revert this step's changes:
|
||||
```bash
|
||||
git checkout -- {files from Files: field}
|
||||
```
|
||||
Record failure. **Do NOT proceed to next step.** Jump to Phase 7.
|
||||
|
||||
**`On failure: retry`**
|
||||
- If attempts < 3: use the alternative approach described in the On failure clause.
|
||||
- If attempts == 3: revert and stop. Jump to Phase 7.
|
||||
|
||||
**`On failure: skip`**
|
||||
- Mark step as skipped regardless of attempt count. Continue to next step.
|
||||
```
|
||||
Step {N}: SKIPPED (non-critical per plan)
|
||||
```
|
||||
Update `steps.{N}.status = "skipped"`.
|
||||
|
||||
**`On failure: escalate`**
|
||||
- Stop immediately regardless of attempt count.
|
||||
```
|
||||
Step {N}: ESCALATED — requires human judgment
|
||||
```
|
||||
Commit all completed work before stopping. Stage ONLY files from steps with
|
||||
`status: "passed"` in the progress file — collect their `Files:` fields. Never
|
||||
use `git add -A` (risks staging secrets, binaries, or unrelated work).
|
||||
```bash
|
||||
git add {files from passed steps' Files: fields} && git commit -m "wip: ultraexecute-local stopped at step {N} — escalation needed"
|
||||
```
|
||||
Jump to Phase 7.
|
||||
|
||||
#### Sub-step F — Checkpoint
|
||||
|
||||
Run the `Checkpoint:` git commit command exactly as written in the plan.
|
||||
|
||||
If the commit fails (nothing to commit, etc.): warn but do NOT fail the step.
|
||||
The step's verification already passed — the commit is bookkeeping.
|
||||
|
||||
```
|
||||
Step {N}: PASS (committed: {hash})
|
||||
```
|
||||
|
||||
Update progress: `steps.{N}.status = "passed"`, `steps.{N}.commit = {hash}`,
|
||||
`steps.{N}.completed_at = now`.
|
||||
|
||||
### Step mode exit
|
||||
|
||||
If mode = `step N`: after completing step N (pass or fail), skip remaining steps
|
||||
and jump to Phase 8 (final report).
|
||||
|
||||
## Phase 7 — Exit condition check (session specs only)
|
||||
|
||||
**Skip for ultraplans.** Run only when all steps passed (not on early stop).
|
||||
|
||||
Run each exit condition command from the `## Exit Condition` checklist:
|
||||
|
||||
```
|
||||
Exit condition check:
|
||||
- [ ] {command} → {PASS | FAIL}
|
||||
- [ ] {command} → {PASS | FAIL}
|
||||
```
|
||||
|
||||
If all pass: `exit_condition_checked: true` in progress file.
|
||||
If any fail: record which failed. Include in final report.
|
||||
|
||||
## Phase 8 — Final report
|
||||
|
||||
Always produce a final report.
|
||||
|
||||
Update progress file: `status` to `completed`/`failed`/`stopped`, `updated_at`, `summary`.
|
||||
|
||||
```
|
||||
## Ultraexecute Local Complete
|
||||
|
||||
**Plan:** {path}
|
||||
**Type:** {plan | session-spec}
|
||||
**Mode:** {execute | resume | step N}
|
||||
**Result:** {COMPLETED | FAILED at step N | STOPPED (escalation) | PARTIAL (N/total passed)}
|
||||
|
||||
### Step Results
|
||||
|
||||
| Step | Description | Result | Attempts | Commit |
|
||||
|------|-------------|--------|----------|--------|
|
||||
| 1 | {desc} | PASS | 1 | abc1234 |
|
||||
| 2 | {desc} | FAIL | 3 | — |
|
||||
| 3 | {desc} | — | 0 | — |
|
||||
|
||||
### Summary
|
||||
|
||||
- Passed: {N}/{total}
|
||||
- Skipped: {N}
|
||||
- Failed: {N}
|
||||
- Not reached: {N}
|
||||
|
||||
{if all passed + exit condition passed}:
|
||||
All steps completed. Exit condition: PASS.
|
||||
|
||||
{if failed/stopped}:
|
||||
### Failure Details
|
||||
|
||||
Step {N}: {description}
|
||||
On failure: {action}
|
||||
Error: {error output, first 20 lines}
|
||||
Attempts: {N}
|
||||
|
||||
### What Remains
|
||||
|
||||
{Numbered list of unexecuted steps}
|
||||
|
||||
To resume: /ultraexecute-local --resume {path}
|
||||
```
|
||||
|
||||
**JSON summary block** (always at the end, machine-parseable):
|
||||
|
||||
```json
|
||||
{
|
||||
"ultraexecute_summary": {
|
||||
"plan": "{path}",
|
||||
"plan_type": "{plan | session-spec}",
|
||||
"result": "{completed | failed | stopped | partial}",
|
||||
"steps_total": 0,
|
||||
"steps_passed": 0,
|
||||
"steps_failed": 0,
|
||||
"steps_skipped": 0,
|
||||
"steps_not_reached": 0,
|
||||
"failed_at_step": null,
|
||||
"exit_condition": "{pass | fail | skipped | n/a}",
|
||||
"progress_file": "{path}"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
The `ultraexecute_summary` key makes it grep-able in log files from headless runs.
|
||||
|
||||
## Phase 9 — Stats tracking
|
||||
|
||||
Append one record to `${CLAUDE_PLUGIN_DATA}/ultraexecute-stats.jsonl`:
|
||||
|
||||
```json
|
||||
{
|
||||
"ts": "{ISO-8601}",
|
||||
"plan": "{filename only}",
|
||||
"plan_type": "{plan | session-spec}",
|
||||
"mode": "{execute | resume | dry-run | step}",
|
||||
"result": "{completed | failed | stopped | partial}",
|
||||
"steps_total": 0,
|
||||
"steps_passed": 0,
|
||||
"steps_failed": 0,
|
||||
"steps_skipped": 0,
|
||||
"failed_at_step": null
|
||||
}
|
||||
```
|
||||
|
||||
If `${CLAUDE_PLUGIN_DATA}` is not set or not writable, skip silently.
|
||||
Never let stats failures block the workflow.
|
||||
|
||||
## Hard rules
|
||||
|
||||
1. **No AskUserQuestion for execution decisions.** All execution decisions come
|
||||
from the plan's On failure clauses. If the plan says escalate, stop and
|
||||
report — never ask. **Exception:** the billing safety check in Phase 2.6
|
||||
Step 0 MUST ask before spending money on the user's API account.
|
||||
|
||||
2. **No scope creep.** Only touch files listed in the step's `Files:` field.
|
||||
If a file outside the list seems to need changing, record it as a finding
|
||||
in the final report — do not touch it.
|
||||
|
||||
3. **Exit code is truth.** The Verify command's exit code is authoritative.
|
||||
Non-zero = FAIL regardless of output. Zero with wrong output = FAIL.
|
||||
|
||||
4. **Fresh verification.** Re-run the Verify command from scratch every time.
|
||||
Never trust cached or prior results.
|
||||
|
||||
5. **Retry cap = 3 attempts.** Initial + 2 retries, then stop. Never loop forever.
|
||||
|
||||
6. **Never corrupt completed work.** Only revert files from the failing step.
|
||||
Never touch files from earlier passed steps.
|
||||
|
||||
7. **Checkpoint discipline.** Run the Checkpoint commit exactly as written.
|
||||
Do not combine, reorder, or skip checkpoints on passed steps.
|
||||
|
||||
8. **Scope fence enforcement.** For session specs: never modify files in the
|
||||
Never-touch list, regardless of what the Changes field says.
|
||||
|
||||
9. **Progress file is ground truth.** Resume uses the progress file, not git log.
|
||||
|
||||
10. **No sub-agents.** The executor reads and implements directly.
|
||||
No Agent tool, no TeamCreate, no delegation.
|
||||
685
plugins/ultraplan-local/commands/ultraplan-local.md
Normal file
685
plugins/ultraplan-local/commands/ultraplan-local.md
Normal file
|
|
@ -0,0 +1,685 @@
|
|||
---
|
||||
name: ultraplan-local
|
||||
description: Deep implementation planning with interview, parallel specialized agents, external research, and optional background execution
|
||||
argument-hint: "[--spec spec.md | --fg] <task description>"
|
||||
model: opus
|
||||
allowed-tools: Agent, Read, Glob, Grep, Write, Edit, Bash, AskUserQuestion, TaskCreate, TaskUpdate, TeamCreate, TeamDelete
|
||||
---
|
||||
|
||||
# Ultraplan Local v1.0
|
||||
|
||||
Deep, multi-phase implementation planning. Uses an interview to gather requirements,
|
||||
adaptive specialized agent swarms for exploration, external research for unfamiliar
|
||||
technologies, and adversarial review to stress-test the plan.
|
||||
|
||||
## Phase 1 — Parse mode and validate input
|
||||
|
||||
Parse `$ARGUMENTS` for mode flags:
|
||||
|
||||
1. If arguments start with `--spec `: extract the file path after `--spec`.
|
||||
Set **mode = spec-driven**. Read the spec file. If it does not exist, report
|
||||
the error and stop.
|
||||
|
||||
2. If arguments start with `--fg `: extract the task description after `--fg`.
|
||||
Set **mode = foreground**.
|
||||
|
||||
3. If arguments start with `--quick `: extract the task description after `--quick`.
|
||||
Set **mode = quick**.
|
||||
|
||||
4. If arguments start with `--export `: extract the remainder as `{format} {plan-path}`.
|
||||
Split on the first space: format is the first token, plan path is the rest.
|
||||
Valid formats: `pr`, `issue`, `markdown`, `headless`.
|
||||
Set **mode = export**.
|
||||
|
||||
If the format is not one of pr/issue/markdown/headless, report and stop:
|
||||
```
|
||||
Error: unknown export format '{format}'. Valid: pr, issue, markdown, headless
|
||||
```
|
||||
|
||||
If the plan file does not exist, report and stop:
|
||||
```
|
||||
Error: plan file not found: {path}
|
||||
```
|
||||
|
||||
5. If arguments start with `--decompose `: extract the plan file path after `--decompose`.
|
||||
Set **mode = decompose**.
|
||||
|
||||
If the plan file does not exist, report and stop:
|
||||
```
|
||||
Error: plan file not found: {path}
|
||||
```
|
||||
|
||||
6. Otherwise: the entire argument string is the task description.
|
||||
Set **mode = default**.
|
||||
|
||||
If no task description and no spec file, output usage and stop:
|
||||
|
||||
```
|
||||
Usage: /ultraplan-local <task description>
|
||||
/ultraplan-local --spec <path-to-spec.md>
|
||||
/ultraplan-local --fg <task description>
|
||||
/ultraplan-local --quick <task description>
|
||||
/ultraplan-local --export <pr|issue|markdown|headless> <plan-path>
|
||||
/ultraplan-local --decompose <plan-path>
|
||||
|
||||
Modes:
|
||||
default Interview (interactive) → background planning → notify when done
|
||||
--spec Skip interview, use provided spec → background planning
|
||||
--fg All phases in foreground (blocks session)
|
||||
--quick Interview → plan directly (no agent swarm) → adversarial review
|
||||
--export Generate shareable output from an existing plan (no new planning)
|
||||
--decompose Split an existing plan into self-contained headless sessions
|
||||
|
||||
Examples:
|
||||
/ultraplan-local Add user authentication with JWT tokens
|
||||
/ultraplan-local --spec .claude/ultraplan-spec-2026-04-05-jwt-auth.md
|
||||
/ultraplan-local --fg Refactor the database layer to use connection pooling
|
||||
/ultraplan-local --quick Add rate limiting to the API
|
||||
/ultraplan-local --export pr .claude/plans/ultraplan-2026-04-06-rate-limiting.md
|
||||
/ultraplan-local --export headless .claude/plans/ultraplan-2026-04-06-rate-limiting.md
|
||||
/ultraplan-local --decompose .claude/plans/ultraplan-2026-04-06-rate-limiting.md
|
||||
```
|
||||
|
||||
Do not continue past this step if no task was provided.
|
||||
|
||||
Report the detected mode to the user:
|
||||
```
|
||||
Mode: {default | spec-driven | foreground}
|
||||
Task: {task description or "from spec: {path}"}
|
||||
```
|
||||
|
||||
## Phase 1.5 — Export (runs only when mode = export)
|
||||
|
||||
**Skip this phase entirely unless mode = export.**
|
||||
|
||||
Read the plan file. Extract these sections from the plan content:
|
||||
- Task description (from Context section)
|
||||
- Implementation steps (from Implementation Plan section)
|
||||
- Risks (from Risks and Mitigations section)
|
||||
- Test strategy (from Test Strategy section, if present)
|
||||
- Scope estimate (from Estimated Scope section)
|
||||
|
||||
### Format: `pr`
|
||||
|
||||
Output a markdown block formatted as a PR description:
|
||||
|
||||
```
|
||||
## Summary
|
||||
|
||||
{2–3 sentence summary of what this change does and why}
|
||||
|
||||
## Changes
|
||||
|
||||
{Bulleted list of implementation steps, one line each}
|
||||
|
||||
## Test plan
|
||||
|
||||
{Bulleted checklist from test strategy, formatted as - [ ] items}
|
||||
|
||||
## Risks
|
||||
|
||||
{Risks from plan, abbreviated to 1 line each}
|
||||
|
||||
---
|
||||
*Generated by ultraplan-local from {plan filename}*
|
||||
```
|
||||
|
||||
### Format: `issue`
|
||||
|
||||
Output a markdown block formatted as an issue comment:
|
||||
|
||||
```
|
||||
## Implementation plan summary
|
||||
|
||||
**Task:** {task description}
|
||||
**Plan file:** {plan path}
|
||||
**Scope:** {N files, complexity}
|
||||
|
||||
### Proposed approach
|
||||
{3–5 bullet points from key implementation steps}
|
||||
|
||||
### Open questions / risks
|
||||
{Top 2–3 risks from plan}
|
||||
|
||||
---
|
||||
*Generated by ultraplan-local*
|
||||
```
|
||||
|
||||
### Format: `markdown`
|
||||
|
||||
Output the plan content with internal metadata stripped:
|
||||
- Remove the "Revisions" section
|
||||
- Remove plan-critic and scope-guardian scores/verdicts
|
||||
- Remove `[ASSUMPTION]` markers (but keep the surrounding sentence)
|
||||
- Keep everything else verbatim
|
||||
|
||||
### Format: `headless`
|
||||
|
||||
This is a shortcut for `--decompose`. It runs the full session decomposition
|
||||
pipeline and is equivalent to `--decompose {plan-path}`. Proceed to
|
||||
Phase 1.6 (Decompose) below.
|
||||
|
||||
---
|
||||
|
||||
After outputting the formatted block (for pr/issue/markdown), say:
|
||||
```
|
||||
Export complete ({format}). Copy the block above.
|
||||
```
|
||||
|
||||
Then **stop**. Do not continue to Phase 2 or any subsequent phase.
|
||||
|
||||
## Phase 1.6 — Decompose (runs only when mode = decompose or export headless)
|
||||
|
||||
**Skip this phase entirely unless mode = decompose or export format = headless.**
|
||||
|
||||
Read the plan file. Verify it contains an Implementation Plan section with
|
||||
numbered steps. If no steps are found, report and stop:
|
||||
```
|
||||
Error: plan has no implementation steps. Run /ultraplan-local first to generate a plan.
|
||||
```
|
||||
|
||||
Determine the output directory from the plan slug:
|
||||
- Extract the slug from the plan filename (e.g., `ultraplan-2026-04-06-auth-refactor` → `auth-refactor`)
|
||||
- Output directory: `.claude/ultraplan-sessions/{slug}/`
|
||||
|
||||
Launch the **session-decomposer** agent:
|
||||
|
||||
```
|
||||
Plan file: {plan path}
|
||||
Plugin root: ${CLAUDE_PLUGIN_ROOT}
|
||||
Output directory: .claude/ultraplan-sessions/{slug}/
|
||||
```
|
||||
|
||||
The session-decomposer will:
|
||||
1. Parse the plan's steps and their file dependencies
|
||||
2. Build a dependency graph between steps
|
||||
3. Group steps into sessions of 3–5 steps each
|
||||
4. Identify which sessions can run in parallel (waves)
|
||||
5. Generate one session spec file per session
|
||||
6. Generate a dependency diagram (mermaid)
|
||||
7. Generate a launch script (`launch.sh`)
|
||||
|
||||
When the session-decomposer completes, present the summary to the user:
|
||||
|
||||
```
|
||||
## Decomposition Complete
|
||||
|
||||
**Master plan:** {plan path}
|
||||
**Sessions:** {N} across {W} waves
|
||||
**Output:** .claude/ultraplan-sessions/{slug}/
|
||||
|
||||
### Sessions
|
||||
|
||||
| # | Title | Steps | Wave | Parallel |
|
||||
|---|-------|-------|------|----------|
|
||||
{session table from decomposer}
|
||||
|
||||
### Files generated
|
||||
|
||||
- Session specs: .claude/ultraplan-sessions/{slug}/session-*.md
|
||||
- Dependency graph: .claude/ultraplan-sessions/{slug}/dependency-graph.md
|
||||
- Launch script: .claude/ultraplan-sessions/{slug}/launch.sh
|
||||
|
||||
You can:
|
||||
- Review individual session specs before running
|
||||
- Run all sessions: `bash .claude/ultraplan-sessions/{slug}/launch.sh`
|
||||
- Run a single session: `claude -p "$(cat .claude/ultraplan-sessions/{slug}/session-1-*.md)"`
|
||||
- Say **"launch"** to start headless execution from here
|
||||
```
|
||||
|
||||
If the user says **"launch"**: run the launch script via Bash.
|
||||
|
||||
Then **stop**. Do not continue to Phase 2 or any subsequent phase.
|
||||
|
||||
## Phase 2 — Requirements gathering (interview)
|
||||
|
||||
**Skip this phase entirely if mode = spec-driven.** Proceed to Phase 3.
|
||||
|
||||
Use `AskUserQuestion` to interview the user about the task. Ask **one question at
|
||||
a time** — never dump all questions at once. Follow up based on answers.
|
||||
|
||||
### Interview flow
|
||||
|
||||
**Start with the most important question:**
|
||||
> What is the goal of this task? What does success look like?
|
||||
|
||||
**Then ask follow-ups based on the answer. Choose from these topics:**
|
||||
- What is explicitly NOT in scope? (non-goals)
|
||||
- Are there technical constraints? (specific versions, compatibility, no new dependencies)
|
||||
- Do you have preferences? (library X over Y, specific patterns, architectural style)
|
||||
- Are there non-functional requirements? (performance targets, security needs, accessibility)
|
||||
- Has anything been tried before? What worked or failed?
|
||||
|
||||
**Rules:**
|
||||
- Ask 3–5 questions for typical tasks. Maximum 8 for complex tasks.
|
||||
- If the user says "skip", "proceed", "just plan it", or similar — stop interviewing
|
||||
immediately. Write a minimal spec from the task description alone.
|
||||
- Adapt your questions to what the user tells you. If they give a detailed task
|
||||
description, skip obvious questions.
|
||||
- Never ask about things you can discover from the codebase.
|
||||
|
||||
### Adaptive depth
|
||||
|
||||
After each answer, assess the response length and vocabulary:
|
||||
|
||||
- **Detailed answer** (2+ sentences, technical terminology, specific examples):
|
||||
- Treat the user as senior — they know the codebase
|
||||
- Skip obvious follow-ups they already answered
|
||||
- Ask more targeted questions: constraints, edge cases, specific technical choices
|
||||
- Reduce question count: aim for 3–4 total instead of 5
|
||||
|
||||
- **Short or uncertain answer** (1 sentence or less, "I don't know", "not sure", vague):
|
||||
- Treat the user as unfamiliar with the problem space
|
||||
- Simplify follow-up questions — avoid open-ended technical questions
|
||||
- Offer alternatives instead of asking open questions:
|
||||
> "Should this be synchronous or asynchronous? (synchronous is simpler; async handles more concurrent users)"
|
||||
- For bugs: focus on reproduction before requirements:
|
||||
> "What do you see? What did you expect to see?"
|
||||
- Allow "I don't know" as a valid answer — record it as an open assumption in the spec
|
||||
|
||||
Never change your question count based on impatience. Only change depth based
|
||||
on answer quality.
|
||||
|
||||
### Write the spec file
|
||||
|
||||
After gathering requirements, read the spec template:
|
||||
@${CLAUDE_PLUGIN_ROOT}/templates/spec-template.md
|
||||
|
||||
Generate a slug from the task (first 3-4 meaningful words, lowercase, hyphens).
|
||||
Write the spec to: `.claude/ultraplan-spec-{YYYY-MM-DD}-{slug}.md`
|
||||
|
||||
Create the `.claude/` directory if it does not exist.
|
||||
|
||||
Fill in all sections based on interview answers. Mark unanswered sections with
|
||||
"Not discussed — no constraints assumed."
|
||||
|
||||
Tell the user:
|
||||
```
|
||||
Spec saved: .claude/ultraplan-spec-{date}-{slug}.md
|
||||
```
|
||||
|
||||
## Phase 3 — Background transition
|
||||
|
||||
**If mode = foreground or quick:** Skip this phase. Continue to Phase 4 inline.
|
||||
|
||||
**If mode = default or spec-driven:**
|
||||
|
||||
Launch the **planning-orchestrator** agent with this prompt:
|
||||
|
||||
```
|
||||
Spec file: {spec path}
|
||||
Task: {task description}
|
||||
Mode: {default | spec | quick}
|
||||
Plan destination: .claude/plans/ultraplan-{YYYY-MM-DD}-{slug}.md
|
||||
Plugin root: ${CLAUDE_PLUGIN_ROOT}
|
||||
|
||||
Read the spec file and execute your full planning workflow.
|
||||
Write the plan to the destination path.
|
||||
```
|
||||
|
||||
Launch the planning-orchestrator via the Agent tool with `run_in_background: true`.
|
||||
The agent runs autonomously while you continue working — you will be notified
|
||||
when the plan is ready.
|
||||
|
||||
Then output to the user and **stop your response**:
|
||||
```
|
||||
Background planning started via planning-orchestrator.
|
||||
|
||||
Spec: .claude/ultraplan-spec-{date}-{slug}.md
|
||||
Plan: .claude/plans/ultraplan-{date}-{slug}.md
|
||||
|
||||
You will be notified when the plan is ready.
|
||||
You can continue working on other tasks in the meantime.
|
||||
```
|
||||
|
||||
Do not wait for the orchestrator. Do not continue to Phase 4.
|
||||
The planning-orchestrator handles Phases 4 through 10 autonomously.
|
||||
|
||||
---
|
||||
|
||||
**Everything below this line runs either in foreground mode or inside the
|
||||
background agent. The instructions are identical regardless of context.**
|
||||
|
||||
---
|
||||
|
||||
## Phase 4 — Codebase sizing
|
||||
|
||||
Determine codebase scale to calibrate agent turns (not agent count).
|
||||
|
||||
Run via Bash:
|
||||
```
|
||||
find . -type f \( -name "*.ts" -o -name "*.tsx" -o -name "*.js" -o -name "*.jsx" -o -name "*.py" -o -name "*.go" -o -name "*.rs" -o -name "*.java" -o -name "*.rb" -o -name "*.c" -o -name "*.cpp" -o -name "*.h" -o -name "*.cs" -o -name "*.swift" -o -name "*.kt" -o -name "*.sh" -o -name "*.md" \) -not -path "*/node_modules/*" -not -path "*/.git/*" -not -path "*/vendor/*" -not -path "*/dist/*" -not -path "*/build/*" | wc -l
|
||||
```
|
||||
|
||||
Classify:
|
||||
- **Small** (< 50 files)
|
||||
- **Medium** (50–500 files)
|
||||
- **Large** (> 500 files)
|
||||
|
||||
Report:
|
||||
```
|
||||
Codebase: {N} source files ({scale}). Deploying exploration agents.
|
||||
```
|
||||
|
||||
## Phase 4b — Spec review
|
||||
|
||||
Launch the **spec-reviewer** agent:
|
||||
Prompt: "Review this spec for quality: {spec path}. Check completeness, consistency,
|
||||
testability, and scope clarity."
|
||||
|
||||
Handle the verdict:
|
||||
- **PROCEED** — continue to Phase 5.
|
||||
- **PROCEED_WITH_RISKS** — continue, carry flagged risks as `[ASSUMPTION]` in the plan.
|
||||
- **REVISE** — in foreground mode, present findings and ask the user for clarification.
|
||||
In background mode, carry all findings as `[ASSUMPTION]` entries.
|
||||
|
||||
## Phase 5 — Parallel exploration (specialized agents + research)
|
||||
|
||||
**If mode = quick:** Do NOT launch any exploration agents. Instead, run a
|
||||
lightweight file check:
|
||||
- `Glob` for files matching key terms from the task description (up to 3 patterns)
|
||||
- `Grep` for function/type definitions matching key terms (up to 3 patterns)
|
||||
|
||||
Report findings as:
|
||||
```
|
||||
Quick scan: {N} potentially relevant files found via Glob/Grep.
|
||||
No agent swarm — proceeding directly to planning.
|
||||
```
|
||||
|
||||
Then skip Phase 6 (deep-dives) and proceed to Phase 7 (Synthesis) with only
|
||||
the quick-scan results.
|
||||
|
||||
---
|
||||
|
||||
**All other modes:** Launch exploration agents **in parallel** (all in a single
|
||||
message). Use the specialized agents from the `agents/` directory.
|
||||
|
||||
**All agents run for all codebase sizes.** Scale `maxTurns` by size (small: halved,
|
||||
medium: default, large: default) instead of dropping agents.
|
||||
|
||||
| Agent | Small | Medium | Large | Purpose |
|
||||
|-------|-------|--------|-------|---------|
|
||||
| `architecture-mapper` | Yes | Yes | Yes | Codebase structure, patterns, anti-patterns |
|
||||
| `dependency-tracer` | Yes | Yes | Yes | Module connections, data flow, side effects |
|
||||
| `risk-assessor` | Yes | Yes | Yes | Risks, edge cases, failure modes |
|
||||
| `task-finder` | Yes | Yes | Yes | Task-relevant files, functions, types, reuse candidates |
|
||||
| `test-strategist` | Yes | Yes | Yes | Test patterns, coverage gaps, strategy |
|
||||
| `git-historian` | Yes | Yes | Yes | Recent changes, ownership, hot files, active branches |
|
||||
| `research-scout` | Conditional | Conditional | Conditional | External docs (only when unfamiliar tech detected) |
|
||||
| `convention-scanner` | No | Yes | Yes | Coding conventions, naming, style, test patterns |
|
||||
|
||||
### Always launch (all codebase sizes):
|
||||
|
||||
**architecture-mapper** — full codebase structure, tech stack, patterns, anti-patterns.
|
||||
Prompt: "Analyze the architecture of this codebase. The task being planned is: {task}"
|
||||
|
||||
**dependency-tracer** — module connections, data flow, side effects for task-relevant code.
|
||||
Prompt: "Trace dependencies and data flow relevant to this task: {task}. Focus on modules
|
||||
that will be affected by the implementation."
|
||||
|
||||
**risk-assessor** — risks, edge cases, failure modes, technical debt near task area.
|
||||
Prompt: "Assess risks and failure modes for implementing this task: {task}. Check for
|
||||
complexity hotspots, security boundaries, and technical debt in the relevant code."
|
||||
|
||||
**task-finder** — all files, functions, types, and interfaces directly related to the task.
|
||||
Prompt: "Find all code relevant to this task: {task}. Include existing implementations
|
||||
that solve similar problems, API boundaries, database models, configuration files.
|
||||
Report file paths and line numbers for every finding."
|
||||
|
||||
**test-strategist** — existing test patterns, coverage gaps, test strategy.
|
||||
Prompt: "Analyze the test infrastructure and design a test strategy for this task: {task}.
|
||||
Discover existing patterns and identify coverage gaps."
|
||||
|
||||
**git-historian** — recent changes, code ownership, hot files, active branches.
|
||||
Prompt: "Analyze git history relevant to this task: {task}. Report recent changes,
|
||||
ownership, hot files, and active branches that may affect planning."
|
||||
|
||||
### Launch for medium+ codebases (50+ files):
|
||||
|
||||
**Convention Scanner** — use the `convention-scanner` plugin agent (model: "sonnet")
|
||||
for medium+ codebases only.
|
||||
Provide concrete examples from the codebase, not generic advice."
|
||||
|
||||
### Conditional: External research
|
||||
|
||||
After reading the task description and spec (if available), determine if the task
|
||||
involves technologies, APIs, or libraries that are:
|
||||
- Not clearly present in the codebase
|
||||
- Being upgraded to a new major version
|
||||
- Being used in an unfamiliar way
|
||||
|
||||
If yes: launch **research-scout** in parallel with the other agents.
|
||||
Prompt: "Research the following technologies for this task: {task}.
|
||||
Specific questions: {list specific questions about the technology}.
|
||||
Technologies to research: {list}."
|
||||
|
||||
If no external technology is involved: skip research-scout and note:
|
||||
"No external research needed — all technologies are well-represented in the codebase."
|
||||
|
||||
## Phase 6 — Targeted deep-dives
|
||||
|
||||
After all Phase 5 agents complete, review their results and identify **knowledge gaps**
|
||||
— areas where exploration was too shallow to plan confidently.
|
||||
|
||||
Common reasons for deep-dives:
|
||||
- A critical function was found but its implementation details are unclear
|
||||
- A dependency chain needs tracing to understand side effects
|
||||
- A test pattern was identified but the test infrastructure needs more detail
|
||||
- A risk was flagged but the actual impact needs verification
|
||||
|
||||
For each significant gap, spawn a targeted deep-dive agent (model: "sonnet",
|
||||
subagent_type: "Explore") with a narrow, specific brief.
|
||||
|
||||
Launch up to 3 deep-dive agents in parallel. If no gaps exist, skip this phase
|
||||
and note: "Initial exploration was sufficient — no deep-dives needed."
|
||||
|
||||
## Phase 7 — Synthesis
|
||||
|
||||
After all agents complete (initial + deep-dives + research), synthesize:
|
||||
|
||||
1. Read all agent results carefully
|
||||
2. Identify overlaps and contradictions between agents
|
||||
3. Build a mental model of the codebase architecture
|
||||
4. Catalog reusable code: existing functions, utilities, patterns
|
||||
5. Integrate research findings with codebase analysis
|
||||
6. Note remaining gaps — things you cannot determine from code or research
|
||||
(these become assumptions in the plan, marked explicitly)
|
||||
7. For each finding, track whether it came from **codebase analysis** or
|
||||
**external research** — the plan must distinguish these sources
|
||||
|
||||
Do NOT write this synthesis to disk. It is internal working context only.
|
||||
|
||||
## Phase 8 — Deep planning
|
||||
|
||||
Read the spec file (from Phase 2 or provided via --spec).
|
||||
Read the plan template: @${CLAUDE_PLUGIN_ROOT}/templates/plan-template.md
|
||||
|
||||
Write the plan following the template structure. The plan MUST include:
|
||||
|
||||
### Required sections
|
||||
|
||||
1. **Context** — Why this change is needed. Reference the spec's goal and constraints.
|
||||
2. **Codebase Analysis** — Tech stack, patterns, relevant files, reusable code,
|
||||
external tech researched. Every file path must be real (verified during exploration).
|
||||
3. **Research Sources** — If research-scout was used: table of technologies, sources,
|
||||
findings, and confidence levels. Omit if no research was conducted.
|
||||
4. **Implementation Plan** — Ordered steps. Each step specifies:
|
||||
- Exact files to modify or create (with paths)
|
||||
- What changes to make and why
|
||||
- Which existing code to reuse
|
||||
- Dependencies on other steps
|
||||
- Whether the step is based on codebase analysis or external research
|
||||
- **On failure:** — recovery action (revert/retry/skip/escalate)
|
||||
- **Checkpoint:** — git commit command after success
|
||||
10. **Execution Strategy** — For plans with > 5 steps: group steps into sessions
|
||||
(3–5 steps each), organize sessions into waves (parallel where independent),
|
||||
specify scope fences per session. Omit for plans with ≤ 5 steps.
|
||||
5. **Alternatives Considered** — At least one alternative approach with
|
||||
pros/cons and reason for rejection.
|
||||
6. **Risks and Mitigations** — From the risk-assessor findings. What could go
|
||||
wrong and how to handle it.
|
||||
7. **Test Strategy** — From the test-strategist findings (if available).
|
||||
What tests to write and which patterns to follow.
|
||||
8. **Verification** — Testable criteria. Not "check that it works" but
|
||||
specific commands to run and expected outputs.
|
||||
9. **Estimated Scope** — File counts and complexity rating.
|
||||
|
||||
### Quality standards
|
||||
|
||||
- Every file path in the plan must exist in the codebase (or be explicitly
|
||||
marked as "new file to create")
|
||||
- Every "reuses" reference must point to a real function/pattern found during
|
||||
exploration
|
||||
- Steps must be ordered by dependency (not by file path or importance)
|
||||
- Verification criteria must be concrete and executable
|
||||
- The plan must be implementable by someone who has not seen the exploration
|
||||
results — it must stand on its own
|
||||
- Research-based decisions must cite their source
|
||||
|
||||
### Write the plan
|
||||
|
||||
Generate the slug from the task description (or reuse the spec slug).
|
||||
Write the plan to: `.claude/plans/ultraplan-{YYYY-MM-DD}-{slug}.md`
|
||||
Create the `.claude/plans/` directory if it does not exist.
|
||||
|
||||
## Phase 9 — Adversarial review
|
||||
|
||||
Launch two review agents **in parallel**:
|
||||
|
||||
**plan-critic** — adversarial review of the plan.
|
||||
Prompt: "Review this implementation plan for the task: {task}.
|
||||
Plan file: {plan path}. Read it and find every problem — missing steps,
|
||||
wrong ordering, fragile assumptions, missing error handling, scope creep,
|
||||
underspecified steps. Rate each finding as blocker, major, or minor."
|
||||
|
||||
**scope-guardian** — scope alignment check.
|
||||
Prompt: "Check this implementation plan against the requirements.
|
||||
Task: {task}. Spec file: {spec path}. Plan file: {plan path}.
|
||||
Find scope creep (plan does more than asked) and scope gaps (plan misses
|
||||
requirements). Check that referenced files and functions exist."
|
||||
|
||||
After both complete:
|
||||
- If **blockers** are found: revise the plan to address them. Add a "Revisions"
|
||||
note at the bottom of the plan listing what changed and why.
|
||||
- If only **major** issues: revise to address them. Add revisions note.
|
||||
- If only **minor** issues or clean: proceed without changes. Note the
|
||||
review result in the plan.
|
||||
|
||||
## Phase 10 — Present and refine
|
||||
|
||||
Present a summary to the user:
|
||||
|
||||
```
|
||||
## Ultraplan Complete
|
||||
|
||||
**Task:** {task description}
|
||||
**Mode:** {default | spec-driven | foreground}
|
||||
**Spec:** {spec file path, or "none (foreground mode)"}
|
||||
**Plan:** .claude/plans/ultraplan-{date}-{slug}.md
|
||||
**Exploration:** {N} agents deployed ({N} specialized + {N} deep-dives + {research status})
|
||||
**Scope:** {N} files to modify, {N} to create — {complexity}
|
||||
|
||||
### Key decisions
|
||||
- {Decision 1 and rationale}
|
||||
- {Decision 2 and rationale}
|
||||
|
||||
### Implementation steps ({N} total)
|
||||
1. {Step 1 summary}
|
||||
2. {Step 2 summary}
|
||||
...
|
||||
|
||||
### Research findings
|
||||
{Summary of external research, or "No external research conducted."}
|
||||
|
||||
### Adversarial review
|
||||
**Plan critic:** {Summary — blockers/majors/minors found, how addressed}
|
||||
**Scope guardian:** {Summary — creep/gaps found, how addressed}
|
||||
|
||||
You can:
|
||||
- Ask questions or request changes to refine the plan
|
||||
- Say **"execute"** to start implementing
|
||||
- Say **"execute with team"** to implement with parallel Agent Team (if eligible)
|
||||
- Say **"save"** to keep the plan for later
|
||||
```
|
||||
|
||||
If the user asks questions or requests changes:
|
||||
- Update the plan file in-place
|
||||
- Show what changed
|
||||
- Re-present the summary
|
||||
|
||||
## Phase 11 — Handoff
|
||||
|
||||
### "save" / "later" / "done"
|
||||
|
||||
Confirm the plan and spec file locations and exit.
|
||||
|
||||
### "execute" / "go" / "start"
|
||||
|
||||
Begin implementing the plan step by step in this session. Follow the plan exactly.
|
||||
Mark each step complete as you go.
|
||||
|
||||
### "execute with team" / "team"
|
||||
|
||||
Before creating a team, verify eligibility:
|
||||
1. Count implementation steps that are **independent** (no dependency on each other)
|
||||
AND touch **different files/modules**
|
||||
2. If fewer than 3 independent steps: inform the user and fall back to sequential
|
||||
execution. "The plan has fewer than 3 independent steps — sequential execution
|
||||
is more efficient."
|
||||
|
||||
If eligible:
|
||||
1. Present the proposed team split: which steps go to which team member
|
||||
2. Ask for confirmation: "Create Agent Team with {N} members? (yes/no)"
|
||||
3. If confirmed: create the team with `TeamCreate`, assign step clusters to
|
||||
each member. Use `isolation: "worktree"` on each team member agent so they
|
||||
work in isolated git worktrees — this prevents file conflicts during parallel
|
||||
implementation. Coordinate execution and clean up with `TeamDelete` when done.
|
||||
4. If `TeamCreate` fails (tool not available): fall back to sequential execution
|
||||
and notify the user
|
||||
|
||||
## Phase 12 — Session tracking
|
||||
|
||||
After the plan is presented (Phase 10) or after handoff (Phase 11), write a
|
||||
session record to `${CLAUDE_PLUGIN_DATA}/ultraplan-stats.jsonl` (create the file
|
||||
if it does not exist).
|
||||
|
||||
Record format (one JSON line):
|
||||
```json
|
||||
{
|
||||
"ts": "{ISO-8601 timestamp}",
|
||||
"task": "{task description (first 100 chars)}",
|
||||
"mode": "{default|spec|fg}",
|
||||
"slug": "{plan slug}",
|
||||
"codebase_size": "{small|medium|large}",
|
||||
"codebase_files": {N},
|
||||
"agents_deployed": {N},
|
||||
"deep_dives": {N},
|
||||
"research": {true|false},
|
||||
"critic_verdict": "{BLOCK|REVISE|PASS}",
|
||||
"guardian_verdict": "{ALIGNED|CREEP|GAP|MIXED}",
|
||||
"outcome": "{execute|execute_team|save|refine}"
|
||||
}
|
||||
```
|
||||
|
||||
If `${CLAUDE_PLUGIN_DATA}` is not set or not writable, skip tracking silently.
|
||||
Never let tracking failures block the main workflow.
|
||||
|
||||
## Hard rules
|
||||
|
||||
- **Scope**: Only explore the current working directory and its subdirectories.
|
||||
Never read files outside the repo (no ~/.env, no credentials, no other repos).
|
||||
- **Cost**: Sonnet for all agents (exploration, deep-dives, research, critics).
|
||||
Opus only runs in the main thread for synthesis and planning.
|
||||
- **Privacy**: Never log, store, or repeat file contents that look like
|
||||
secrets, tokens, or credentials. Never log prompt text.
|
||||
- **No premature execution**: Do not modify any project files until the user
|
||||
explicitly approves the plan.
|
||||
- **Plan stands alone**: The plan file must be understandable without access
|
||||
to the exploration results. Include all necessary context.
|
||||
- **Honesty**: If exploration reveals the task is trivial (single file, obvious
|
||||
change), say so. Do not inflate the plan to justify the process. Suggest
|
||||
the user just implements it directly.
|
||||
- **Adaptive**: Never spawn more agents than the codebase warrants. A 10-file
|
||||
project does not need 7 exploration agents. Scale down.
|
||||
- **Research transparency**: Always distinguish codebase-derived decisions from
|
||||
research-derived decisions in the plan.
|
||||
Loading…
Add table
Add a link
Reference in a new issue