ktg-plugin-marketplace/plugins/ultraplan-local/commands/ultraexecute-local.md

647 lines
20 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

---
name: ultraexecute-local
description: Disciplined plan executor — single-session or multi-session with parallel orchestration, failure recovery, and headless support
argument-hint: "[--fg | --resume | --dry-run | --step N | --session N] <plan.md>"
model: opus
allowed-tools: Read, Write, Edit, Bash, Glob, Grep, AskUserQuestion
---
# Ultraexecute Local
Disciplined executor for ultraplan plans. Reads a plan file, detects if it has
an Execution Strategy (multi-session), and either executes directly or
orchestrates parallel headless sessions — all to realize one plan.
Designed to work identically in interactive and headless (`claude -p`) mode.
## Phase 1 — Parse mode and validate input
Parse `$ARGUMENTS` for mode flags:
1. If arguments contain `--fg`: extract the file path. Set **mode = foreground**.
2. If arguments contain `--resume`: extract the file path. Set **mode = resume**.
3. If arguments contain `--dry-run`: extract the file path. Set **mode = dry-run**.
4. If arguments contain `--step N` (N is a positive integer): extract N and the file path.
Set **mode = step**, **target-step = N**.
5. If arguments contain `--session N` (N is a positive integer): extract N and the file path.
Set **mode = session**, **target-session = N**.
6. Otherwise: the entire argument string is the file path. Set **mode = execute**.
If no path is provided, output usage and stop:
```
Usage: /ultraexecute-local <plan.md>
/ultraexecute-local --fg <plan.md>
/ultraexecute-local --resume <plan.md>
/ultraexecute-local --dry-run <plan.md>
/ultraexecute-local --step N <plan.md>
/ultraexecute-local --session N <plan.md>
Modes:
(default) Auto — multi-session if plan has Execution Strategy, else foreground
--fg Force foreground — all steps sequentially, ignore Execution Strategy
--resume Resume from last progress checkpoint
--dry-run Validate plan and show execution strategy without running
--step N Execute only step N (foreground)
--session N Execute only session N from the plan's Execution Strategy
Examples:
/ultraexecute-local .claude/plans/ultraplan-2026-04-06-auth-refactor.md
/ultraexecute-local --fg .claude/plans/ultraplan-2026-04-06-auth-refactor.md
/ultraexecute-local --session 2 .claude/plans/ultraplan-2026-04-06-auth-refactor.md
/ultraexecute-local --dry-run .claude/plans/ultraplan-2026-04-06-auth-refactor.md
```
If the file does not exist, report and stop:
```
Error: file not found: {path}
```
Report detected mode:
```
Mode: {execute | resume | dry-run | step N}
File: {path}
```
## Phase 2 — Detect file type and parse structure
Read the file. Determine whether it is an **ultraplan** or a **session spec**:
- **Session spec**: contains `## Dependencies` with `Entry condition:` AND `## Scope Fence`
AND `## Exit Condition` sections.
- **Ultraplan**: contains `## Implementation Plan` with numbered `### Step N:` headings
but no `## Scope Fence`.
If neither structure is detected, report and stop:
```
Error: unrecognized file format. Expected an ultraplan or session spec.
```
### Parse steps
Extract every `### Step N: {description}` heading (in order). For each step, extract:
- **Files** — file paths to create or modify
- **Changes** — what to modify
- **Reuses** — existing code to leverage (informational)
- **Test first** — test to run before implementation (optional)
- **Verify** — command to run after implementation
- **On failure** — recovery action (revert/retry/skip/escalate)
- **Checkpoint** — git commit command after success
If a step is missing `On failure`, default to `escalate` and record a parse warning.
If a step is missing `Verify`, record a parse warning.
### Parse session spec fields (if applicable)
- **Entry condition** from `## Dependencies`
- **Touch list** and **Never-touch list** from `## Scope Fence`
- **Exit condition** checklist from `## Exit Condition`
### Parse Execution Strategy (if present)
If the plan contains an `## Execution Strategy` section, extract:
- Each `### Session N: {title}` with its Steps, Wave, Depends on, and Scope fence
- The `### Execution Order` with wave definitions
Set **has_execution_strategy = true**.
Report:
```
Type: {plan | session-spec}
Steps: {N}
{if has_execution_strategy}: Execution Strategy: {S} sessions across {W} waves
{if session spec}: Entry condition: {text}
{if session spec}: Scope fence: {N} touch, {N} never-touch
{if warnings}: Warnings: {list}
```
## Phase 2.5 — Execution strategy decision
Determine how to execute this plan:
**Run as single session (foreground)** when ANY of these are true:
- `--fg` flag is set
- `--step N` mode
- `--resume` mode
- `--session N` mode (runs only that session's steps, foreground)
- Plan has no `## Execution Strategy` section
- Plan has Execution Strategy with only 1 session
**Run as multi-session (parallel orchestration)** when ALL of these are true:
- mode = `execute` (default, no --fg)
- Plan has `## Execution Strategy` with 2+ sessions
- At least one wave has 2+ sessions (parallelism possible)
**Run as multi-session (sequential orchestration)** when:
- mode = `execute` (default, no --fg)
- Plan has `## Execution Strategy` with 2+ sessions
- All sessions are in different waves (no parallelism, but still separate sessions)
For single-session: continue to Phase 3.
For multi-session: jump to Phase 2.6.
Report:
```
Strategy: {single session | N sessions (M parallel, K sequential)}
```
## Phase 2.6 — Multi-session orchestration
**Only runs for multi-session execution.** This phase launches headless child
sessions and collects results. After this phase, jump directly to Phase 8
(final report).
### Step 0 — Billing safety check (MANDATORY)
Before launching ANY `claude -p` process, check the environment:
```bash
echo "${ANTHROPIC_API_KEY:+SET}"
```
If the result is `SET`, **STOP** and warn the user. `claude -p` sessions with
`ANTHROPIC_API_KEY` in the environment bill the **API account** (pay-per-token),
not the user's Claude subscription (Max/Pro). Parallel Opus sessions can cost
$50100+ per run.
Use AskUserQuestion with these options:
**Question:** "ANTHROPIC_API_KEY is set in your environment. Parallel `claude -p`
sessions will bill your API account, not your Claude subscription. How do you
want to proceed?"
| Option | Description |
|--------|-------------|
| **Use --fg instead (Recommended)** | Run all steps sequentially in this session using your subscription. No extra cost. |
| **Continue with API billing** | Launch parallel sessions. Each session bills your API account at token rates. |
| **Stop** | Cancel execution. Unset ANTHROPIC_API_KEY first, then re-run. |
If the user chooses `--fg`: restart execution with mode = foreground (jump back
to Phase 3, single-session).
If the user chooses `Continue`: proceed with Phase 2.6 Step 1.
If the user chooses `Stop`: report "Execution cancelled — billing safety check"
and stop.
If `ANTHROPIC_API_KEY` is NOT set: proceed silently to Step 1.
### Step 1 — Create session log directory
```bash
mkdir -p .claude/ultraplan-sessions/{slug}/logs
```
### Step 2 — Execute waves
For each wave (in order):
**Launch sessions in this wave:**
For each session in the wave, launch a headless `claude -p` process:
```bash
claude -p "/ultraexecute-local --session {N} {plan-path}" \
> .claude/ultraplan-sessions/{slug}/logs/session-{N}.log 2>&1 &
```
If the wave has only 1 session, run it without `&` (no background needed).
Track PIDs for parallel sessions.
**Wait for wave completion:**
```bash
wait {PID1} {PID2} ...
```
**Check results after each wave:**
For each session in the wave, read its log file and grep for
`"ultraexecute_summary"`. Parse the JSON to determine:
- Did the session complete? (`result: "completed"`)
- Did it fail? (`result: "failed"` or `"stopped"`)
If ANY session in the wave failed:
```
Wave {W} FAILED: Session {N} failed at step {S}.
Stopping — later waves depend on this wave.
See log: .claude/ultraplan-sessions/{slug}/logs/session-{N}.log
```
Do NOT start later waves. Jump to Phase 8 with partial results.
If all sessions in the wave passed: continue to the next wave.
### Step 3 — Run master verification
After all waves complete successfully, run the plan's `## Verification` section
commands to verify the integrated result.
### Step 4 — Aggregate results
Collect all session summaries into an aggregated report. Jump to Phase 8.
### --session N mode
When mode = `session N`:
1. Find session N in the Execution Strategy
2. Extract its step numbers (e.g., Steps: 4, 5, 6)
3. Extract its scope fence (Touch / Never touch lists)
4. Execute ONLY those steps, in order, using the single-session protocol (Phase 3→7)
5. Enforce the session's scope fence as if it were a session spec's scope fence
6. Report results for those steps only
This mode is used internally by Phase 2.6 when launching child sessions.
It can also be used manually to re-run a specific session.
## Phase 3 — Progress file setup
The progress file lives at `{plan-dir}/.ultraexecute-progress-{slug}.json` where
`{slug}` is the plan filename without extension.
### Progress file schema
```json
{
"schema_version": "1",
"plan": "{path}",
"plan_type": "{plan | session-spec}",
"started_at": "{ISO-8601}",
"updated_at": "{ISO-8601}",
"mode": "{execute | resume | step}",
"total_steps": 0,
"current_step": 0,
"status": "{in-progress | completed | failed | stopped}",
"steps": {
"1": { "status": "pending", "attempts": 0, "error": null, "completed_at": null, "commit": null }
},
"entry_condition_checked": false,
"exit_condition_checked": false,
"summary": null
}
```
### Mode-specific behavior
**mode = execute (fresh):**
- If a progress file exists with status `in-progress` or `failed`: warn that
`--resume` is available, then wait 3 seconds (`sleep 3`) and start fresh.
This allows headless runs to proceed without blocking.
- Otherwise: create the progress file with all steps in `pending` status.
**mode = resume:**
- If no progress file exists: start from step 1 (same as fresh execute).
- If progress file exists: find the first step with status != `passed`.
```
Resuming from step {N}. {M}/{total} steps already completed.
```
**mode = dry-run:**
- Do NOT create or modify the progress file.
**mode = step N:**
- Create the progress file if it does not exist.
- Only step N will be executed.
## Phase 4 — Entry condition check (session specs only)
**Skip for ultraplans.** Skip in dry-run mode (report what would be checked instead).
Read the entry condition. Evaluate it:
- `"none"` or similar → pass immediately
- References git state (e.g., "git status clean") → run `git status --porcelain`
- References passing tests → run the specified command
- References a previous session → check `git log --oneline` for commit pattern
If the entry condition **fails**:
```
Entry condition FAILED: {condition text}
Reason: {what was checked, what was found}
Complete the prerequisite first, then re-run.
```
Update progress file with `status: "stopped"`. Stop execution.
If the entry condition **passes**:
```
Entry condition: PASS
```
Update `entry_condition_checked: true` in the progress file.
## Phase 5 — Dry-run report (dry-run mode only)
**Only runs when mode = dry-run.** Produces a validation report, then stops.
```
## Dry Run Report: {filename}
**Type:** {plan | session-spec}
**Steps:** {N}
### Step Validation
| Step | Description | Verify | On failure | Checkpoint | Issues |
|------|-------------|--------|------------|------------|--------|
| 1 | {desc} | {cmd} | {action} | {msg} | {none / missing X} |
### File References
{For each file in Files: fields, check existence with Glob}
- {path}: EXISTS | NOT FOUND {(marked as new file) | (unexpected — may be missing)}
### Entry / Exit Conditions (session specs)
{What would be checked}
### Execution Preview (only when plan has Execution Strategy)
If `has_execution_strategy = true`, show a preview of multi-session orchestration:
```
**Sessions:** {S} across {W} waves
| Wave | Session | Steps | Depends on | Command |
|------|---------|-------|------------|---------|
| 1 | Session 1: {title} | {nums} | none | `claude -p "/ultraexecute-local --session 1 {path}"` |
| 1 | Session 2: {title} | {nums} | none | `claude -p "/ultraexecute-local --session 2 {path}"` |
| 2 | Session 3: {title} | {nums} | S1, S2 | `claude -p "/ultraexecute-local --session 3 {path}"` |
```
Check billing status via `echo "${ANTHROPIC_API_KEY:+SET}"` and report:
```
Billing: ANTHROPIC_API_KEY is {SET — parallel sessions will bill API account | NOT SET — sessions will use subscription}
```
### Verdict
{READY | NEEDS ATTENTION — N issues found}
```
Stop after the dry-run report. Do not execute anything.
## Phase 6 — Step execution loop
The core execution phase. Runs for modes: `execute`, `resume`, `step`.
### Determine starting step
- **execute**: step 1
- **resume**: first step where status != `passed`
- **step N**: step N only
### For each step
Update progress: `steps.{N}.status = "running"`, `current_step = N`, `updated_at = now`.
```
--- Step {N}/{total}: {description} ---
```
#### Sub-step A — Scope fence check (session specs only)
Before touching any file, verify that every file in the step's `Files:` field is
in the session spec's Touch list (or is a new file to create). If ANY file is in
the Never-touch list:
```
SCOPE VIOLATION: Step {N} requires {file} which is in the never-touch list.
Escalating — this step cannot be executed within this session's scope.
```
Treat this as an automatic `escalate`. Jump to the stop-and-report logic.
#### Sub-step B — Test first (if present)
If the step has a `Test first:` field:
1. If test file is marked `(new)`: note it will be created during implementation.
2. If test file exists: run it. Expect failure (RED state).
3. If test unexpectedly passes: warn but continue — step may already be done.
Do not block on test-first failures — they are expected.
#### Sub-step C — Implement changes
Read the step's `Files:` and `Changes:` fields. Implement exactly as described.
**Rules:**
- Follow `Changes:` exactly — do not improvise, add scope, or optimize
- Use Edit for modifications, Write for new files
- If `Reuses:` references existing code, read that code first for context
- Only touch files listed in `Files:` — nothing else
#### Sub-step D — Verification
Run the `Verify:` command exactly as written, via Bash.
**Rules:**
- Always a fresh run — never trust prior results
- Exit code is the authoritative truth:
- Exit 0 + expected output (if specified) = **PASS**
- Exit non-zero = **FAIL** regardless of output text
- Exit 0 but wrong output = **FAIL**
```
Verify: {command}
Result: {PASS | FAIL} (exit code {N})
{if FAIL}: Output (first 10 lines): {output}
```
If **PASS**: proceed to Sub-step F (checkpoint).
#### Sub-step E — On failure handling
If **FAIL**, read the `On failure:` clause. Apply the retry cap: **maximum 2 retries**
(3 total attempts). Track attempts in `steps.{N}.attempts`.
**`On failure: revert`**
- If attempts < 3: analyze the failure, re-implement with adjustments, re-verify.
```
Attempt {A}/3 failed. Retrying...
```
- If attempts == 3: revert this step's changes:
```bash
git checkout -- {files from Files: field}
```
Record failure. **Do NOT proceed to next step.** Jump to Phase 7.
**`On failure: retry`**
- If attempts < 3: use the alternative approach described in the On failure clause.
- If attempts == 3: revert and stop. Jump to Phase 7.
**`On failure: skip`**
- Mark step as skipped regardless of attempt count. Continue to next step.
```
Step {N}: SKIPPED (non-critical per plan)
```
Update `steps.{N}.status = "skipped"`.
**`On failure: escalate`**
- Stop immediately regardless of attempt count.
```
Step {N}: ESCALATED — requires human judgment
```
Commit all completed work before stopping. Stage ONLY files from steps with
`status: "passed"` in the progress file — collect their `Files:` fields. Never
use `git add -A` (risks staging secrets, binaries, or unrelated work).
```bash
git add {files from passed steps' Files: fields} && git commit -m "wip: ultraexecute-local stopped at step {N} — escalation needed"
```
Jump to Phase 7.
#### Sub-step F — Checkpoint
Run the `Checkpoint:` git commit command exactly as written in the plan.
If the commit fails (nothing to commit, etc.): warn but do NOT fail the step.
The step's verification already passed — the commit is bookkeeping.
```
Step {N}: PASS (committed: {hash})
```
Update progress: `steps.{N}.status = "passed"`, `steps.{N}.commit = {hash}`,
`steps.{N}.completed_at = now`.
### Step mode exit
If mode = `step N`: after completing step N (pass or fail), skip remaining steps
and jump to Phase 8 (final report).
## Phase 7 — Exit condition check (session specs only)
**Skip for ultraplans.** Run only when all steps passed (not on early stop).
Run each exit condition command from the `## Exit Condition` checklist:
```
Exit condition check:
- [ ] {command} → {PASS | FAIL}
- [ ] {command} → {PASS | FAIL}
```
If all pass: `exit_condition_checked: true` in progress file.
If any fail: record which failed. Include in final report.
## Phase 8 — Final report
Always produce a final report.
Update progress file: `status` to `completed`/`failed`/`stopped`, `updated_at`, `summary`.
```
## Ultraexecute Local Complete
**Plan:** {path}
**Type:** {plan | session-spec}
**Mode:** {execute | resume | step N}
**Result:** {COMPLETED | FAILED at step N | STOPPED (escalation) | PARTIAL (N/total passed)}
### Step Results
| Step | Description | Result | Attempts | Commit |
|------|-------------|--------|----------|--------|
| 1 | {desc} | PASS | 1 | abc1234 |
| 2 | {desc} | FAIL | 3 | — |
| 3 | {desc} | — | 0 | — |
### Summary
- Passed: {N}/{total}
- Skipped: {N}
- Failed: {N}
- Not reached: {N}
{if all passed + exit condition passed}:
All steps completed. Exit condition: PASS.
{if failed/stopped}:
### Failure Details
Step {N}: {description}
On failure: {action}
Error: {error output, first 20 lines}
Attempts: {N}
### What Remains
{Numbered list of unexecuted steps}
To resume: /ultraexecute-local --resume {path}
```
**JSON summary block** (always at the end, machine-parseable):
```json
{
"ultraexecute_summary": {
"plan": "{path}",
"plan_type": "{plan | session-spec}",
"result": "{completed | failed | stopped | partial}",
"steps_total": 0,
"steps_passed": 0,
"steps_failed": 0,
"steps_skipped": 0,
"steps_not_reached": 0,
"failed_at_step": null,
"exit_condition": "{pass | fail | skipped | n/a}",
"progress_file": "{path}"
}
}
```
The `ultraexecute_summary` key makes it grep-able in log files from headless runs.
## Phase 9 — Stats tracking
Append one record to `${CLAUDE_PLUGIN_DATA}/ultraexecute-stats.jsonl`:
```json
{
"ts": "{ISO-8601}",
"plan": "{filename only}",
"plan_type": "{plan | session-spec}",
"mode": "{execute | resume | dry-run | step}",
"result": "{completed | failed | stopped | partial}",
"steps_total": 0,
"steps_passed": 0,
"steps_failed": 0,
"steps_skipped": 0,
"failed_at_step": null
}
```
If `${CLAUDE_PLUGIN_DATA}` is not set or not writable, skip silently.
Never let stats failures block the workflow.
## Hard rules
1. **No AskUserQuestion for execution decisions.** All execution decisions come
from the plan's On failure clauses. If the plan says escalate, stop and
report — never ask. **Exception:** the billing safety check in Phase 2.6
Step 0 MUST ask before spending money on the user's API account.
2. **No scope creep.** Only touch files listed in the step's `Files:` field.
If a file outside the list seems to need changing, record it as a finding
in the final report — do not touch it.
3. **Exit code is truth.** The Verify command's exit code is authoritative.
Non-zero = FAIL regardless of output. Zero with wrong output = FAIL.
4. **Fresh verification.** Re-run the Verify command from scratch every time.
Never trust cached or prior results.
5. **Retry cap = 3 attempts.** Initial + 2 retries, then stop. Never loop forever.
6. **Never corrupt completed work.** Only revert files from the failing step.
Never touch files from earlier passed steps.
7. **Checkpoint discipline.** Run the Checkpoint commit exactly as written.
Do not combine, reorder, or skip checkpoints on passed steps.
8. **Scope fence enforcement.** For session specs: never modify files in the
Never-touch list, regardless of what the Changes field says.
9. **Progress file is ground truth.** Resume uses the progress file, not git log.
10. **No sub-agents.** The executor reads and implements directly.
No Agent tool, no TeamCreate, no delegation.