# Examples

Complete, calibrated walk-throughs of the `ultraplan-local` pipeline for realistic tasks. Each example shows the four artifacts a project directory contains after a full run:

- `brief.md` — task brief from `/ultrabrief-local`
- `research/*.md` — research briefs from `/ultraresearch-local`
- `plan.md` — implementation plan from `/ultraplan-local`
- `progress.json` — execution log from `/ultraexecute-local`
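Concretely, a completed project directory looks roughly like this (the research file name is illustrative; a real run may produce several research briefs):

```
.claude/projects/2026-05-01-my-task/
├── brief.md
├── research/
│   └── 01-some-topic.md
├── plan.md
└── progress.json
```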

These are hand-calibrated, not LLM-generated. The point is to give anyone forking the plugin a deterministic reference: what the artifacts look like when everything goes right, on a small but real task.

## Running the pipeline yourself

For your own work, point the four commands at a real project directory:

```shell
mkdir -p .claude/projects/2026-05-01-my-task
/ultrabrief-local
/ultraresearch-local --project .claude/projects/2026-05-01-my-task
/ultraplan-local --project .claude/projects/2026-05-01-my-task
/ultraexecute-local --project .claude/projects/2026-05-01-my-task
```

The artifacts in each example mirror that flow.
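The date-stamped directory name can also be generated instead of typed by hand. A minimal sketch, assuming only the `.claude/projects/YYYY-MM-DD-<slug>` naming convention shown above (nothing here is a built-in plugin command):

```shell
# Build a dated project directory name matching the
# .claude/projects/YYYY-MM-DD-<slug> convention used above.
task_slug="my-task"
project_dir=".claude/projects/$(date +%Y-%m-%d)-${task_slug}"
mkdir -p "$project_dir"
echo "$project_dir"   # pass this path to each command's --project flag
```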

## Examples

### 01-add-verbose-flag

**Task:** add a `--verbose` flag to a small CLI parser. Touches one parser file and six command handlers; adds two tests.

**Why this example:** small enough to read end-to-end in 10 minutes, but exercises every artifact (research with brief-anchoring, plan with manifests, `progress.json` with multi-step git history). Demonstrates how the `plan_version: 1.7` schema looks in real life — including the manifest YAML block per step and the `must_contain` list-of-dicts form.
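This README does not reproduce the schema itself, but a hypothetical step manifest might look like the sketch below — only `plan_version: 1.7` and the `must_contain` list-of-dicts form are named by this document; every other key is invented for illustration:

```yaml
# Hypothetical per-step manifest block from plan.md.
# Only plan_version and the must_contain list-of-dicts form
# are attested here; all other keys are illustrative.
plan_version: 1.7
step: 1
files:
  - src/cli/parser.ts
verify:
  must_contain:
    - file: src/cli/parser.ts
      text: "--verbose"
```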

What to study first:

  1. `brief.md` — note the explicit Out of scope section and the concrete Success Criteria (no "make it work" hand-waving).
  2. `plan.md` Step 1 — the FIRST step captures golden output before any behavior change. This is the stability-harness pattern.
  3. `plan.md` Step 5 — this step touches 5 files in one commit, and the plan justifies the deviation from the 12 file guideline. Plan-critic should accept that justification.
  4. `progress.json` — every step has both `commit_sha` and `verify_passed`, so the executor can resume from the last completed step.
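A hypothetical shape for one `progress.json` entry, showing the two fields called out above (all other keys, and every value, are invented for illustration):

```json
{
  "steps": [
    {
      "step": 1,
      "commit_sha": "0000000000000000000000000000000000000000",
      "verify_passed": true
    }
  ]
}
```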

## Regeneration

Each example has a `REGENERATED.md` documenting the version it was calibrated against, along with the triggers and procedure for rebuilding it. When the artifact format changes, the example needs to be rebuilt.
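This README does not fix the contents of that file; a minimal sketch of what it might contain, with placeholder values and illustrative headings:

```markdown
# REGENERATED

- Calibrated against: <plugin version>, plan_version <schema version>
- Calibration date: <YYYY-MM-DD>
- Rebuild triggers: plan_version bump, progress.json schema change
- Procedure: re-run the four commands against the original task brief,
  then re-validate all four artifacts
```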

## Adding a new example

If you have a small, realistic task (touches 1-3 files, has a clear success criterion, finishes in under 30 minutes) and want to add it as an example:

  1. Create `examples/NN-slug-here/` with the same four artifacts.
  2. Add a `REGENERATED.md` documenting the calibration date and version.
  3. Add a section to this README under ## Examples.
  4. Open an issue on the marketplace describing what the example teaches that 01 doesn't already teach.
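Steps 1–2 can be sketched as a scaffold — the `02-my-task` slug is a placeholder, and the touched files must be replaced with real pipeline artifacts before submitting:

```shell
# Hypothetical scaffold for a new example directory.
# The slug is a placeholder; fill each file with a real artifact.
ex="examples/02-my-task"
mkdir -p "$ex/research"
touch "$ex/brief.md" "$ex/plan.md" "$ex/progress.json" "$ex/REGENERATED.md"
ls "$ex"
```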