refactor(marketplace): split cc-architect from ultraplan-local into its own plugin

Extract `/ultra-cc-architect-local` and `/ultra-skill-author-local` plus all 7
supporting agents, the `cc-architect-catalog` skill (13 files), the
`ngram-overlap.mjs` IP-hygiene script, and the skill-factory test fixtures
from `ultraplan-local` v2.4.0 into a new `ultra-cc-architect` plugin v0.1.0.

Why: ultraplan-local had drifted into containing two distinct domains — a
universal planning pipeline (brief → research → plan → execute) and a
Claude-Code-specific architecture phase. Keeping them together forced users
to inherit an unfinished CC-feature catalog (~11 seeds) when they only
wanted the planning pipeline, and locked the catalog and the pipeline into
the same release cadence. The architect was already optional and decoupled
at the code level — only one filesystem touchpoint remained
(auto-discovery of `architecture/overview.md`), which already handles
absence gracefully.

Plugin manifests:
- ultraplan-local: 2.4.0 → 3.0.0 (description + keywords updated)
- ultra-cc-architect: new at 0.1.0 (pre-release; catalog is thin, Fase 2/3
  of skill-factory unbuilt, decision-layer empty, fallback list still
  needed)

What stays in ultraplan-local: brief/research/plan/execute commands, all
19 planning agents, security hooks, plan auto-discovery of
`architecture/overview.md` (filesystem-level contract, not code-level).

What moved (28 files via git mv, R100 — full history preserved):
- 2 commands, 8 agents, 1 skill catalog (13 files), 2 scripts, 8 fixtures

Documentation updates: plugin CLAUDE.md and README.md for both plugins,
root README.md (added ultra-cc-architect section, updated ultraplan-local
section), root CLAUDE.md (added ultra-cc-architect to repo-struktur),
marketplace.json (registered ultra-cc-architect), ultraplan-local
CHANGELOG.md (v3.0.0 entry with migration guidance).

Test verification: ngram-overlap.test.mjs passes 23/23 from new location.

Memory updated: feedback_no_architect_until_v3.md now points at the new
plugin and reframes the threshold around catalog maturity rather than an
ultraplan-local milestone.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This commit is contained in:
Kjell Tore Guttormsen 2026-04-30 17:18:47 +02:00
commit ab504bdf8c
48 changed files with 627 additions and 177 deletions

View file

@ -0,0 +1,193 @@
---
name: cc-architect-catalog
description: Internal catalog for ultra-cc-architect-local — not invoked directly. Indexes CC-feature reference and pattern skills.
layer: manifest
cc_feature: meta
source: https://docs.claude.com/en/docs/claude-code
concept: catalog-index
last_verified: 2026-04-18
ngram_overlap_score: null
review_status: approved
---
# CC Architect Catalog — Manifest
This file is the catalog index consumed by the `feature-matcher` and
`gap-identifier` agents inside `/ultra-cc-architect-local`. It is NOT
intended to be auto-invoked by Claude Code's skill system — the
description above exists only so the skill loader has something to
display if it indexes this directory.
## Purpose
The catalog enumerates which Claude Code features the architect command
can reason about, and at which layer of abstraction. Each feature gets
one or more skill files in this directory. A skill is a self-contained
note about *one* feature at *one* layer.
## Layers (frontmatter `layer` field)
| Layer | Purpose | Kilde |
|-------|---------|-------|
| `reference` | Facts, API, syntax. What the feature *is*. | CC changelog, official docs |
| `pattern` | When and how to use it. Typical shapes, pitfalls. | Synthesized from practice |
| `decision` | Operational decision tree across features. | Synthesized. (No seeds yet.) |
Lag representeres som frontmatter-felt, ikke mappestruktur. Rationale:
one feature can have skills at multiple layers without relocation, and
future skill-factory output can target a specific layer without moving
files.
## Frontmatter contract (per skill file)
Every skill file in this directory MUST have this frontmatter. The
fields are load-bearing for the architect command.
```yaml
---
name: <skill-id> # unique, kebab-case
description: <one-line matcher hint> # used by feature-matcher
layer: reference | pattern | decision
cc_feature: <feature-id> # see table below
source: <URL> # canonical upstream source
concept: <short phrase> # 36 word concept handle
last_verified: <YYYY-MM-DD> # when a human last checked against upstream
ngram_overlap_score: null # reserved for skill-factory IP-hygiene
review_status: approved | pending | auto-merged
---
```
`ngram_overlap_score` is reserved. This command respects it (will flag
non-null values > threshold) but does not compute it. Skill-factory (a
separate later process) populates that field.
`review_status: approved` is the default for seeds (approved by
construction — handwritten, no third-party text copied). Future
skill-factory output will start as `pending` (channel 2) or
`auto-merged` (channel 1).
## Canonical `cc_feature` values
| Value | Coverage |
|-------|----------|
| `hooks` | Event hooks (UserPromptSubmit, PreToolUse, PostToolUse, Stop, Notification, SessionStart, etc.) |
| `subagents` | Task-tool sub-agents, delegation patterns |
| `skills` | Claude Code skill system (not Agent SDK skills) |
| `output-styles` | Output style configuration |
| `mcp` | Model Context Protocol servers and tools |
| `plan-mode` | Built-in plan mode |
| `worktrees` | Git worktree integration |
| `background-agents` | `run_in_background`, Monitor, long-running agents |
| `meta` | This manifest only — not a real feature |
If a future skill covers a feature not in this list, add the row above
and surface the extension in the relevant CHANGELOG entry.
## Slug convention
Skill filenames follow this pattern:
```
<cc_feature>[-<qualifier>]-<layer>.md
```
- **Unqualified slug** (`<feature>-<layer>.md`) — the baseline/canonical
entry for that (feature, layer) pair. One per pair. Example:
`hooks-pattern.md` = generic hook shapes and pitfalls.
- **Qualified slug** (`<feature>-<qualifier>-<layer>.md`) — a
specialized variant covering one specific sub-pattern or use-case.
Zero-or-more per pair. Example: `hooks-observability-pattern.md` =
progressive-alert observability pattern specifically.
Qualifiers exist because one CC feature can support multiple
non-overlapping patterns at different abstraction levels. Forcing
everything into a single `<feature>-<layer>.md` either bloats the
canonical entry or loses useful specialization. Qualified slugs let the
catalog grow with concrete, named patterns without displacing the
baseline.
`feature-matcher` treats all skills with the same `cc_feature` + `layer`
as candidates and picks the most relevant to the brief (or proposes
several if they cover different aspects). See "How `feature-matcher`
uses this file" below.
## Current seed coverage (v2.3.0 — see .drafts/ for skill-factory output)
| Feature | reference | pattern | qualified patterns | decision |
|---------|-----------|---------|--------------------|----------|
| hooks | hooks-reference | hooks-pattern | hooks-observability-pattern | — |
| subagents | subagents-reference | subagents-pattern | — | — |
| skills | skills-reference | — | — | — |
| output-styles | output-styles-reference | — | — | — |
| mcp | mcp-reference | — | — | — |
| plan-mode | plan-mode-reference | — | — | — |
| worktrees | worktrees-reference | — | — | — |
| background-agents | background-agents-reference | — | — | — |
Total: 11 seed skills, 8 features, 2 layers. Decision-layer is
intentionally empty — decisions cross features and require broader
synthesis than a single seed pass can provide. Skill-factory populates
decision-layer later.
## How `feature-matcher` uses this file
1. Read this file to learn the `cc_feature` taxonomy and slug convention.
2. Glob the directory for `*.md` files (excluding SKILL.md).
3. Parse each skill's frontmatter.
4. For each feature mentioned in the brief or research, match against
`cc_feature` field. Build the candidate set per feature, grouped by
layer. Selection rules:
- **Layer preference:** prefer `pattern` over `reference` when both
exist (pattern is richer).
- **Multiple patterns per feature:** when two or more pattern-layer
skills share the same `cc_feature`, read each `description` and
pick the one(s) most relevant to the brief. If two cover
non-overlapping aspects that both apply, propose both with clear
rationale. Prefer the unqualified baseline (`<feature>-pattern.md`)
when the brief does not specifically justify a qualified variant.
- **Be explicit:** name the chosen skill in `supporting_skill` so
`architecture-critic` can verify the match.
5. When no skill exists for a mentioned feature, fall back to the
hardcoded minimum-list inside the `feature-matcher` prompt and mark
the gap in stats (`fallback_used: true`).
## How `gap-identifier` uses this file
1. Collect every feature referenced in `feature-matcher`'s output.
2. For each feature, check whether the catalog has at least one skill at
each expected layer (reference always; pattern when complexity
warrants; decision for cross-feature choices).
3. Emit a gap entry for every missing (feature × layer) pair.
4. Label with `skill-layer:<layer>` and `cc-feature:<feature>`.
## Non-goals for this file
- No skill-factory logic. That is a separate later process.
- No auto-discovery of new CC features.
- No n-gram computation.
- No triggering Claude Code auto-invocation (description deliberately
says "not invoked directly").
## Modification rules
- **Adding a canonical skill:** create `<feature>-<layer>.md` with the
frontmatter contract above. Bump the coverage table in this file.
- **Adding a qualified pattern skill:** create
`<feature>-<qualifier>-<layer>.md` when the new pattern covers a
distinct sub-case that does not belong in the unqualified baseline.
The qualifier MUST be kebab-case and descriptive (e.g.,
`observability`, `migration`, `multi-tenant`). Add it to the
"qualified patterns" column in the coverage table.
- **Choosing qualified vs. canonical:** if no unqualified skill exists
yet for a `(feature, layer)` pair, the new skill SHOULD be the
unqualified baseline — don't ship a qualified skill without a
canonical one, because `feature-matcher` prefers baseline when the
brief has no specific justification for a variant.
- **Renaming `cc_feature` values:** update both this file AND every
skill using the old value in the same commit.
- **Removing a skill:** document in CHANGELOG under the version that
drops it.
- **Slug collision:** two skills with the same slug are a hard error.
Skill-factory (`/ultra-skill-author-local`) must refuse to promote a
draft that would overwrite an approved skill. Collision is resolved
either by qualifying the new slug or by revising the baseline.

View file

@ -0,0 +1,127 @@
---
name: background-agents-reference
description: CC background agents — long-running subagents with run_in_background and Monitor for progress streaming.
layer: reference
cc_feature: background-agents
source: https://docs.claude.com/en/docs/claude-code/background-agents
concept: async-agents-and-monitoring
last_verified: 2026-04-19
ngram_overlap_score: null
review_status: approved
---
# Background Agents — Reference
A background agent is a subagent launched with `run_in_background:
true`. The parent does not block on its return; instead, the harness
notifies the parent when the agent completes. Useful for long-running
exploration, orchestration, and work that overlaps with user activity.
> **Hard constraint (verified 2026-04-19):** The Claude Code harness does
> NOT expose the `Agent` tool to sub-agents — foreground OR background.
> A sub-agent cannot spawn further sub-agents. "Nested orchestration"
> patterns silently degrade: the inner orchestrator loses its planned
> swarm and falls back to single-context reasoning. Source:
> github.com/anthropics/claude-code/issues/19077. Only the main session
> can spawn agents. Plan shapes that require a swarm below a background
> agent do not work — keep the orchestration in the main command context
> instead. See Pitfalls and Composition below.
## Launching
```
Agent({
description: "...",
subagent_type: "...",
prompt: "...",
run_in_background: true
})
```
The Agent tool returns a handle (agent ID / name). The parent
continues its turn; no wait.
## Monitoring
Two complementary tools work with background agents:
- **Monitor** — streams updates from a named background process. Each
event line arrives as a notification. Used for long-running Bash
processes (and, in newer builds, some agent streaming paths).
- **Completion notifications** — the harness posts a message to the
parent when the background agent finishes. The parent sees it as a
system-reminder / notification on its next turn.
## When background is worth it
- **Overlapping work** — orchestrator runs 30+ minutes of research
while the user continues coding. Without background, the user is
blocked the whole time.
- **Parallel waves** — wave N of sessions running concurrently; the
parent collects results as they arrive.
- **Long-running processes** — an agent waiting on a build, test run,
or deployment.
## When background hurts
- **Short tasks** — agent returns in 10 seconds; making it async adds
overhead for no gain.
- **Tight coupling** — if the parent needs the result before doing
anything else, background is just foreground with extra steps.
- **Unbounded token spend** — a background agent with no budget
signaling can run until it hits limits. Cap explicitly.
## Common shapes
### Shape A: Orchestrator handoff (AVOID — see warning above)
Pattern: parent interviews user, writes a spec, launches a background
orchestrator that re-spawns a swarm of workers. **This does not work
as documented.** The background orchestrator never gets the Agent
tool, so the swarm never materializes and the inner orchestrator
degrades to single-context reasoning.
Anti-pattern confirmed in ultraplan-local v1.0v2.3.2. Removed in
v2.4.0 — the command markdown itself is now the orchestrator in main
context.
### Shape B: Parallel waves (single-file execution only)
Parent decomposes work into N independent sessions, launches them
all in parallel with `run_in_background: true`, then synthesizes
returns as they arrive. **Each session must be self-contained and
avoid spawning further agents** — they can use Bash, Read, Grep,
Edit, Write, but not Agent.
Valid use: `ultraplan-local --decompose` execution where each session
implements concrete code changes without further orchestration.
### Shape C: Watcher
A background agent polls a process (build, test, deploy) and reports
status changes. Uses Monitor for streaming. Watchers need Bash/Read
only — no Agent tool needed, so the harness limitation does not
apply.
## Pitfalls
- **Lost context** — if the parent conversation ends before the
background agent completes, the result may be orphaned. Persist to
disk, not memory.
- **Notification fatigue** — too many background agents = too many
reminders interrupting the parent's flow.
- **Debugging** — background agents run out of the user's view; their
failures can be silent. Log to files, not just return messages.
## Composition
- Background + worktrees: the canonical pattern for parallel
implementation — each background agent in its own worktree, no
clashes. Each session performs concrete file changes, not nested
orchestration.
- Background + subagents: **NOT SUPPORTED.** A sub-agent, whether
foreground or background, does not have the Agent tool. Only the
main session can spawn agents. See the warning at the top of this
file.
- Background + hooks: hooks fire inside the background agent's tool
calls, same as foreground.

View file

@ -0,0 +1,50 @@
---
name: hooks-observability-pattern
description: Observe user-interaction patterns across session lifecycle hooks and emit cooldown-gated nudges
layer: pattern
cc_feature: hooks
source: ../../../../ai-psychosis/README.md
concept: progressive alerts via lifecycle hooks
last_verified: 2026-04-18
ngram_overlap_score: 0.01
review_status: approved
---
# Hooks — Progressive-Alert Observability Pattern
## Use this when
- Measure behaviour the model itself cannot see: cadence, duration, repetition, time-of-day.
- Surface soft nudges rather than hard blocks — the operator keeps the final call.
- Separate "what happened" (metrics) from "what was said" (prompt text) so no conversation content touches disk.
## Shape
- Wire four lifecycle events: a start handler for baseline counters, a prompt handler for language-category flags, a tool handler for cadence and burst detection, and an end handler for totals and state cleanup.
- Keep per-session counters in a tiny JSON file under the plugin data dir; keep aggregate events in an append-only JSONL log for later reporting.
- Gate every nudge behind two things: a threshold (hard or soft) and a cooldown window, so repeat alerts do not spam the transcript.
- Deliver alerts as `additionalContext` injection, never as a tool block — the goal is awareness, not control.
## Forces
- **Privacy vs. signal.** Richer signal wants more content logged; the user wants none. Resolve by computing boolean flags in-memory and discarding the raw text before the handler returns.
- **Latency budget.** Handlers fire on every prompt and every tool call. Stay well under 100 ms per invocation; append-only JSONL is sub-millisecond and safe.
- **Portability.** Hooks that assume a shell, `jq`, or npm dependencies break on half the operator fleet. Stick to Node stdlib so the same script runs on macOS, Linux, and Windows.
- **Instruction layer alone is not enough.** Behavioural rules in a skill file shape tone but cannot measure duration or frequency. Layer the hook observability on top of the skill — each compensates for the other.
## Gotchas
- A handler that crashes blocks the turn. Catch everything, log, and exit zero by default.
- Cooldowns must be per-category, not global, or the most-triggered alert silences the rarer, more informative ones.
- Late-night and rapid-fire thresholds are legitimate signals but also easy to over-tune; start with generous bands and tighten only with data.
- `additionalContext` from an end-of-session handler is discarded — inject alerts on start, prompt, or tool events where the model will actually see them.
## Anti-patterns
- Storing prompt text or tool arguments "just for debugging" — once it is on disk, the privacy guarantee is gone.
- Treating every elevated metric as an intervention. If the hook starts blocking, the operator works around it and loses the awareness benefit.
- Hardcoding thresholds into the handler. Pull them from a single config so future tuning does not require a rewrite of four scripts.
## Decision quick-check
Reach for this pattern when you need visibility into *how* the user is interacting, not *what* they are saying, and when the response should be a gentle nudge rather than a gate. Otherwise use a PreToolUse denylist (hard limit) or a skill-only instruction layer (style, not cadence).

View file

@ -0,0 +1,89 @@
---
name: hooks-pattern
description: When to choose hooks over prompts or subagents, and common hook shapes that work.
layer: pattern
cc_feature: hooks
source: https://docs.claude.com/en/docs/claude-code/hooks
concept: hooks-decision-and-shapes
last_verified: 2026-04-18
ngram_overlap_score: null
review_status: approved
---
# Hooks — Pattern
## When to reach for a hook
Use hooks when the behavior must hold even if Claude is prompt-injected,
distracted, or adversarial. Hooks are outside the model's control loop —
Claude cannot talk its way past them.
Good fits:
- **Hard prohibitions** — "never run `rm -rf /`", "never push to main
without a commit signature", "never read files outside this repo".
- **Deterministic context injection** — always show git status at
session start; always inject the current sprint's tasks.
- **Audit trails** — log every Bash call, every file write, outside the
conversation context so it survives `/clear`.
- **Compliance boundaries** — redact secrets from transcripts; block
tool calls that would leak PII.
Bad fits:
- Behavior that requires judgment ("should this test run?") — that's a
subagent call.
- Heuristics that drift ("mostly block, sometimes allow") — a hook that
frequently second-guesses itself is a sign the rule belongs in a
prompt or skill.
- Anything that needs to read lots of conversation history — hooks see
a payload, not the full context.
## Common shapes
### Shape A: PreToolUse deny
A small script that reads `tool_input`, matches a pattern, and exits
with a `block` decision. Latency: a few ms. Used for command denylists,
path guards, secrets scanners.
### Shape B: UserPromptSubmit context injection
A script that reads the prompt, computes context (e.g., recent git log,
active TODO list), and emits JSON with an `additionalContext` field.
The harness adds that context to the prompt before Claude sees it.
### Shape C: Stop hook with reminder
A script that runs when Claude finishes a turn. Checks for uncommitted
work, surfaces it as a notification. Non-blocking.
### Shape D: SessionStart status
Runs once per session. Prints repo metadata (branch, open PRs, recent
commits) so every new session starts with shared context.
## Pitfalls
- **Slow hooks compound.** A 500 ms PreToolUse hook run 50 times per
session adds 25 seconds of latency.
- **Hooks without error handling crash the turn.** A hook that exits
non-zero on an edge case blocks real work. Default to exit 0 with a
logged warning.
- **Shell-injection in hook scripts.** A hook that interpolates
`tool_input.command` into `bash -c` is itself a security hole. Parse
inputs; never interpolate unescaped.
- **Hooks can leak secrets into transcripts** if their output mentions
env-var values. Scrub before emitting.
- **Cross-platform scripts.** A hook that assumes bash 4 or GNU sed
breaks on macOS. Prefer Node.js, Python, or POSIX sh.
## Composition with other features
- Hooks + subagents: a hook can delegate the "should this be blocked"
question to a subagent when the decision needs judgment. Cost: a full
model call per hook invocation — use sparingly.
- Hooks + MCP: a hook can call out to an MCP-exposed tool for policy
lookup. Latency depends on transport.
- Hooks + plan mode: hooks fire during plan mode too. Useful for
enforcing "no writes while planning".

View file

@ -0,0 +1,85 @@
---
name: hooks-reference
description: CC hooks API — event types, payload shapes, exit codes, and where hooks run.
layer: reference
cc_feature: hooks
source: https://docs.claude.com/en/docs/claude-code/hooks
concept: hooks-api-surface
last_verified: 2026-04-18
ngram_overlap_score: null
review_status: approved
---
# Hooks — Reference
Hooks are shell commands or scripts that the Claude Code harness runs
in response to events. They give the harness — not Claude — the final
say on whether a tool call, prompt, or session action proceeds. Claude
cannot bypass a hook by prompting itself; the hook runs outside the
model's control loop.
## Event types
- **UserPromptSubmit** — fires when the user sends a prompt. Runs before
Claude processes it. Common use: inject extra context, reject
disallowed prompts.
- **PreToolUse** — fires before a tool call. Can deny the call. Common
use: block destructive commands, require confirmation.
- **PostToolUse** — fires after a tool call completes. Sees the result.
Common use: log side effects, redact output, trigger follow-up work.
- **Stop** — fires when the agent finishes a turn. Common use: commit
reminders, session summaries.
- **Notification** — fires when Claude wants to show the user a
notification (e.g., long-running task).
- **SessionStart** — fires when a session begins. Common use: print
repo state, inject context.
## Payload shape
Hooks receive a JSON payload on stdin. Common fields:
- `session_id` — the current session identifier.
- `transcript_path` — path to the conversation transcript.
- `cwd` — current working directory.
- `tool_name` (PreToolUse, PostToolUse) — which tool is running.
- `tool_input` (PreToolUse) — the arguments to the tool.
- `tool_response` (PostToolUse) — the tool's result.
- `prompt` (UserPromptSubmit) — the submitted text.
Exact field availability depends on event type. Read the payload JSON
rather than assuming a schema.
## Exit codes and control
Hooks communicate back via exit code and stdout JSON:
- Exit 0, no stdout → proceed normally.
- Exit 0, stdout JSON with `decision` field → harness honors the
decision (e.g., `{"decision": "block", "reason": "..."}`).
- Exit non-zero → harness treats as a denial or error, depending on
event and hook type.
Some hook types support structured output beyond deny/allow (e.g.,
adding context to the prompt). Details differ per event.
## Where hooks live
- Project hooks: `.claude/settings.json` `hooks` field, paths relative
to project.
- User hooks: `~/.claude/settings.json` (global).
- Plugin hooks: packaged with a plugin, activated when the plugin is
enabled.
Hooks run in the harness process's shell, not in Claude's tool-use
sandbox. They can spawn subprocesses, read environment variables,
and touch the filesystem.
## Implications for architecture
- Hooks are the mechanism for **deterministic policy** (things that
must always or never happen, regardless of what Claude decides).
- Hooks are load-bearing for security: prompt-injection-resistant
defenses live here, not in prompts.
- Hooks add latency to every tool call they gate — keep them fast.
- Hook output is part of the context window; verbose hooks burn
tokens quickly.

View file

@ -0,0 +1,72 @@
---
name: mcp-reference
description: Model Context Protocol — external tools and resources exposed to Claude via MCP servers.
layer: reference
cc_feature: mcp
source: https://docs.claude.com/en/docs/claude-code/mcp
concept: mcp-tool-protocol
last_verified: 2026-04-18
ngram_overlap_score: null
review_status: approved
---
# MCP — Reference
Model Context Protocol (MCP) is the protocol Claude Code uses to talk
to external tool servers. An MCP server advertises *tools* and
*resources*; Claude Code surfaces them to Claude as callable tools.
## Architecture
- **MCP server** — a process (local or remote) that implements the
protocol. Can be written in any language. Communicates over stdio,
HTTP, or WebSocket.
- **Transport** — stdio (local subprocess), SSE/HTTP (remote), or
WebSocket. Stdio is the default for local servers.
- **Tools** — callable functions the server exposes. Each has a name,
description, and JSON schema for inputs.
- **Resources** — readable entities the server exposes (files,
database rows, API responses). Addressed by URI.
- **Prompts** — optional; MCP can expose templated prompts.
## Configuration
MCP servers are declared in:
- `.mcp.json` — project-level MCP config.
- `~/.claude.json` or equivalent — user-level.
- Plugin-bundled — a plugin can ship its own MCP server.
Each entry specifies command, args, transport, and optional auth.
## Tool naming
Tools from MCP servers appear to Claude with a namespaced name:
`mcp__<server-name>__<tool-name>`. This keeps names collision-free
across servers.
## Permissions
- `allowed-tools` in settings or plugin frontmatter can include MCP
tools by full name.
- Some MCP servers require OAuth or API keys; those are configured in
the server's own config, not Claude's.
## Common uses
- Exposing internal APIs to Claude without hand-wrapping them (one
generic MCP server → many tools).
- Cross-language tool servers (Python tool called from Claude Code
running in Node).
- Sandboxed access to external services with explicit scoping.
## Failure modes
- Server not running → tool calls fail; Claude sees an error string.
- Server misbehaves → tool returns wrong schema; Claude may retry or
hallucinate.
- Authentication drift → 401s look like transient errors; diagnose by
checking the server directly.
- Security: an MCP server runs with the permissions of its own
process. A malicious server is a supply-chain risk; audit before
enabling.

View file

@ -0,0 +1,58 @@
---
name: output-styles-reference
description: CC output styles — configurable response shape, tone, length, and formatting baselines.
layer: reference
cc_feature: output-styles
source: https://docs.claude.com/en/docs/claude-code/output-styles
concept: output-style-config
last_verified: 2026-04-18
ngram_overlap_score: null
review_status: approved
---
# Output Styles — Reference
Output styles let a user or plugin shape how Claude Code responds:
length defaults, formatting preferences, tone, verbosity. They apply
across the session rather than needing to be re-stated per prompt.
## Where they live
- **Built-in styles** — shipped with the CLI.
- **Custom styles** — directory with a manifest describing the style.
- **Selection** — the user sets an active style via settings or a
`/style` command. The style is injected into Claude's system context.
## What a style can control
- Default response length baseline ("keep responses ≤ 100 words
unless detail is required").
- Formatting rules (markdown vs plain, code-block conventions).
- Tone ("terse", "pedagogical", "adversarial").
- Domain voice (Norwegian for dialog, English for code — a project
convention encoded as a style).
## What a style cannot control
- Tool permissions (that is `allowed-tools` / `settings.json`).
- Hooks (those are harness-level).
- Agent system prompts (those come from agent definitions).
## When to use a custom style
- The project has a persistent communication convention that should
hold across every session (e.g., "never use emojis").
- Multiple users share the project and want consistent output.
- A skill's prompts would otherwise have to restate formatting rules
each time.
## When not to
- For per-task formatting needs — use the prompt instead.
- For rules that must hold against prompt injection — use hooks.
## Practical shape
A minimal custom style is a markdown or plain-text block listing the
conventions. Claude treats it as top-of-system guidance. Keep it
short: long styles crowd out the task.

View file

@ -0,0 +1,65 @@
---
name: plan-mode-reference
description: CC plan mode — read-only planning phase before implementation, with explicit user approval gate.
layer: reference
cc_feature: plan-mode
source: https://docs.claude.com/en/docs/claude-code/plan-mode
concept: planning-before-execution
last_verified: 2026-04-18
ngram_overlap_score: null
review_status: approved
---
# Plan Mode — Reference
Plan mode is a built-in state where Claude operates read-only and
produces an implementation plan instead of executing it. The user
reviews the plan, then either approves it to transition to
implementation or iterates.
## State machine
1. **Plan mode entered** — user triggers it (Shift+Tab twice, `/plan`,
or harness-initiated) or Claude calls `EnterPlanMode`.
2. **Read-only operation** — Claude can read files, search, run
analysis. Writes, edits, and commits are blocked by the harness.
3. **Plan produced** — Claude presents a plan via `ExitPlanMode` or
equivalent.
4. **User reviews** — accepts, rejects, or iterates.
5. **Exit** — on acceptance, mode returns to normal (edits allowed).
## What plan mode guarantees
- No writes during the plan phase. Even if Claude tries, the harness
denies write tools.
- A structured handoff: the plan is a message the user sees before
anything happens.
## What plan mode does not guarantee
- Plan quality. Plan mode is a *gate*, not a *reviewer*. A bad plan
still passes if the user approves it.
- Scope locking. After exit, Claude can do whatever the new prompts
ask — plan mode is a phase, not a contract.
## When to opt into plan mode
- Tasks touching multiple files or modules where the order and file
list matter.
- Refactors where the first wrong edit is expensive to undo.
- Unfamiliar codebases where planning surfaces missing context.
## When to skip plan mode
- Single-file trivial changes.
- Tasks already specified by a detailed plan from another tool
(e.g., an `ultraplan-local` plan.md) — planning twice is waste.
## Relationship to /ultra* planning
- `/ultraplan-local` produces a *plan artifact* that outlives the
session. Plan mode produces an *in-conversation plan* that does not
survive `/clear`.
- They compose: use plan mode to sketch at session start, then run
`/ultraplan-local` to get the durable, reviewable, machine-readable
plan with manifests.

View file

@ -0,0 +1,83 @@
---
name: skills-reference
description: CC skills — auto-invoked domain modules with SKILL.md, frontmatter triggers, and file-hierarchy discovery.
layer: reference
cc_feature: skills
source: https://docs.claude.com/en/docs/claude-code/skills
concept: skills-system
last_verified: 2026-04-18
ngram_overlap_score: null
review_status: approved
---
# Skills — Reference
Claude Code's skill system is a way to package domain knowledge,
workflows, and supporting files so Claude can load them on demand. A
skill is a directory with a `SKILL.md` manifest and any auxiliary
files (scripts, templates, references).
## Anatomy
- **`SKILL.md`** — the entry point. Markdown file with YAML
frontmatter and a body. The frontmatter declares when the skill
triggers; the body is instructions Claude follows after loading.
- **Auxiliary files** — any files in the skill directory. Loaded on
demand (typically by Claude reading them when the body references
them). Common: `scripts/`, `templates/`, `references/`.
## Frontmatter
```yaml
---
name: <skill-name>
description: <one-line trigger hint>
---
```
The `description` is what Claude sees when deciding whether to invoke.
It should describe *when* to use the skill, not *what* it does.
## Invocation
- **Auto-invocation** — Claude loads the skill when the user's prompt
matches the description's triggers. The trigger model is implicit
(natural-language match), not regex.
- **Manual invocation** — the user types `/<skill-name>` and Claude
calls the Skill tool.
- **Never auto-invoked** — if the description explicitly says so
("Internal catalog for X — not invoked directly"), Claude typically
respects the hint.
## Discovery
Skills are discovered from:
- **Built-in skills** — shipped with the CLI.
- **Plugin skills** — packaged inside a plugin's `skills/` directory.
- **User skills**`~/.claude/skills/`.
- **Project skills**`.claude/skills/` inside the repo.
The CLI surfaces available skills via `--allowedTools` and in the
system prompt.
## Progressive disclosure
Skills use progressive disclosure:
1. Claude sees only the name + description at session start.
2. When a skill triggers, the body loads into context.
3. Files referenced by the body load only when Claude reads them.
This keeps the baseline context small; depth is paid for on demand.
## Relationship to other features
- **Skills vs subagents** — a skill lives in the parent's context when
loaded. A subagent runs in its own context. Choose skill when the
context flows through; subagent when it should not.
- **Skills vs hooks** — hooks are deterministic harness-level rules.
Skills are context that Claude interprets. A skill cannot enforce;
only guide.
- **Skills vs MCP** — MCP exposes *tools*. Skills are *knowledge*. An
MCP tool + a skill that explains when to use it is a common pairing.

View file

@ -0,0 +1,103 @@
---
name: subagents-pattern
description: When subagents earn their cost and how to compose them — swarm, pipeline, and guard patterns.
layer: pattern
cc_feature: subagents
source: https://docs.claude.com/en/docs/claude-code/sub-agents
concept: subagent-composition
last_verified: 2026-04-18
ngram_overlap_score: null
review_status: approved
---
# Subagents — Pattern
## When to delegate
Delegation earns its cost when at least one of these holds:
- **Context isolation** — the subtask needs to read 50+ files or run
many greps, and the parent conversation does not need the raw
results. Summaries survive; raw output stays in the subagent.
- **Parallelism** — multiple independent subtasks can run at once,
compressing wall-clock time.
- **Specialization** — the subagent has a tailored system prompt that
changes its behavior meaningfully (e.g., adversarial reviewer).
- **Tool scoping** — the subtask should run with fewer tools than the
parent (principle of least privilege).
If none of these apply, inline the work. A subagent call costs a full
model turn; do not pay it for routine reads.
## Common patterns
### Pattern A: Exploration swarm
Parent launches 4-8 specialized subagents in parallel, each with a
narrow brief (architecture, dependencies, risks, tests, ...). Each
returns a summary. Parent synthesizes.
Used by: `ultraplan-local` Phase 2 exploration.
Cost shape: N × Sonnet call, wall-clock ≈ slowest subagent.
### Pattern B: Adversarial review
Parent writes an artifact (plan, design note). Launches a reviewer
subagent with a system prompt that demands problems, never praise.
Reviewer returns findings. Parent revises.
Used by: `plan-critic`, `scope-guardian`, `architecture-critic`.
Cost shape: 1 × Sonnet call per review pass.
### Pattern C: Background orchestrator
Parent kicks off a long-running orchestrator subagent with
`run_in_background: true`, then continues. Orchestrator runs its own
subagents, synthesizes, writes output to disk. Parent is notified on
completion.
Used by: `planning-orchestrator`, `research-orchestrator`.
Cost shape: 1 × Opus orchestrator + N × Sonnet workers. Overlaps with
other user work.
### Pattern D: Guard subagent
A hook delegates an "is this safe?" question to a subagent when the
answer needs judgment. The subagent returns a verdict; the hook
enforces it.
Cost shape: 1 × Sonnet call per hook invocation. Use sparingly —
adds seconds of latency per tool call.
## Pitfalls
- **Delegate-understanding anti-pattern** — do not write "based on your
findings, fix the bug" to a subagent. The subagent is not inside your
head; it cannot see what you synthesized. Pass concrete context.
- **Prompt-on-top-of-prompt drift** — if the parent's prompt to a
subagent contradicts the subagent's own system prompt, the subagent
follows its system prompt. Do not try to re-style a reviewer into a
cheerleader by prompting harder.
- **Silent failure** — a subagent that returns "done" without evidence
may have done nothing. Trust but verify: check for the concrete
artifacts the subagent was asked to produce.
- **Orchestration explosion** — a three-level-deep subagent tree costs
exponentially. Flatten wherever the inner levels don't need their own
context isolation.
- **Token budget fights** — parent and all active subagents share the
harness's overall budget. Cap subagent output length ("report in
under 200 words") when the summary is what matters.
## Composition with other features
- Subagents + hooks: hooks fire during subagent tool calls too. A
subagent with only `Read` tools is already constrained; hooks add
defense in depth.
- Subagents + worktrees: an `isolation: "worktree"` subagent works in
an isolated copy of the repo, so its writes never clash with the
parent's writes.
- Subagents + background: run heavy exploration in background while the
user continues other work.

View file

@ -0,0 +1,98 @@
---
name: subagents-reference
description: CC subagents — how the Task tool spawns isolated agent instances with scoped tools and context.
layer: reference
cc_feature: subagents
source: https://docs.claude.com/en/docs/claude-code/sub-agents
concept: task-tool-delegation
last_verified: 2026-04-18
ngram_overlap_score: null
review_status: approved
---
# Subagents — Reference
Subagents are fresh Claude instances spawned via the Task tool. Each
receives a task prompt, a tool subset, and no memory of the parent
conversation except what the parent explicitly passes. They return a
single final message to the parent.
## Anatomy
A subagent has:
- **A name** — either a built-in type (`general-purpose`, `Explore`,
`Plan`) or a plugin-defined type (`code-reviewer`, `test-runner`,
...).
- **A system prompt** — defined by the agent type. The parent cannot
override it.
- **A tool set** — subset of the parent's tools, as declared in the
agent definition's frontmatter `tools:` field.
- **A task prompt** — what the parent asks it to do. Self-contained;
the subagent has no access to prior messages.
- **A model** — either explicit in the agent definition (`model: opus |
sonnet | haiku`) or inherited from the parent.
## How to define a subagent
A plugin agent is a markdown file in `agents/` with frontmatter:
```yaml
---
name: code-reviewer
description: <when to invoke, include examples>
model: sonnet
tools: ["Read", "Grep", "Glob"]
---
<system prompt content>
```
The `description` field is how Claude decides when to spawn this
subagent. Include concrete trigger examples.
## How to invoke
Parent calls the `Task` / `Agent` tool with:
- `subagent_type` — the agent's name
- `description` — short label for logs
- `prompt` — the self-contained task
Optional:
- `run_in_background: true` — agent runs in the background; parent
is notified on completion.
- `isolation: "worktree"` — agent runs in a temporary git worktree
(isolated copy of the repo).
## Return protocol
- Foreground agent: parent blocks until the agent returns. Return value
is a single text message.
- Background agent: parent continues. On completion, the harness
injects a notification into the parent's next turn.
## Isolation guarantees
- No conversation history sharing. Each subagent starts cold.
- Separate context window. A subagent can read large files without
polluting the parent's context.
- Separate tool permissions. A subagent with only `Read` tools cannot
write files, even if the parent can.
## Cost and latency
- Every subagent call is a full model call with its own token budget.
- Model choice matters: Sonnet is ~5× cheaper than Opus; Haiku is
cheaper still but cannot be used (per project policy).
- Parallel subagents: the parent can launch multiple in one message;
the harness runs them concurrently. Total latency ≈ slowest agent.
## Failure modes
- Subagent hits context limit → partial or missing return.
- Subagent runs in background and the parent conversation ends → result
may be orphaned.
- Subagent's return message hallucinates file paths → caller must
verify.

View file

@ -0,0 +1,68 @@
---
name: worktrees-reference
description: CC git worktree integration — isolated repo copies per agent for parallel or destructive work.
layer: reference
cc_feature: worktrees
source: https://docs.claude.com/en/docs/claude-code/worktrees
concept: git-worktree-isolation
last_verified: 2026-04-18
ngram_overlap_score: null
review_status: approved
---
# Worktrees — Reference
Git worktrees let a repo check out multiple branches simultaneously in
separate directories. Claude Code integrates with worktrees to give
agents isolated filesystem scopes — one agent's edits cannot clash
with another's.
## How it surfaces in CC
- **`isolation: "worktree"` on a subagent call** — the harness creates
a temporary worktree for the subagent. The subagent runs with its
cwd set to the worktree. Changes stay there until merged or
discarded.
- **`/worktree` skill / command** — interactive tooling for creating,
listing, and merging worktrees during a session.
- **Auto-cleanup** — worktrees created by subagents are removed if the
subagent made no changes. Otherwise the path is returned in the
result for the caller to review.
## Branch semantics
- Each worktree checks out a named branch. Two worktrees cannot check
out the same branch (git's rule).
- Creating a worktree creates a branch if one does not exist.
- Deleting a worktree does not delete the branch; use `git branch -d`
separately if desired.
## Use cases
- **Parallel exploration** — three subagents trying three approaches,
each in its own worktree. The parent compares diffs.
- **Destructive experiments** — upgrade a dependency, run the full
test suite, measure breakage. Discard the worktree if results are
bad.
- **Long-running session without blocking main** — execute a refactor
in a worktree while continuing other work in the main checkout.
## Pitfalls
- **Shared state leaks** — node_modules, .env, build artifacts are
per-worktree but may be symlinked in ways that defeat isolation.
Verify.
- **Disk use** — each worktree is a full checkout of the tree. For
large repos, disk pressure adds up.
- **Branch proliferation** — agents that create worktrees without
cleanup leave orphan branches. Prune periodically.
- **Not a sandbox** — a worktree isolates files, not network or
processes. A subagent in a worktree can still call the outside
world.
## Composition
- Worktrees + background agents: a background agent in a worktree can
work on a long task while the user continues in the main checkout.
- Worktrees + subagents + hooks: hooks fire inside the worktree cwd,
so path-based hooks naturally scope to the isolated work.