test(ultraplan-local): add skill-factory calibration fixtures

3 source/draft pairs that pin n-gram-overlap verdicts to representative prose:
- accepted (containment 0.014 / longestRun 3)
- needs-review (containment 0.211 / longestRun 12)
- rejected (containment 0.676 / longestRun 74)

Topics: session-start hooks, subagent delegation, output styles — proximate to
the production source material the skill-factory will ingest.

Plan: .claude/projects/2026-04-18-skill-factory-fase-1-mvp/plan.md (step 5)
This commit is contained in:
Kjell Tore Guttormsen 2026-04-18 15:16:28 +02:00
commit 486f544d39
7 changed files with 280 additions and 0 deletions

View file

@ -0,0 +1,55 @@
---
name: session-startup-hook-pattern
description: Pattern for warming up agent context at conversation boot
layer: pattern
cc_feature: hooks
source: ./source-accepted.md
concept: warm-start briefing via boot hook
last_verified: 2026-04-18
ngram_overlap_score: null
review_status: pending
---
## Use this when
You want each new conversation to begin with the agent already aware of project status — pending tasks, recent commits, key file paths — instead of burning early turns on orientation.
## Shape
Bind one script to the boot lifecycle event. Inside, gather whatever situational data justifies opening a fresh chat with: git log summary, todo file digest, branch identity, env snapshot. Format as terse markdown headings. Emit on stdout. The runtime grafts whatever you print into the model's opening context window.
## Forces
Cold turns cost real seconds. Anything you compute inside this script delays the first user prompt visibly. Therefore: prefer cached lookups, avoid synchronous calls to remote services, and timebox expensive operations. If a piece of data takes longer than roughly 200 ms to fetch, demote it from the briefing or front-load a cache refresh asynchronously elsewhere.
Multiple installation layers can each register their own boot script. They run in this order: marketplace-supplied first, your personal home-level second, and project-rooted third. Outputs concatenate without merging. Two scripts both reporting the branch name will both show up. There is no built-in dedupe.
Crash semantics are partial-write friendly: anything you streamed to stdout before exiting non-zero still reaches the agent. The fix is to assemble the full briefing in a buffer and only flush on the success path. That way a midpoint failure leaves an empty intro rather than a torn one.
## Gotchas
- No native simulator. Test by piping a hand-crafted payload directly into the script and reading the stdout it returns. Keep one canonical payload checked in.
- No selective firing — boot scripts cannot be scoped to subdirectories or file patterns. Workaround: branch inside the script body and short-circuit early when the heuristic does not match.
- Privileges are unfiltered: these scripts run with full operator credentials before any agent sandbox spins up. Audit third-party scripts before enabling them, and pin their versions to defeat silent updates.
- Telemetry is your problem: the runtime does not emit structured execution logs. If you want metrics, append to a local jsonl yourself from inside the script.
## Anti-patterns
- Doing remote network calls on the hot path without caching.
- Skipping output buffering and streaming partial briefings.
- Allowing two layers to print conflicting information without coordinating.
- Trusting installed marketplace scripts implicitly.
## Decision quick-check
Pick this approach when:
- Pending state across sessions is non-trivial enough to merit the boot delay.
- You can keep the script under a 250 ms wall-clock budget.
- The information surfaced changes meaningfully between sessions.
Skip it when:
- Project state rarely shifts; static instructions in CLAUDE.md serve the same purpose at zero startup cost.
- The data you would print is sensitive and could leak into copies of the conversation transcript.
- You are still iterating on what should appear; build the briefing as a plain script the operator runs manually before nailing it down as a boot binding.