# Skill-factory calibration fixtures

These fixtures calibrate the IP-hygiene thresholds used by `scripts/ngram-overlap.mjs`. Each pair (`source-*`, `draft-*`) is hand-tuned so that the n-gram containment verdict lands in a specific band, anchoring the empirical thresholds against representative prose.

## Pairs

| Pair | Target verdict | Containment | Longest run | Notes |
|------|----------------|-------------|-------------|-------|
| `source-accepted.md` ↔ `draft-accepted.md` | **accepted** | 0.014 | 3 | Heavy paraphrase; concept-equivalent without phrase reuse |
| `source-needs-review.md` ↔ `draft-needs-review.md` | **needs-review** | 0.211 | 12 | Mixed: paraphrased frame, retained domain phrasing |
| `source-rejected.md` ↔ `draft-rejected.md` | **rejected** | 0.676 | 74 | Light edit on top of source; verbatim runs survive |

## Verdict bands

The verdict bands match the constants in `scripts/ngram-overlap.mjs`:

- **accepted**: containment < 0.15 AND longestRun < 8
- **needs-review**: everything else, i.e. neither rejected condition holds, but containment ≥ 0.15 OR longestRun ≥ 8
- **rejected**: containment ≥ 0.35 OR longestRun ≥ 15

If you change the thresholds in `ngram-overlap.mjs`, re-verify each fixture pair to confirm the calibration still holds. The fixtures are content-stable; the thresholds are the variable.

## Why these topics

The fixtures use Claude Code reference prose (session-start hooks, subagent delegation, output styles) so that they stay close to the kind of source material the skill-factory will actually paraphrase in production. Drift between fixture domain and production domain would weaken the calibration signal.

## Regeneration

These files are committed to the repo as ground-truth fixtures. Do not regenerate them ad hoc: edit deliberately, re-run the verification commands listed in `plan.md` Step 5, and commit intentionally.

```bash
node scripts/ngram-overlap.mjs tests/fixtures/skill-factory/draft-accepted.md \
  tests/fixtures/skill-factory/source-accepted.md
```
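
When re-verifying a fixture pair by hand, it can help to see the verdict bands as a small classifier. The following is only an illustrative sketch and not the code in `scripts/ngram-overlap.mjs`: the function name `classifyVerdict` and the constant names are hypothetical, and only the threshold values and the per-pair metrics are taken from this document.

```javascript
// Hypothetical constant names; the values mirror the bands documented above.
const ACCEPT_MAX_CONTAINMENT = 0.15; // accepted requires containment < 0.15
const ACCEPT_MAX_RUN = 8;            // accepted requires longestRun < 8
const REJECT_MIN_CONTAINMENT = 0.35; // rejected if containment >= 0.35
const REJECT_MIN_RUN = 15;           // rejected if longestRun >= 15

// Map a (containment, longestRun) measurement onto one of the three bands.
function classifyVerdict(containment, longestRun) {
  // Either rejected condition alone is enough (OR).
  if (containment >= REJECT_MIN_CONTAINMENT || longestRun >= REJECT_MIN_RUN) {
    return "rejected";
  }
  // Both accepted conditions must hold (AND).
  if (containment < ACCEPT_MAX_CONTAINMENT && longestRun < ACCEPT_MAX_RUN) {
    return "accepted";
  }
  // Everything in between falls through to manual review.
  return "needs-review";
}

// Metrics copied from the fixture table in this README.
const fixtures = [
  { pair: "accepted", containment: 0.014, longestRun: 3 },
  { pair: "needs-review", containment: 0.211, longestRun: 12 },
  { pair: "rejected", containment: 0.676, longestRun: 74 },
];

for (const f of fixtures) {
  const verdict = classifyVerdict(f.containment, f.longestRun);
  console.log(`${f.pair}: ${verdict}`);
}
// prints:
// accepted: accepted
// needs-review: needs-review
// rejected: rejected
```

Checking the rejected band first makes the OR condition take precedence, so a draft with low containment but one long verbatim run (for example containment 0.10 with a longest run of 15) is still rejected.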