From 0d3da7828d4e997278af9b72e9e7c096d05556ac Mon Sep 17 00:00:00 2001
From: Kjell Tore Guttormsen <hello@fromaitochitta.com>
Date: Sat, 30 May 2026 07:17:55 +0200
Subject: [PATCH] docs(linkedin-studio): measure long-form review-pass overlap,
 trim where unjustified
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Steg 20 (remediation Wave 4 / S5, SOLO): measure whether the 7-agent long-form
review stack carries redundant gates. Method: cross-reference each agent's check
taxonomy against its in-repo fasit fixture; four fixtures (editorial, content,
language, fact-reviewer) target the SAME Del 4 edition, enabling a real
cross-gate overlap comparison on one piece (not a live run — fixtures' own
live-run notes require a reload + cross-repo Maskinrommet access, out of scope).

Finding: every gate has >=1 unique catch on Del 4. The four genuine overlaps
(verbatim repetition, the Vi/Vi-i-Nav quote, the postulated number, the
small-orgs thread) are each justified — a cold re-take (Endring 9's reason to
exist), the same symptom via a different operation (flag-absence vs web-verify),
or two distinct defects sharing a surface topic — with no subsumption either way.
The fact-checker <-> fact-reviewer overlap is load-bearing (the pivot premise
arrived after Step 5, so only the cold re-run caught it).

Decision: NO TRIM. voice-scrubber has no fixture -> inconclusive; redundancy
retained (Step 20 On-failure = skip). Counts unchanged 19 agents / 27 commands;
count contract (EXPECT_AGENTS=19) untouched. test-runner 62/62 green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
---
 .../docs/remediation/overlap-measurement.md   | 217 ++++++++++++++++++
 1 file changed, 217 insertions(+)
 create mode 100644 plugins/linkedin-studio/docs/remediation/overlap-measurement.md

diff --git a/plugins/linkedin-studio/docs/remediation/overlap-measurement.md b/plugins/linkedin-studio/docs/remediation/overlap-measurement.md
new file mode 100644
index 0000000..3208c12
--- /dev/null
+++ b/plugins/linkedin-studio/docs/remediation/overlap-measurement.md
@@ -0,0 +1,217 @@
+# Long-form Review-Pass Overlap Measurement — Steg 20
+
+_Remediation Voyage, Wave 4 / S5. Measures whether the long-form review stack
+carries redundant gates, and trims **only** where a gate catches nothing the
+others don't. Written 2026-05-30, SOLO (no subagent fan-out)._
+
+## The question and the trim rule
+
+The long-form pipeline runs **seven** review agents. Endring 9 (v3.1.0) added a
+cold/headless package (`content-reviewer`, `language-reviewer`, `fact-reviewer`)
+whose agent prompts argue, in their own words, that they overlap the in-session
+gates **on purpose** (`fact-reviewer`: «the redundancy is load-bearing, not
+waste»; `language-reviewer` anti-pattern: «'De-duplicate' yourself against
+editorial-reviewer — the overlap is the cold re-take»). Steg 20 tests that claim
+against evidence instead of taking it on faith:
+
+> **Trim a gate ONLY where it catches nothing the others don't** (then merge/remove
+> it + update the count contract). **If the redundancy is justified, record that
+> and keep it. If the fixture is insufficient to decide, record «inconclusive;
+> redundancy retained» and do NOT trim.** (Step 20 On-failure = skip the trim.)
+
+## Method — and its honest limit
+
+I measured the **documented catch-sets**: each agent's check taxonomy (the agent
+`.md`) cross-referenced against its in-repo **fasit (answer-key) fixture**
+(`agents/fixtures/*-cases.md`). I did **not** run the agents live: every fixture's
+own *live-run note* states a live cold run needs (a) a session reload and (b)
+read access to the frozen Del 4 draft in the **Maskinrommet series folder** —
+cross-repo, explicitly out of scope this session. By each fixture's own
+declaration the fasit is «the gold-standard of record» until both hold, so the
+fasit catch-sets are the legitimate measurement surface.
+
+**The lucky break that makes this more than taxonomy-reasoning:** four of the six
+fixtures target the **same edition** — Del 4 (Security Champions, Maskinrommet).
+`editorial-reviewer` reviewed v5 (2026-05-28, in-session); the cold trio
+(`content`/`language`/`fact-reviewer`) re-read the **frozen/pivoted** version
+(2026-05-29). That shared edition lets me compare what each gate *actually caught
+on one piece* — a real cross-gate overlap measurement, not just a boundary
+restatement.
+
+| Fixture | Edition under review | Cases | Enables shared-edition compare? |
+|---------|---------------------|-------|----------------------------------|
+| `editorial-reviewer-cases.md` | **Del 4 v5** (2026-05-28, in-session) | 8 | ✅ yes |
+| `content-reviewer-cases.md` | **Del 4 frozen/pivoted** (2026-05-29, cold) | 6 | ✅ yes |
+| `language-reviewer-cases.md` | **Del 4 frozen** (2026-05-29, cold) | 6 | ✅ yes |
+| `fact-reviewer-cases.md` | **Del 4 frozen/pivoted** (2026-05-29, cold) | 6 | ✅ yes |
+| `persona-reviewer-cases.md` | separate jargon-wall sample (+ documented Del 4 behaviour) | 6 axes | partial |
+| `fact-checker-cases.md` | 3 generic reference claims (not Del 4) | 3 | role only |
+| `voice-scrubber` | **NO FIXTURE** | — | ❌ inconclusive |
+
+## The seven agents — axis map
+
+| Agent | Step | Axis (the one question it answers) | When | Fixture |
+|-------|------|-------------------------------------|------|---------|
+| `fact-checker` | 5 | factual truth — *is it true?* | in-session, **moving draft** | generic (3 claims) |
+| `editorial-reviewer` | 5.5 | prose craft + narrative architecture — *is it well-made?* | in-session | Del 4 v5 |
+| `persona-reviewer` | 2.5/6/9 | reader response — *does it land?* | in-session | sample + Del 4 behaviour |
+| `voice-scrubber` | 4 | de-AI + chronicle voice drift — *does it sound like the author?* | in-session (**applies** edits) | none |
+| `content-reviewer` | 6.5 | argument integrity — *does the reasoning hold?* | **cold/frozen** | Del 4 frozen |
+| `language-reviewer` | 6.5 | Norwegian language — *does it read clean?* | **cold/frozen** | Del 4 frozen |
+| `fact-reviewer` | 6.5 | factual truth, re-verified — *is every claim, incl. pivot, true?* | **cold/frozen+pivoted** | Del 4 frozen |
+
+## Per-reviewer catch table (what each gate caught on the fixtures)
+
+Legend: **U** = unique catch (no other gate's fixture surfaces this defect) ·
+**O** = overlaps another gate's catch (analysed in the matrix below); **(adjacent)** = same surface topic, different defect.
+
+### `editorial-reviewer` — Del 4 v5 (8 catches)
+
+| # | Check | Defect caught | Sev | U/O |
+|---|-------|---------------|-----|-----|
+| 1 | A1 | abstract figure never instantiated (craft/vividness) | REWORK | O → content C4 (adjacent) |
+| 2 | P3 | postulated number, no source/hedge — *flags absence, no search* | REWORK | O → fact-reviewer F3 |
+| 3 | A2 | trust-effect hypothesis with no SDT/theory anchor | BLOCK | **U** |
+| 4 | A3 | broken series-title symmetry (part floats free) | REWORK | **U** |
+| 5 | A4 | small-business addressee stranded — no usable action | BLOCK | O → content C5 (adjacent) |
+| 6 | P2 | verbatim repetition | REWORK | O → language L1 |
+| 7 | P1 | em-dash over-density | REWORK | **U** |
+| 8 | P4 | prose-level internal contradiction (two passages) | BLOCK | O → content C3 (adjacent) |
+
+### `content-reviewer` — Del 4 frozen (6 catches) — argument integrity
+
+| # | Check | Defect caught | Sev | U/O |
+|---|-------|---------------|-----|-----|
+| 1 | C2 | Security-Champions **pivot premise** asserted unsupported | BLOCK | **U** |
+| 2 | C5 | unanswered «what about small orgs?» objection | BLOCK | O → editorial A4 (adjacent) |
+| 3 | C1 | logical hole «Champions finnes» → «dømmekraft bevart» | REWORK | **U** |
+| 4 | C4 | role section needs **one concrete org** for the argument | REWORK | O → editorial A1 (adjacent) |
+| 5 | C3 | recommendation **delegates the judgment** the series premise rules out | BLOCK | **U** |
+| 6 | C2 | claimed benefit (gevinst) assumes widespread org maturity | REWORK | **U** |
+
+### `language-reviewer` — Del 4 frozen (6 catches) — Norwegian language quality
+
+| # | Check | Defect caught | Sev | U/O |
+|---|-------|---------------|-----|-----|
+| 1 | L4 | quote error «Vi» vs «Vi i Nav» (wording misrepresents source) | BLOCK | O → fact-reviewer F2 |
+| 2 | L2 | anglicism «adressere problemet» | REWORK | **U** |
+| 3 | L2 | anglicism «på en daglig basis» | REWORK | **U** |
+| 4 | L1 | verbatim repetition 3× across §1/§4/§6 | REWORK | O → editorial P2 |
+| 5 | L3 | «det vises til»: officialese (kanselli-stil) in a personal chronicle | REWORK | **U** |
+| 6 | L5 | monotone cadence (5 same-length sentences) | NICE | **U** |
+
+### `fact-reviewer` — Del 4 frozen/pivoted (6 catches) — factual correctness (cold)
+
+| # | Check | Defect caught | Verdict | U/O |
+|---|-------|---------------|---------|-----|
+| 1 | F1 | **pivot premise never met Step 5** (PIVOT-RISK headline) | 🔴 | **U** |
+| 2 | F1+F2 | misattribution to wrong originator | 🔴 | **U** |
+| 3 | F2 | quote precision «Vi» vs «Vi i Nav» (vs source) | 🟡 | O → language L4 |
+| 4 | F3 | postulated number, no provenance — *searches, finds none* | 🟡 | O → editorial P3 |
+| 5 | F1 | «Security Champions» as a settled standard that **varies per org** (PIVOT-RISK) | 🔴 | **U** |
+| 6 | F4+F3 | secondary source for a precise figure («~a third» ≠ «37 %») | 🟡 | **U** |
+
+### `fact-checker` — role on Del 4 (generic fixture, 3 claims)
+
+Catches truth defects **cheaply and early, on the moving draft** (Step 5). Its
+fixture is 3 generic ground-truth claims (EU AI Act 🟢 / GPT-4-by-Anthropic 🔴 /
+unverifiable 37 % 🟡), not Del 4. Its measured **role** on Del 4 is documented by
+the `fact-reviewer` fixture: the Security-Champions pivot arrived **after** the
+Step 5 sweep, so `fact-checker` structurally **never saw** the pivot premise. It
+is necessary (early/cheap truth gate) but **provably insufficient** — which is the
+entire reason `fact-reviewer` exists. **U** by pipeline position.
+
+### `persona-reviewer` — resonance/response
+
+On Del 4 the persona sweep raised **15 flags across 3 personas, and every
+persona returned PASS / ready-to-publish** (per the editorial fixture). Its own fixture
+(jargon-wall sample) exercises the 6 response axes (Krok IKKE, Leder-takeaway IKKE,
+…). It catches **reader-response** defects no other gate measures. **U** by axis.
+
+### `voice-scrubber` — de-AI + chronicle voice drift
+
+**No fixture exists.** Its axis (mechanical AI-tells + Norwegian-chronicle voice
+drift, judged against approved Norwegian editions) is measured by no other gate,
+and uniquely it **applies** edits (Pass 1) and maintains a drift-log — it is not
+even part of the review-report package. Overlap **inconclusive from in-repo
+fixtures**; see decision below.
+
+## Cross-gate overlap matrix (the shared Del 4 edition)
+
+Four genuine overlaps surface on Del 4. The decisive test for each: **does either
+gate's catch-set subsume the other's?** In every case — **no**.
+
+| # | Defect | Gates that catch it | Same defect or same symptom? | Subsumption? | Justification |
+|---|--------|---------------------|------------------------------|--------------|---------------|
+| O1 | verbatim repetition | editorial **P2** (in-session, v5) ↔ language **L1** (cold, frozen) | same defect | **neither** | **Cold re-take.** Editorial caught it in-session sharing the author's framing; language re-caught it cold on the frozen version. The agent prompts mandate this overlap explicitly. The value is the independent reading, not a second checklist. |
+| O2 | quote «Vi» vs «Vi i Nav» | language **L4** (BLOCK) ↔ fact-reviewer **F2** (🟡) | same defect, **two operations** | **neither** | language flags the *wording* misrepresenting the source **without web access**; fact-reviewer *verifies against the actual source via web search*. Different tools, different severities — one catches it if the source is unreachable, the other if the wording reads clean but the source differs. |
+| O3 | postulated number | editorial **P3** (REWORK) ↔ fact-reviewer **F3** (🟡) | same symptom, **two operations** | **neither** | editorial flags the **absence** of a source/hedge (no search); fact-reviewer **searches for provenance and finds none**. The prompts draw this boundary by hand. A bare number with a *findable* source is still flagged by editorial (it has none inline), and resolving it is exactly what fact-reviewer's search is for. |
+| O4 | small-orgs thread | editorial **A4** (stranded addressee) ↔ content **C5** (unanswered objection) | **adjacent — different defects** | n/a | Same surface topic (small orgs) decomposes into two genuinely different defects: A4 = «the small-business reader leaves with no *action*» (architecture); C5 = «the *argument* never meets the obvious counter and collapses for that class» (logic). Not redundancy — two gates needed to see both faces. |
+
+Plus the **fact-checker ↔ fact-reviewer time-axis overlap** (deliberate, not in
+the matrix because it spans pipeline stages, not one defect): Step 5 runs
+in-session on the **moving** draft; Step 6.5 re-runs cold on the **frozen/pivoted**
+draft. **Case 1 (pivot premise) is the proof it's load-bearing** — the pivot
+arrived after Step 5, so only the cold re-run could catch it. Collapsing the two
+would re-open the exact gap that motivated Endring 9.
+
+Adjacent (not overlap) pairs the prompts separate by design and the Del 4 cases
+confirm as distinct defects: editorial **P4** (prose contradiction) vs content
+**C3** (argument-logic contradiction); editorial **A1** (vividness) vs content
+**C4** (a load-bearing claim a skeptic won't *believe* abstractly).
+
+## Unique catch per gate — none is a subset of another
+
+Each of the seven covers something no other gate's fixture surfaces, by **defect, axis, or pipeline position**:
+
+- **fact-checker** — early/cheap truth on the moving draft; provably *insufficient*
+  alone (never saw the pivot), which is the case for keeping `fact-reviewer`.
+- **editorial-reviewer** — **A2 theory-anchor** and **A3 series-title symmetry**
+  are checks no other gate performs (the persona sweep missed both on Del 4).
+- **persona-reviewer** — reader response (Krok/resonans/takeaway); the only gate on
+  that axis. The «PASS yet 8 editorial + 6 argument + 6 language points» result is
+  the whole motivation for the stack.
+- **content-reviewer** — argument logic (C1/C2/C3/C5 all unique); the only gate that
+  asks *does the reasoning hold?*
+- **language-reviewer** — anglicisms, kanselli-stil, cadence; the only gate on
+  Norwegian idiom/register/rhythm.
+- **fact-reviewer** — the **pivot-risk** catches (Cases 1, 5); the only cold
+  post-pivot truth re-run.
+- **voice-scrubber** — de-AI tells + chronicle voice drift; the only gate that
+  *applies* edits and keeps a drift-log.
+
+## Trim decision — NO TRIM
+
+**No measurable gate meets the trim condition.** Every gate with a fixture has ≥1 unique
+catch, and each of the four genuine overlaps (O1–O4) is justified: a cold
+re-take (O1), the same defect or symptom via a different operation (O2, O3), or two distinct
+defects sharing a surface topic (O4), with **no subsumption in any direction**.
+The fact-checker ↔ fact-reviewer overlap is load-bearing by construction (proven by
+the pivot-premise catch). Per the Steg 20 rule this is the **«redundancy is
+justified — record and keep»** case for all measurable gates.
+
+**`voice-scrubber` specifically:** no in-repo fixture, so its overlap cannot be
+*measured* here → **«measurement inconclusive; redundancy retained pending a real
+edition»** (Step 20 On-failure = skip the trim). Its axis is orthogonal by design
+and it is not part of the review-report package, so there is no redundancy claim to
+adjudicate even in principle.
+
+**Consequence for the count contract:** **no gate removed → counts unchanged.**
+
+| Count | Value | Touched? |
+|-------|-------|----------|
+| Agents | **19** | no |
+| Commands | **27** | no |
+
+The count contract (`EXPECT_AGENTS=19`, the CLAUDE.md/README agent tables) is **not
+modified** this step — there is nothing to update because nothing was trimmed.
+Steg 21 (version bump + count recompute) inherits an unchanged 19/27 baseline.
+
+## Verification
+
+- `test -f docs/remediation/overlap-measurement.md` → present (this file).
+- Per-reviewer **catch** table present (one per gate) + cross-gate overlap matrix.
+- No gate removed → count contract untouched; `EXPECT_AGENTS` stays 19. (The trim
+  branch's `test-runner.sh exit 0 + same-commit count update` is N/A — no trim.)
+- `bash scripts/test-runner.sh` run for hygiene regardless → expect exit 0 (repo
+  green, nothing changed but a doc).