ktg-plugin-marketplace/plugins/linkedin-studio/agents/fixtures/content-reviewer-cases.md
Kjell Tore Guttormsen e69ea1f4c9 feat(linkedin-studio): v3.1.0 — Endring 9 adversarial review-pakke + per-artefakt personas
Cold, adversarial review package for the long-form pipeline + configurable
per-edition personas. Motivated by Del 4 (Security Champions pivot): the
in-session editor + persona sweep shared the drafting session's framing-bias,
so the shipped version was never independently re-reviewed.

Headless package (9a/9b):
- New Step 6.5 (headless-review) in /linkedin:newsletter, after the persona
  sweep, before lock — the independence layer the in-session gates can't be.
- New standalone /linkedin:headless-review command (run in a fresh session for
  maximum isolation; reconstructs frozen draft + contract + personas from disk).
- 3 new Opus archetypes, each with a cardinal context-isolation block that
  refuses drafting-session framing as "context pollution":
  - content-reviewer (argument integrity C1–C5, ≤8 flags)
  - language-reviewer (Norwegian language L1–L5, ≤10 flags)
  - fact-reviewer (cold re-verification F1–F4, risk-sort + pivot-risk, WebSearch)
- Deliberate redundancy with fact-checker / editorial-reviewer documented so
  the pairs are never de-duplicated.

Pivot-reopen (9c):
- New /linkedin:pivot command: logs articles.NN.pivots[], resets currentPhase,
  un-locks, marks gates to re-run.
- Pivot-detection gate in Step 8 lock precondition (>20% word-count change or
  >2 new sections re-opens cleared gates). Del 4 v8→v11 worked example.

Per-artifact personas (new requirement):
- articles.NN.personas with resolution order (edition-state → series file →
  plugin library → interactive). One or more readers configurable per edition.

Schema/docs:
- edition-state.template.json: additive personas[], pivots[], headlessReview,
  headless-review phase (16 phases); personaSweep.resonance.wordCount baseline.
- 3 fasit fixtures + 3 structural lint tests (Del 4 worked cases).
- Counts: 24→26 commands, 16→19 agents, 15→16 newsletter phases.
- README + CLAUDE.md (plugin + root) + CHANGELOG synced.

Verification: 35 agent-fixture + 59 hook + 20 render tests green. Backward-
compatible (additive state); reload required before the 3 new agents resolve.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-29 13:01:24 +02:00

210 lines
12 KiB
Markdown
Raw Permalink Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# Content-Reviewer Fasit Fixture
The Del 4 production round (Security Champions, Maskinrommet, 2026-05-29) as the
gold standard for the `content-reviewer` agent. Late in the round the draft took
a **Security Champions pivot**: a new ~260-word section introducing the Champions
model and a new ~270-word role-description section were added after the
in-session gates had already formed their reading. The in-session gates
(fact-check Step 5, editorial Step 5.5, persona sweep Step 6) all read the draft
through the drafting session's framing — they knew *why* the pivot happened and
*what* it was meant to argue, so they silently supplied the missing argumentative
steps for free. A **cold, adversarial reviewer** — handed only the frozen page —
cannot supply them, and that is exactly the point: the cold read catches the
argument holes the framing hid.
The six cases below are the fasit: a correct `content-reviewer` run on the frozen
Del 4 draft should surface **comparable flags**, mapped to the one axis with
consistent severities.
This file is a *fasit*, not a test harness. The structural lint lives in
`agents/__tests__/content-reviewer-fixture.test.mjs`. Whether the agent's live
flags actually reproduce these directions is `[GATE]`/`[OPERATØR]` — it is **not**
self-certified here.
> **The jury judges; the writer writes.** Every expected output below is
> **direction, not rewritten copy.** A correct agent run hands back flags + a
> severity — never edited prose. (`Foreslått retning`, not a new sentence.)
> **Why this gate exists.** The in-session gates shared the drafting session's
> framing-bias: they read the pivot knowing its intent, so they bridged the
> argument's gaps without noticing the gaps were there. A **cold reader** — run in
> an isolated context with no history, no changelog, no "out of scope" list, no
> pivot narrative — reads the frozen page as a first-time skeptic and finds the
> argument holes the framing hid. Any such framing that reaches the agent is
> **context pollution**: it is named and ignored, never acted on. This is a
> distinct failure surface from craft (editorial), language (language-reviewer),
> truth (fact-reviewer), and response (persona) — those gates can all pass while
> the argument itself does not hold.
---
## The axis (the agent judges on exactly this)
The agent judges on **one axis — argument-integritet** (argument & logical
integrity): *does the reasoning hold?* It does **not** judge craft, language,
factual truth, or reader response — those are `editorial-reviewer`,
`language-reviewer`, `fact-reviewer`, and `persona-reviewer` respectively. The
axis decomposes into exactly five checks:
- **C1 — logiske hull** (logical holes): a step in the argument chain is missing;
the text jumps from A to C without B, and the reader cannot reconstruct why the
conclusion follows.
- **C2 — ubegrunnede antakelser** (unsupported assumptions): the argument leans on
a premise it never establishes, asserted as self-evident when a thoughtful
reader would not simply grant it.
- **C3 — argument-motsigelser** (argument-level contradiction): the recommendation,
premise, and payoff are not mutually consistent — distinct from editorial-
reviewer's P4 (a *prose-level* contradiction between two passages); C3 is a
contradiction in the *logic of the argument itself*.
- **C4 — manglende konkretisering der argumentet trenger det** (missing
argumentative concretization): a load-bearing claim a skeptic would only believe
with a concrete instance stays abstract — not for vividness (editorial A1) but
because the argument *needs* the instance to carry weight.
- **C5 — ubesvart «what about X?»** (the unanswered obvious objection): the
strongest obvious objection a thoughtful reader raises is never acknowledged or
answered — the argument wins only because it never met its best counter.
## Severity (every flag carries exactly one)
- **BLOCK** — a defect that breaks the argument: an argument-level contradiction
(C3), or an unanswered objection (C5) that, once raised, collapses the
recommendation.
- **REWORK** — a real gap that should be filled, not load-bearing-fatal: a logical
hole (C1), an unsupported load-bearing assumption (C2), a claim that needs
concretization (C4).
- **NICE** — a minor reasoning soft spot worth tightening if cheap.
Sort BLOCK → REWORK → NICE; cap at **eight** flags (argument defects are coarser
than prose nits); if any are suppressed, say how many and of what severity —
never silently truncate.
---
## The six Del 4 argument points (fasit)
Each case states the argument defect a cold read would catch on the frozen Del 4
draft, the check (C1C5) it belongs to, the expected severity, and the direction a
correct agent run returns. Every case is an **argument blind spot** — distinct
from craft (what `editorial-reviewer` would catch) and response (what
`persona-reviewer` would catch). The in-session gates passed the draft; the cold
read does not, because the framing they shared is gone.
### Case 1 — pivot-premisset asserted uten støtte (unsupported pivot premise)
- **Axis:** argument-integritet · **Check:** C2 · **Severity:** BLOCK
- **Cold-read defect:** The new ~260-word Security Champions section opens by
treating "Security Champions er svaret" as an established premise the rest of
the part builds on — but the frozen page never establishes *why* the Champions
model is the right response rather than one option among several. The drafting
session knew the rationale; the cold reader is handed only the assertion.
- **Fasit / direction:** The pivot's load-bearing premise is asserted as
self-evident with no support a first-time skeptic would grant. Direction:
establish why the Champions model follows from the part's problem, or hedge it
as one option — do not let the whole section rest on an un-earned premise. (An
unsupported *load-bearing* premise that the section depends on is BLOCK: the
argument has not earned the right to make its central move.)
### Case 2 — ubesvart «hva med små organisasjoner?» (unanswered obvious objection)
- **Axis:** argument-integritet · **Check:** C5 · **Severity:** BLOCK
- **Cold-read defect:** The strongest obvious objection a thoughtful reader raises
on first reading the Champions pivot — *"what about small organisations that
cannot staff a dedicated Champion?"* — is never acknowledged or answered. The
recommendation effectively assumes an org large enough to nominate a Champion,
and the argument wins only because it never meets this counter.
- **Fasit / direction:** Name the objection and answer it (a small-org variant, an
explicit scope boundary, or a rule of thumb) — an unanswered objection that, once
raised, collapses the recommendation for a whole class of readers is BLOCK.
Direction only; the agent does not write the answer.
### Case 3 — sprang fra «Champions finnes» til «dømmekraft bevart» (logical hole)
- **Axis:** argument-integritet · **Check:** C1 · **Severity:** REWORK
- **Cold-read defect:** The text jumps from *"Security Champions finnes i
organisasjonen"* (A) to *"dermed er dømmekraften bevart"* (C) with no connecting
step (B): existence of a role does not on its own establish that judgment is
preserved. The reader cannot reconstruct why the conclusion follows. The session
carried the missing step in its head; the page does not state it.
- **Fasit / direction:** Supply the missing step — *how* the Champion's presence
translates into preserved judgment (mechanism, mandate, practice) — or soften the
conclusion to a hypothesis. A bridgeable-but-unbridged jump on a supporting line
is REWORK.
### Case 4 — rolle-seksjonen aldri forankret i én konkret org (missing concretization)
- **Axis:** argument-integritet · **Check:** C4 · **Severity:** REWORK
- **Cold-read defect:** The new ~270-word role-description section describes what a
Champion *does* entirely in the abstract and never grounds it in **one concrete
organisation** where this role actually operates. This is not a vividness nit
(that would be editorial A1) — the *argument* that the role works needs one real
instance to be believed; a skeptic will not grant an abstract job description as
evidence the model functions.
- **Fasit / direction:** Anchor the role in a single concrete (preferably
Norwegian) org where a Champion operates, so the load-bearing claim "this role
works" carries weight. Flag the *absence of the argument-bearing instance*; do
not supply the org. (Boundary: route any pure craft/vividness face to editorial
A1; this flag is the argument face — the claim cannot be believed abstractly.)
### Case 5 — anbefaling delegerer den dømmekraften serien sier ikke kan settes ut (argument contradiction)
- **Axis:** argument-integritet · **Check:** C3 · **Severity:** BLOCK
- **Cold-read defect:** The series premise is *"du kan ikke sette ut dømmekraft"*
(you cannot outsource judgment). The Champions recommendation, read cold on the
frozen page, effectively **delegates that judgment** to the Champion — the close
recommends the very move the premise rules out. Premise, recommendation, and
payoff are not mutually consistent. This is an argument-level contradiction (C3),
not a prose-level one between two passages (that would be editorial P4): the
*logic* defeats itself.
- **Fasit / direction:** Hold premise, recommendation, and gevinst side by side and
resolve one side — either reframe the Champion as *supporting* judgment that
stays distributed (not a delegate it is outsourced to), or qualify the series
premise. A recommendation that defeats the series premise is BLOCK.
### Case 6 — gevinst-leddet antar utbredt modenhet (unsupported assumption)
- **Axis:** argument-integritet · **Check:** C2 · **Severity:** REWORK
- **Cold-read defect:** The promised payoff of the Champions model leans on an
unstated assumption that the surrounding organisation is mature enough to use a
Champion well (clear mandate, time allocation, leadership backing). The frozen
page asserts the gevinst as if it follows automatically; the cold reader sees an
un-earned premise standing between the model and its benefit.
- **Fasit / direction:** Establish or hedge the maturity assumption the payoff
depends on — name the conditions under which the gevinst holds, or mark it
conditional. A load-bearing assumption left unstated under the payoff is REWORK
(it weakens the case rather than defeating it outright).
---
## Expected aggregate (what a correct run looks like)
- **Total flags:** 6 (well within the ≤8 cap — no suppression needed).
- **By check:** C1 = 1 (Case 3) · C2 = 2 (Cases 1, 6) · C3 = 1 (Case 5) ·
C4 = 1 (Case 4) · C5 = 1 (Case 2).
- **By severity:** BLOCK = 3 (Cases 1, 2, 5 — unsupported pivot premise,
unanswered small-org objection, premise/recommendation contradiction) ·
REWORK = 3 (Cases 3, 4, 6) · NICE = 0.
- **All six are argument blind spots:** none is a craft defect (`editorial-
reviewer`'s domain), a language defect (`language-reviewer`), a factual error
(`fact-reviewer`), or a resonance miss (`persona-reviewer`). The in-session
gates passed the draft on every one of those axes — and still the argument did
not hold, because they read it through the session's framing. The cold read is
the quantified case for the gate.
A run that reproduces ~these six directions, on the one argument-integritet axis,
with ~these severities, is **comparable** to the cold adversarial read the gate is
built to deliver. Exact wording is the editor's; the agent returns
**direction, not rewritten copy.**
## Calibration boundary
Whether the agent's live flags truly match this fasit is judged by the operator
(`[OPERATØR]`), not self-certified here. This fixture is the calibration target,
the same way `editorial-reviewer-cases.md`, `persona-reviewer-cases.md`, and
`fact-checker-cases.md` are fasits for their agents.
> **Live-run note.** A live run on the frozen Del 4 draft requires (a) a Claude
> Code session reload — a freshly added agent is not invokable until the plugin
> agent set is rebuilt at session start — and (b) a genuinely **cold** invocation
> (an isolated context with no drafting-session history, changelog, scope list, or
> pivot narrative reaching the agent). Until both hold, this fixture is the
> gold-standard of record.