Cold, adversarial review package for the long-form pipeline + configurable per-edition personas. Motivated by Del 4 (Security Champions pivot): the in-session editor + persona sweep shared the drafting session's framing-bias, so the shipped version was never independently re-reviewed. Headless package (9a/9b): - New Step 6.5 (headless-review) in /linkedin:newsletter, after the persona sweep, before lock — the independence layer the in-session gates can't be. - New standalone /linkedin:headless-review command (run in a fresh session for maximum isolation; reconstructs frozen draft + contract + personas from disk). - 3 new Opus archetypes, each with a cardinal context-isolation block that refuses drafting-session framing as "context pollution": - content-reviewer (argument integrity C1–C5, ≤8 flags) - language-reviewer (Norwegian language L1–L5, ≤10 flags) - fact-reviewer (cold re-verification F1–F4, risk-sort + pivot-risk, WebSearch) - Deliberate redundancy with fact-checker / editorial-reviewer documented so the pairs are never de-duplicated. Pivot-reopen (9c): - New /linkedin:pivot command: logs articles.NN.pivots[], resets currentPhase, un-locks, marks gates to re-run. - Pivot-detection gate in Step 8 lock precondition (>20% word-count change or >2 new sections re-opens cleared gates). Del 4 v8→v11 worked example. Per-artifact personas (new requirement): - articles.NN.personas with resolution order (edition-state → series file → plugin library → interactive). One or more readers configurable per edition. Schema/docs: - edition-state.template.json: additive personas[], pivots[], headlessReview, headless-review phase (16 phases); personaSweep.resonance.wordCount baseline. - 3 fasit fixtures + 3 structural lint tests (Del 4 worked cases). - Counts: 24→26 commands, 16→19 agents, 15→16 newsletter phases. - README + CLAUDE.md (plugin + root) + CHANGELOG synced. Verification: 35 agent-fixture + 59 hook + 20 render tests green. Backward- compatible (additive state); reload required before the 3 new agents resolve. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
11 KiB
Fact-Reviewer Fasit Fixture
The Del 4 production round (Security Champions, Maskinrommet, 2026-05-29) as the
gold standard for the fact-reviewer agent. The in-session fact-checker
(Step 5) ran on a still-moving draft. A late Security Champions pivot — a new
argument anchor — arrived after that Step 5 sweep, so the pivot's premise was
never fact-checked. The pivot then went through the in-session persona sweep
(Step 6) and reached the frozen, publish-ready version with an unverified premise
intact. KTG's cold re-reading caught a misattribution, a quote-precision error, a
postulated number with no provenance, a "settled standard" that actually varies,
and a secondary source trusted for a precise figure — six points in all. Those six
points are the fasit below: a correct fact-reviewer run on the frozen/pivoted Del
4 should surface comparable verdicts, mapped to the four checks with consistent
risk verdicts and the pivot-risk flag.
This file is a fasit, not a test harness. The structural lint lives in
agents/__tests__/fact-reviewer-fixture.test.mjs. Whether the agent's live
verdicts actually reproduce these is [GATE]/[OPERATØR] — it is not
self-certified here.
The jury judges; the writer writes. Every expected output below is direction, not rewritten copy. A correct agent run hands back a verdict + a source (or "none found") + a fix-as-direction (source it / hedge it / cut it) — never edited prose.
Why this gate exists. Fact-checking was post-hoc relative to the pivot in Del 4: the in-session Step 5 sweep ran before the Security Champions pivot was added, so the pivot premise never met it. A cold re-verification on the frozen/pivoted version closes that gap — it re-checks every claim with equal suspicion, with no knowledge of which passages were pivot-fresh, and so catches exactly the premise Step 5 missed.
The axis and the four checks (the agent judges on exactly these)
The single axis is faktisk-korrekthet — factual correctness, re-verified COLD on the frozen/pivoted version. It is checked through four lenses:
- F1 — Verifiserbare påstander (verifiable claims): every checkable assertion (numbers, dates, named examples, attributions, causal claims) searched against primary/credible sources; opinions and predictions skipped.
- F2 — Sitat-presisjon (quote precision): any quotation must match the source verbatim — wording, attribution, and who said it. «Vi» vs «Vi i Nav» is a precision failure even when the gist is right.
- F3 — Tall-attribusjon (number attribution): every figure must trace to a
named source; a postulated number with no provenance is 🟡/🔴. Here provenance is
VERIFIED (distinct from
editorial-reviewer's P3, which only flags the absence of a source/hedge without searching). - F4 — Kilde-kvalitet (source quality): primary over secondary; a source supporting "around a third" does not verify "exactly 37 %"; post-cutoff claims must be web-searched.
Context isolation — cold reader (the agent's cardinal rule)
The agent runs in a cold context: its only input is this prompt, the frozen draft, and the writing contract. Any pivot narrative, changelog, omission list, or "what the author intended" is context pollution — the agent states it is ignoring it and judges only the text. That independence (no main-session framing-bias) is the whole reason a defect that survived the in-session gates can still be caught here.
Pivot-premise risk (the design feature). Because the agent does not know which passages were added in a late pivot, it re-checks every claim with equal suspicion — a claim's age in the draft buys it no trust. A claim that looks freshly bolted on (new anchor, new section topic, a 2025/2026 figure) is surfaced in a pivot-risk subsection. A pivot-risk claim that fails verification is the headline catch: the pivot premise that never met Step 5.
Risk sort and gate (every claim carries exactly one verdict)
- 🔴 høy risiko (high risk) → BLOCK — contradicted by evidence, or a precise claim with no usable source.
- 🟡 uverifisert (unverified) → REWORK — cannot be confirmed / weak sources; asserted as fact must be hedged, sourced, or cut.
- 🟢 verifisert (verified) → keep — confirmed by a primary/credible source matching the claim.
The agent recommends PASS / REWORK / BLOCK; the operator ([OPERATØR]) holds the
gate. Each case block below carries exactly one verdict emoji in its Verdict
field; the surrounding prose deliberately avoids the emoji so the structural lint
can read a single unambiguous verdict per case.
The six Del 4 worked cases (fasit)
Each case states the point, the check it belongs to (F1–F4), the verdict, whether it is a pivot-risk claim, and the direction a correct cold run returns.
Case 1 — pivot-premissen aldri faktasjekket (pivot premise never met Step 5) — PIVOT-RISK
- Check: F1 (verifiable claim — the pivot's anchor assertion) · Verdict: 🔴
- Pivot-risk: YES — this is the late Security Champions pivot, added after the Step 5 fact-check; its premise was never verified in-session.
- Fasit / direction: The Security Champions pivot rests on a premise asserted as established fact, but no primary source confirms it as stated. Because the pivot arrived after Step 5, the in-session sweep never touched it. The cold re-check — applying equal suspicion to a claim it does not know is pivot-fresh — searches primary sources, finds none that confirm the premise as worded, and returns high risk. This is the exact catch the gate exists for: the cold pass on the frozen version surfaces the pivot premise that Step 5 missed. Direction: source the premise to a primary record or recast it as a hedged hypothesis; do not assert. The agent returns the verdict + "none found", not a rewritten premise.
Case 2 — feilattribusjon (misattribution) — editor caught on cold read
- Check: F1 (attribution) + F2 (who said it) · Verdict: 🔴
- Pivot-risk: no.
- Fasit / direction: A statement is attributed to the wrong source/originator — the named party is not who said or published it. Contradicted by the primary record, so high risk (a contradicted claim is 🔴 regardless of partial score). Direction: correct the attribution to the verified originator, or cut the attribution; the agent names the contradicting source, never invents one.
Case 3 — sitat-presisjon «Vi» vs «Vi i Nav» (quote precision) — editor caught
- Check: F2 (quote precision) · Verdict: 🟡
- Pivot-risk: no.
- Fasit / direction: A quotation's gist is right but the wording/attribution is not verbatim: the source said «Vi i Nav», the draft renders «Vi». Changing who the «vi» refers to is a precision failure even though the meaning is close. The source exists, so this is unverified-as-worded rather than contradicted. Direction: match the source verbatim — restore «Vi i Nav» — or mark it as a paraphrase, not a quote. The agent flags the precision gap; it does not supply the corrected line.
Case 4 — postulert tall uten proveniens (postulated number, no provenance)
- Check: F3 (number attribution) · Verdict: 🟡
- Pivot-risk: plausible — a figure of this kind often arrives with a late anchor; surface it in the pivot-risk subsection if it reads freshly bolted on.
- Fasit / direction: A specific figure is stated as fact with no named
source. Distinct from
editorial-reviewer's P3, which would only flag the absence of a source/hedge: here the agent searches for the provenance, finds none that supports the exact figure, and returns unverified. Direction: trace it to a named source or hedge it ("anslagsvis"); else cut. The agent never invents a provenance to promote it to verified.
Case 5 — «Security Champions» som settet standard (a settled standard that varies)
- Check: F1 (claim) + source-scope · Verdict: 🔴
- Pivot-risk: YES — part of the same Security Champions pivot.
- Fasit / direction: The "Security Champions" practice is presented as a settled, universal standard, but in reality it is a practice that varies per organization — implementations, scope, and definitions differ. A local convention dressed as a universal standard is a source-scope failure; asserting it as settled with no source that supports the universal framing is high risk. Direction: scope the claim to where it actually holds ("varies; in some orgs…") or source the universal framing to a standard that does establish it. The agent flags the over-broad scope; it does not rewrite the passage.
Case 6 — sekundærkilde brukt for et presist tall (secondary source for a precise figure)
- Check: F4 (source quality) + F3 (number) · Verdict: 🟡
- Pivot-risk: no.
- Fasit / direction: A precise figure is backed only by a secondary source that summarizes the number — the primary record supports a directional claim ("around a third"), not the precise figure ("37 %") the draft asserts. A source supporting "around a third" does not verify "exactly 37 %". Direction: trace to the primary source and confirm the exact figure, soften the draft to the directional claim the secondary source actually supports, or hedge. The agent records the secondary source found and the precision gap, not a corrected number.
Expected aggregate (what a correct cold run looks like)
- Total verdicts surfaced: 6 (within a reasonable verification-log cap; no 🔴 silently dropped).
- By check: F1 = 3 (Cases 1, 2, 5) · F2 = 2 (Cases 2, 3) · F3 = 2 (Cases 4, 6) · F4 = 1 (Case 6). (Cases overlap checks; the headline check is listed first.)
- By risk verdict: 🔴 høy risiko = 3 (Cases 1, 2, 5 → BLOCK) · 🟡 uverifisert = 3 (Cases 3, 4, 6 → REWORK) · 🟢 verifisert = 0 among the flagged points (the rest of the draft's claims verify clean and are not listed here).
- Pivot-risk: Cases 1 and 5 are the Security Champions pivot; Case 4 is a plausible pivot-risk. Case 1 is the headline catch — the pivot premise that was never fact-checked in-session, caught only because the cold pass re-checks every claim with equal suspicion.
A run that reproduces ~these six verdicts, on ~these checks, with ~these risk levels — and that surfaces the Security Champions pivot premise as a pivot-risk 🔴 — is comparable to KTG's actual cold re-reading. Exact wording is the editor's; the agent returns direction, not rewritten copy.
Calibration boundary
Whether the agent's live verdicts truly match this fasit is judged by the operator
([OPERATØR]), not self-certified here. This fixture is the calibration target, the
same way fact-checker-cases.md, editorial-reviewer-cases.md, and
persona-reviewer-cases.md are fasits for their agents.
Live-run note. A live cold run on the frozen Del 4 requires (a) a Claude Code session reload — a freshly added agent is not invokable until the plugin agent set is rebuilt at session start — and (b) the agent run in a genuinely cold context (no drafting-session history, no pivot narrative) with read access to the frozen draft and web search. Until both hold, this fixture is the gold-standard of record.