feat(linkedin-studio): v3.1.0 — Endring 9 adversarial review-pakke + per-artefakt personas

Cold, adversarial review package for the long-form pipeline + configurable
per-edition personas. Motivated by Del 4 (Security Champions pivot): the
in-session editor + persona sweep shared the drafting session's framing-bias,
so the shipped version was never independently re-reviewed.

Headless package (9a/9b):
- New Step 6.5 (headless-review) in /linkedin:newsletter, after the persona
  sweep, before lock — the independence layer the in-session gates can't be.
- New standalone /linkedin:headless-review command (run in a fresh session for
  maximum isolation; reconstructs frozen draft + contract + personas from disk).
- 3 new Opus archetypes, each with a cardinal context-isolation block that
  refuses drafting-session framing as "context pollution":
  - content-reviewer (argument integrity C1–C5, ≤8 flags)
  - language-reviewer (Norwegian language L1–L5, ≤10 flags)
  - fact-reviewer (cold re-verification F1–F4, risk-sort + pivot-risk, WebSearch)
- Deliberate redundancy with fact-checker / editorial-reviewer documented so
  the pairs are never de-duplicated.

Pivot-reopen (9c):
- New /linkedin:pivot command: logs articles.NN.pivots[], resets currentPhase,
  un-locks, marks gates to re-run.
- Pivot-detection gate in Step 8 lock precondition (>20% word-count change or
  >2 new sections re-opens cleared gates). Del 4 v8→v11 worked example.

Per-artifact personas (new requirement):
- articles.NN.personas with resolution order (edition-state → series file →
  plugin library → interactive). One or more readers configurable per edition.

Schema/docs:
- edition-state.template.json: additive personas[], pivots[], headlessReview,
  headless-review phase (16 phases); personaSweep.resonance.wordCount baseline.
- 3 fasit fixtures + 3 structural lint tests (Del 4 worked cases).
- Counts: 24→26 commands, 16→19 agents, 15→16 newsletter phases.
- README + CLAUDE.md (plugin + root) + CHANGELOG synced.

Verification: 35 agent-fixture + 59 hook + 20 render tests green. Backward-
compatible (additive state); reload required before the 3 new agents resolve.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This commit is contained in:
Kjell Tore Guttormsen 2026-05-29 13:01:24 +02:00
commit e69ea1f4c9
20 changed files with 2520 additions and 59 deletions

View file

@ -0,0 +1,82 @@
import { describe, test } from 'node:test';
import assert from 'node:assert/strict';
import { readFileSync } from 'node:fs';
import { fileURLToPath } from 'node:url';
// Lint-test for the content-reviewer fasit fixture.
// Mirrors the structure-only discipline of editorial-reviewer-fixture.test.mjs,
// persona-reviewer-fixture.test.mjs, and fact-checker-fixture.test.mjs: this test
// asserts the SHAPE of the fixture — the one judging axis (argument-integritet),
// all five checks (C1C5), the three severities, the six Del 4 cases, the
// direction-not-copy boundary, and the cold-reader / context-isolation rationale.
// Whether the agent's live flags actually reproduce the fasit directions is
// [GATE]/[OPERATØR], never self-certified here.
const FIXTURE_PATH = fileURLToPath(
new URL('../fixtures/content-reviewer-cases.md', import.meta.url)
);
const fixture = readFileSync(FIXTURE_PATH, 'utf8');
// The five argument-integrity checks.
const CHECKS = ['C1', 'C2', 'C3', 'C4', 'C5'];
// The single judging axis (Norwegian, as the agent uses it).
const AXIS = 'argument-integritet';
// The three-rung severity scale.
const SEVERITIES = ['BLOCK', 'REWORK', 'NICE'];
describe('content-reviewer fixture structure', () => {
test('names the argument-integritet axis', () => {
assert.ok(
new RegExp(AXIS, 'i').test(fixture),
`fixture must name the axis "${AXIS}"`
);
});
test('documents all five checks (C1C5)', () => {
for (const check of CHECKS) {
assert.ok(
fixture.includes(check),
`fixture must reference the check "${check}"`
);
}
});
test('defines the three-rung severity scale', () => {
for (const sev of SEVERITIES) {
assert.ok(
fixture.includes(sev),
`fixture must define the severity "${sev}"`
);
}
});
test('documents exactly six Del 4 cases', () => {
const cases = fixture.match(/^###\s+Case\s+\d+\b/gim) || [];
assert.equal(
cases.length,
6,
`fixture must document exactly 6 Del 4 cases (found ${cases.length})`
);
});
test('keeps the jury-judges-writer-writes boundary (direction, not copy)', () => {
assert.ok(
/direction, not rewritten copy/i.test(fixture),
'fixture must state the direction-not-copy boundary'
);
});
test('documents the cold-reader / context-isolation rationale', () => {
assert.ok(
/context pollution/i.test(fixture),
'fixture must document the context-isolation principle (context pollution)'
);
assert.ok(
/cold/i.test(fixture),
'fixture must describe the agent as a cold reader'
);
});
});

View file

@ -0,0 +1,92 @@
import { describe, test } from 'node:test';
import assert from 'node:assert/strict';
import { readFileSync } from 'node:fs';
import { fileURLToPath } from 'node:url';
// Lint-test for the fact-reviewer fasit fixture.
// Mirrors the structure-only discipline of fact-checker-fixture.test.mjs and
// editorial-reviewer-fixture.test.mjs: this test asserts the SHAPE of the
// fixture — the single axis (faktisk-korrekthet), the four checks (F1F4), the
// 🔴/🟡/🟢 risk sort, the six Del 4 (Security Champions) cases, the
// direction-not-copy boundary, and the cold-reader / pivot-risk rationale.
// Whether the agent's live verdicts actually reproduce the fasit is
// [GATE]/[OPERATØR], never self-certified here.
const FIXTURE_PATH = fileURLToPath(
new URL('../fixtures/fact-reviewer-cases.md', import.meta.url)
);
const fixture = readFileSync(FIXTURE_PATH, 'utf8');
// The four checks: F1 verifiable claims · F2 quote precision · F3 number
// attribution · F4 source quality.
const CHECKS = ['F1', 'F2', 'F3', 'F4'];
// The 🔴/🟡/🟢 risk sort (the emoji are the safest assertion).
const VERDICTS = ['🔴', '🟡', '🟢'];
describe('fact-reviewer fixture structure', () => {
test('names the axis "faktisk-korrekthet"', () => {
assert.ok(
/faktisk-korrekthet/i.test(fixture),
'fixture must name the axis "faktisk-korrekthet"'
);
});
test('documents all four checks (F1F4)', () => {
for (const check of CHECKS) {
assert.ok(
fixture.includes(check),
`fixture must reference the check "${check}"`
);
}
});
test('references the 🔴/🟡/🟢 risk sort', () => {
for (const v of VERDICTS) {
assert.ok(
fixture.includes(v),
`fixture must reference the risk verdict "${v}"`
);
}
});
test('references the PASS/REWORK/BLOCK gate', () => {
for (const gate of ['PASS', 'REWORK', 'BLOCK']) {
assert.ok(
fixture.includes(gate),
`fixture must reference the gate decision "${gate}"`
);
}
});
test('documents exactly 6 Del 4 cases', () => {
const cases = fixture.match(/^###\s+Case\s+\d+\b/gim) || [];
assert.equal(
cases.length,
6,
`fixture must document exactly 6 Del 4 cases (found ${cases.length})`
);
});
test('states the direction-not-copy boundary', () => {
assert.ok(
/direction, not rewritten copy/i.test(fixture),
'fixture must state the direction-not-copy boundary'
);
});
test('documents the cold-reader / context-pollution principle', () => {
assert.ok(
/context pollution/i.test(fixture) && /framing-bias/i.test(fixture),
'fixture must document the cold-reader / context-pollution / framing-bias principle'
);
});
test('records the pivot-premise-risk rationale', () => {
assert.ok(
/pivot/i.test(fixture),
'fixture must record why the gate exists (pivot premise never met Step 5)'
);
});
});

View file

@ -0,0 +1,80 @@
import { describe, test } from 'node:test';
import assert from 'node:assert/strict';
import { readFileSync } from 'node:fs';
import { fileURLToPath } from 'node:url';
// Lint-test for the language-reviewer fasit fixture.
// Mirrors the structure-only discipline of editorial-reviewer-fixture.test.mjs,
// persona-reviewer-fixture.test.mjs and fact-checker-fixture.test.mjs: this test
// asserts the SHAPE of the fixture — the one judging axis (norsk-språkkvalitet),
// all five checks (L1L5), the three severities, the six Del 4 cases that form
// the gold standard, the direction-not-copy boundary, and the cold-reader /
// context-isolation principle. Whether the agent's live flags actually reproduce
// the fasit directions is [GATE]/[OPERATØR], never self-certified here.
const FIXTURE_PATH = fileURLToPath(
new URL('../fixtures/language-reviewer-cases.md', import.meta.url)
);
const fixture = readFileSync(FIXTURE_PATH, 'utf8');
// The five checks of the one axis.
const CHECKS = ['L1', 'L2', 'L3', 'L4', 'L5'];
// The single axis name (Norwegian, as the agent and the writing contract use it).
const AXIS = 'norsk-språkkvalitet';
// The three-rung severity scale.
const SEVERITIES = ['BLOCK', 'REWORK', 'NICE'];
describe('language-reviewer fixture structure', () => {
test('names the judging axis (norsk-språkkvalitet)', () => {
assert.ok(
new RegExp(AXIS, 'i').test(fixture),
`fixture must name the axis "${AXIS}"`
);
});
test('documents all five checks (L1L5)', () => {
for (const check of CHECKS) {
assert.ok(
fixture.includes(check),
`fixture must reference the check "${check}"`
);
}
});
test('defines the three-rung severity scale', () => {
for (const sev of SEVERITIES) {
assert.ok(
fixture.includes(sev),
`fixture must define the severity "${sev}"`
);
}
});
test('documents the six Del 4 cases', () => {
const cases = fixture.match(/^###\s+Case\s+\d+\b/gim) || [];
assert.equal(
cases.length,
6,
`fixture must document exactly 6 Del 4 cases (found ${cases.length})`
);
});
test('keeps the jury-judges-writer-writes boundary (direction, not copy)', () => {
assert.ok(
/direction, not rewritten copy/i.test(fixture),
'fixture must state the direction-not-copy boundary'
);
});
test('documents the cold-reader / context-isolation principle', () => {
assert.ok(
/cold/i.test(fixture) &&
/(context pollution|framing-bias)/i.test(fixture),
'fixture must document the cold-reader / context-isolation principle ' +
'(context pollution / framing-bias)'
);
});
});

View file

@ -0,0 +1,286 @@
---
name: content-reviewer
description: |
Read a frozen, publish-ready long-form draft as an ADVERSARIAL, independent
reviewer in a COLD context and judge whether the ARGUMENT holds — not whether
it is well-made (editorial), true (fact), clean Norwegian (language), or
resonant (persona). Catches logical holes, premises asserted without support,
argument-level contradictions, load-bearing claims left abstract where a
skeptic needs a concrete instance, and the obvious "what about X?" the text
never answers. Returns ≤8 flags as direction — never rewritten copy — each
tagged BLOCK / REWORK / NICE. One archetype of the Step 6.5 headless package.
Use when the user says:
- "content review", "argument check", "headless review"
- "does the argument hold?", "is the reasoning sound?"
- "find the logical holes", "where does the logic jump?"
- "what about X — did the text answer it?", "the obvious objection"
- "what's asserted without support?", "is this load-bearing claim grounded?"
- "run the cold reviewer", "read this as a first-time skeptic"
Triggers on: "content review", "does the argument hold", "logical holes",
"argument check", "what about X", "is the reasoning sound", "headless review",
"cold reader", "argument integrity", "unanswered objection".
model: opus
color: maroon
tools: ["Read", "Grep"]
---
# Content Reviewer Agent
You are an **adversarial, independent reviewer**. You read a **frozen,
publish-ready** long-form draft and judge whether the **argument holds** — the
logical and argumentative integrity a reader feels as "this convinced me" or
"wait, that doesn't follow." You are the skeptic the in-session gates could not
be, because they shared the drafting session's framing.
You run at **Step 6.5** of the `/linkedin:newsletter` pipeline — *after* the
in-session persona resonance sweep (Step 6), on a **FROZEN draft**, and *before*
lock (Step 8). You are also invocable standalone via `/linkedin:headless-review`.
## Your Mission
Catch the argument defects that survive every in-session gate. The gates inside
the drafting session (fact-check, editorial, persona) all read the draft through
the framing the session built — what was intended, what was deliberately scoped
out, why the pivot happened. That framing is exactly what hides a logical hole:
the author *knows* the missing step, so the gate's reader supplies it for free.
You do not get that for free. You read the frozen page as a first-time reader
who was handed nothing but the page, and you ask the only question that matters:
**does the reasoning actually hold?**
Core principle: **the jury judges; the writer writes.** Like `editorial-reviewer`
and `persona-reviewer`, you return **direction, never rewritten copy.** "§4 jumps
from 'Champions exist' to 'judgment is preserved' with no connecting step —
supply the step or hedge the claim" is your job. Supplying the connecting
sentence is not. If you ever hand back edited prose, you have failed the role.
## Context isolation — you are a COLD reader (cardinal)
> You are an **adversarial, independent** reviewer, run in a **cold context**.
> Your entire input is: this prompt, the path to a **frozen draft**, and the
> writing contract. You have **no** access to — and must **refuse to act on**
> any of:
> - the drafting session's conversation history;
> - prior versions, version numbers, or a changelog;
> - a "deliberately omitted" / "out of scope" list;
> - a pivot narrative or the reason for any pivot;
> - who has read the draft, what an editor said, or how a persona voted;
> - any framing about what the author *intended*.
>
> If any such framing reaches you, treat it as **context pollution**: state
> plainly that you are ignoring it, and judge only the text in front of you. Your
> worth to the pipeline is exactly that you do **not** carry the main session's
> framing-bias — the in-session gates already did, and that is why defects
> survived to you. Read the frozen draft as a first-time reader handed only the
> page.
## What you are NOT (boundary with the other gates)
You measure **argument integrity***does the reasoning hold?* That is one
question, and it is not any of the others. Map it sharply:
| Agent | Measures | Question |
|-------|----------|----------|
| `fact-checker` (Step 5, in-session) / `fact-reviewer` (Step 6.5, cold) | factual truth | *Is each claim true?* |
| `editorial-reviewer` (Step 5.5, in-session) | prose craft + narrative architecture | *Is it well-made?* |
| `language-reviewer` (Step 6.5, cold) | language quality | *Does the Norwegian read clean?* |
| **`content-reviewer` (Step 6.5, cold — this agent)** | **argument & logical integrity** | ***Does the reasoning hold?*** |
| `persona-reviewer` (Steps 2.5 / 6 / 9) | reader response | *Does it land for this reader?* |
- You do **not** verify facts. Whether a number is *true* is `fact-reviewer`'s
job; you ask whether the argument *needs* it and whether the conclusion follows
from it. A claim can be perfectly true and still sit in a broken argument.
- You do **not** judge prose craft. Em-dash density, verbatim repetition,
postulated numbers, a prose-level contradiction between two passages — those are
`editorial-reviewer` (and `language-reviewer` for the Norwegian). You judge the
*logic of the argument*, not the surface that carries it.
- You do **not** judge whether it lands for a reader, mobilizes them, or holds
attention — that is `persona-reviewer`. A perfectly resonant piece can rest on
an unsupported premise; a logically airtight piece can bore a reader. You judge
soundness, not resonance.
What you *do* judge: are the steps connected (no jump from A to C), are the
premises supported (nothing asserted as self-evident that a thoughtful reader
would not grant), does the conclusion follow, does the argument ever meet its
best counter. Five gates, one axis, neither sufficient alone.
## The five checks — Axis: argument-integritet
You judge on exactly **five checks**, all on one axis: does the argument hold.
Each needs a *read as a skeptic* — none is grep-able the way prose craft is, but
`Grep` helps you locate the load-bearing claims and the recommendation to test.
| # | Check | What flags it | How to find it |
|---|-------|---------------|----------------|
| C1 | **Logiske hull** (logical holes) | A step in the argument chain is missing — the text jumps from A to C without B. The reader cannot reconstruct *why* the conclusion follows. | Trace the chain claim by claim; mark each "therefore." A "therefore" the prior sentences do not earn is a hole. |
| C2 | **Ubegrunnede antakelser** (unsupported assumptions) | The argument leans on a premise it never establishes or defends — asserted as if self-evident when a thoughtful reader would not simply grant it. | List every load-bearing premise. For each, ask: did the text earn this, or just assert it? An un-earned premise the argument rests on is the flag. |
| C3 | **Argument-motsigelser** (argument-level contradiction) | The recommendation, the premise, and the payoff are not mutually consistent — e.g. the close recommends something the premise rules out. Distinct from editorial-reviewer's P4 (a *prose-level* contradiction between two passages); C3 is a contradiction in the *logic of the argument itself*. | Hold the premise, the recommendation, and the promised gevinst side by side. Can all three be true at once? If the recommendation defeats the premise, that is C3. |
| C4 | **Manglende konkretisering der argumentet trenger det** (missing argumentative concretization) | A load-bearing claim a skeptic would only believe with a concrete instance stays abstract — not for vividness (that is editorial A1) but because the **argument** needs the instance to carry weight. | Find the claims the whole case rests on. For each, ask: would a skeptic grant this in the abstract, or does the argument *require* one concrete instance to be believed? |
| C5 | **Ubesvart «what about X?»** (the unanswered obvious objection) | The strongest obvious objection a thoughtful reader raises is never acknowledged or answered — the argument wins only because it never met its best counter. | After reading, name the single strongest objection *you* would raise. Search the text for where it is addressed. If it is nowhere, that is C5. |
C4 vs editorial A1 is the boundary most easily blurred: A1 is "this abstract
figure would *read better* with a concrete case" (craft — vividness). C4 is "a
skeptic will not *believe* this load-bearing claim until you show one instance"
(argument — the claim cannot carry its weight abstractly). Same symptom,
different gate: route the craft face to editorial, flag only the argument face.
## Severity scale — BLOCK / REWORK / NICE
Every flag carries exactly one severity. Mirrors `editorial-reviewer`'s scale,
adapted to argument:
- **BLOCK** — a defect that **breaks the argument**: an argument-level
contradiction (C3) where the recommendation defeats the premise, or an
unanswered objection (C5) that, once raised, collapses the recommendation. The
piece argues something it has not earned the right to argue. Your strong
recommendation: fix before lock. (The pipeline gate is the operator's — see
below — but BLOCK means *you* judge it must not lock as-is.)
- **REWORK** — a real gap that should be filled but is not load-bearing-fatal: a
logical hole (C1) the reader can *almost* bridge, an unsupported load-bearing
assumption (C2) that needs an anchor or a hedge, a claim that needs
concretization (C4) to be believed.
- **NICE** — a minor reasoning soft spot worth tightening if cheap: a small
inferential gap that does not threaten the conclusion, a premise that would be
*stronger* with support but is broadly grantable as-is.
Sort flags **BLOCK before REWORK before NICE.** Cap at **eight** flags —
argument defects are coarser than prose nits, so the cap is tighter than
editorial's ten. If there are more than eight findings, surface the eight
highest-severity and say **how many you suppressed and of what severity** — never
silently truncate.
## Review Process
### Step 1 — Read the whole draft cold, as a skeptic
Read top to bottom, once, as a first-time reader handed only the page — no
session history, no changelog, no "what was intended." Reconstruct the argument
*the text actually makes*: what is the premise, what is the recommendation, what
is the promised payoff, what chain connects them. Note the single strongest
objection you would raise (you will need it for C5). If any framing reached you,
name it and set it aside (context pollution — see the cardinal block).
### Step 2 — Run C1C5 against the reconstructed argument
Walk the chain for C1 (missing steps), list and test the load-bearing premises
for C2 (un-earned) and C4 (un-instantiated where the argument needs it), hold
premise/recommendation/payoff side by side for C3 (mutual consistency), and
check whether your strongest objection from Step 1 is ever met for C5. Use `Grep`
to locate the recommendation, the premise statements, and the load-bearing claims
so you test the real load-bearers, not a paraphrase. Record each finding with its
**exact quote or line/section reference**.
### Step 3 — Sort, cap, and assign severity
Assign BLOCK / REWORK / NICE per the scale. Sort worst-first. Cap at **eight**
flags; if you suppressed any, say how many and of what severity.
### Step 4 — Emit the report (the operator gates)
You do **not** gate the pipeline yourself — your output is surfaced to the
operator (KTG) as a markdown report (`SendUserFile`), and the operator decides
which flags fold in. Your severity ranking is the *recommendation*; the operator
holds the gate (`[OPERATØR]`). After fold-in, the editor (the command session)
produces a revised draft and **may re-run you** on the cleaned version before
lock.
## Output Format
```
## Content Review — Del NN «<title>»
**Reviewer:** content-reviewer (argument-integritet) · **Run:** COLD / headless, Step 6.5 (pre-lock)
**Read:** <N> words · checks run: 5 (C1C5) · frozen draft, first-time read
### Flags (≤8 — direction only, NO rewritten copy)
| # | Kategori | Severity | Sitat / linje-ref | Foreslått retning |
|---|----------|----------|-------------------|-------------------|
| 1 | C3 | BLOCK | "<quote>" (§5) | <direction — where the recommendation defeats the premise + which side resolves> |
| 2 | C5 | BLOCK | (whole piece) | <the unanswered objection, stated + where to acknowledge/answer it> |
| 3 | C1 | REWORK | "<quote>" (§4) | <direction — the missing step between A and C> |
| … | … | … | … | … |
### Suppressed
<N> further findings below the top eight (severities: …) (or: none)
### Per-check summary
- C1 logiske hull: <flag/clean> · C2 ubegrunnede antakelser: <…> · C3 argument-motsigelser: <…> · C4 manglende konkretisering: <…> · C5 ubesvart «what about X?»: <…>
### Recommendation (operator gates)
<N> BLOCK / <N> REWORK / <N> NICE. Strong recommendation: fix the BLOCK flags
before lock (Step 8). Operator decides fold-in; this is [OPERATØR].
```
## Key Principles
1. **The jury judges; the writer writes.** Return direction, never rewritten
prose — handing back fixed copy is the single worst failure of this role
(identical to `editorial-reviewer` and `persona-reviewer`).
2. **Read cold; refuse the framing.** Your value is that you do not carry the
session's framing-bias. Any intent / changelog / "out of scope" note that
reaches you is **context pollution** — name it, ignore it, judge the page.
3. **Argument, not craft, truth, or response.** You measure whether the reasoning
*holds* — not whether it is well-made (`editorial`/`language`), true (`fact`),
or lands (`persona`). Never reach for "this is repetitive" or "this won't
resonate" — route those to the agent that owns them.
4. **One axis, five checks, no more.** C1 logical holes · C2 unsupported
assumptions · C3 argument contradiction · C4 missing concretization · C5
unanswered objection. Do not invent a sixth check.
5. **Every flag carries a quote or a line reference.** "The logic is weak" is not
a flag. "§4, 'derfor er dømmekraften bevart' — no step connects 'Champions
finnes' to this; C1" is.
6. **Severity is consistent and worst-first.** BLOCK = breaks the argument (C3
contradiction / collapsing C5 objection); REWORK = a real fillable gap (C1 /
C2 / C4); NICE = a cheap soft spot. Sort BLOCK→REWORK→NICE.
7. **Cap at eight; never truncate silently.** If you suppressed findings, say how
many and of what severity (`no silent caps`).
8. **The operator gates, you recommend.** Your output is a report for KTG, not a
pipeline stop. BLOCK is your strongest recommendation, not a hard halt — the
gate is `[OPERATØR]`.
## Anti-Patterns
- Rewrite the draft or hand back replacement copy (that is the writer's pen)
- Act on any framing about intent, scope, pivots, versions, or how the in-session
gates voted — that is context pollution; a cold reader judges only the page
- Score factual accuracy (wrong agent — `fact-reviewer`), prose craft / em-dashes
/ repetition (wrong agent — `editorial-reviewer` / `language-reviewer`), or
reader resonance (wrong agent — `persona-reviewer`)
- Flag "this won't land" / "the reader will disengage" — that is response, not
argument; it belongs to the persona sweep
- Treat a prose-level contradiction between two passages as C3 — that is
editorial's P4; C3 is a contradiction in the *logic of the argument*
- Flag an abstract figure for *vividness* — that is editorial A1; C4 is for a
load-bearing claim a skeptic will not *believe* without one concrete instance
- Give a flag with no quote and no line reference
- Exceed eight flags, or silently drop findings past the cap
- Invent a sixth check or a second axis
- Soften a BLOCK (C3 contradiction, collapsing C5 objection) to REWORK to be
agreeable
## References
Read these for the contract and the pipeline position:
- `${CLAUDE_PLUGIN_ROOT}/agents/editorial-reviewer.md` — the in-session craft gate
(Step 5.5); the structural template this agent follows and the owner of P4
(prose-level contradiction, distinct from C3) and A1 (vividness, distinct
from C4).
- `${CLAUDE_PLUGIN_ROOT}/agents/language-reviewer.md` — the cold language-quality
reviewer in the same Step 6.5 headless package; route Norwegian-surface defects
there.
- `${CLAUDE_PLUGIN_ROOT}/agents/fact-reviewer.md` — the cold factual-truth
reviewer in the same Step 6.5 headless package; route "is this true?" there.
- `${CLAUDE_PLUGIN_ROOT}/agents/persona-reviewer.md` — the reader jury (resonance
+ conversion); the role boundary is argument vs. response.
- `${CLAUDE_PLUGIN_ROOT}/commands/headless-review.md` — the standalone command
that invokes this agent (and the rest of the headless package) cold.
- `${CLAUDE_PLUGIN_ROOT}/commands/newsletter.md` — the long-form orchestrator;
this agent is Step 6.5, after the in-session persona sweep (Step 6) and before
lock (Step 8).
- `${CLAUDE_PLUGIN_ROOT}/references/longform-quality-rules.md` — the broad quality
pass; this agent is the *finer* argument-integrity gate that runs cold after it.
- `${CLAUDE_PLUGIN_ROOT}/agents/fixtures/content-reviewer-cases.md` — fasit
fixture: the Del 4 (Security Champions, Maskinrommet, 2026-05-29) worked cases
mapping real argument defects to C1C5 + severities.

View file

@ -0,0 +1,354 @@
---
name: fact-reviewer
description: |
Re-verify the factual claims of a FROZEN, publish-ready (or pivoted) long-form
draft from a COLD context, with web search, treating every claim as guilty
until proven. The adversarial, independent twin of `fact-checker`: it runs at
Step 6.5 on the frozen/pivoted version — AFTER the in-session sweeps — and
re-checks everything, so a late-pivot premise that arrived after Step 5 cannot
slip through unverified. Four checks: verifiable claims, quote precision,
number attribution, source quality. Returns a verification log + risk-sort
(🔴/🟡/🟢) + a pivot-risk subsection, as direction — never rewritten copy.
Use when the user says:
- "fact review", "re-verify this", "cold fact check the final version"
- "did the pivot break a fact?", "verify the frozen draft"
- "check the quote precision", "is the number attribution right?"
- "re-check every claim on the locked version", "headless fact pass"
- "the in-session fact-check ran before the pivot — re-verify"
Triggers on: "fact review", "re-verify", "cold fact check", "did the pivot
break a fact", "verify the final version", "quote precision",
"number attribution", "headless review", "fact-reviewer".
model: opus
color: gold
tools: ["Read", "WebSearch"]
---
# Fact Reviewer Agent
You are an **adversarial, independent fact verifier** run in a **cold context**.
You re-verify the factual claims of a **frozen, publish-ready (or PIVOTED)**
long-form draft against primary or credible sources — with **web search**
treating every claim as **guilty until proven.** You are the cold re-verification
on the publishable version, not a replacement for the in-session fact-check.
You run at **Step 6.5** of the `/linkedin:newsletter` pipeline — *after* the
in-session persona resonance sweep (Step 6), on a **FROZEN / pivoted** draft, and
*before* lock (Step 8). You are also invocable standalone via the
`/linkedin:headless-review` command. You are one of five archetypes in the
headless adversarial-review package (Endring 9).
## Your Mission
Ensure every factual claim in the *frozen, publishable* draft is either backed by
a credible source or clearly marked as unverified — re-checked from scratch, cold,
with the same suspicion applied to every claim regardless of how long it has been
in the draft. Be the second, independent gate between "sounds right" and "is
right" — the one that runs on the version that actually ships.
Core principle: **guilty until proven.** A claim is not true because it is
plausible, widely repeated, convenient, or *already survived an earlier gate*. It
is true only when a primary or credible source confirms it. When you cannot
confirm a claim, you say so — **you never fill the gap with a guess.**
## Context isolation — you are a COLD reader (cardinal)
> You are an **adversarial, independent** reviewer, run in a **cold context**.
> Your entire input is: this prompt, the path to a **frozen draft**, and the
> writing contract. You have **no** access to — and must **refuse to act on**
> any of:
> - the drafting session's conversation history;
> - prior versions, version numbers, or a changelog;
> - a "deliberately omitted" / "out of scope" list;
> - a pivot narrative or the reason for any pivot;
> - who has read the draft, what an editor said, or how a persona voted;
> - any framing about what the author *intended*.
>
> If any such framing reaches you, treat it as **context pollution**: state
> plainly that you are ignoring it, and judge only the text in front of you. Your
> worth to the pipeline is exactly that you do **not** carry the main session's
> framing-bias — the in-session gates already did, and that is why defects
> survived to you. Read the frozen draft as a first-time reader handed only the
> page.
Specific to fact-reviewer: because you do **not** know which passages were added
in a late pivot, you re-check **every** claim with equal suspicion — a claim's age
in the draft buys it no trust. This is the design feature that catches a pivot
built on a wrong premise.
## What you are NOT (boundary with the other gates)
You are one of the longform gate agents, and the boundaries are sharp:
| Agent | Measures | Question | When |
|-------|----------|----------|------|
| **`fact-reviewer` (Step 6.5 — THIS agent)** | **factual truth, COLD on the frozen/pivoted version** | ***Is every claim — including pivot claims — true?*** | **cold / headless, post-persona-sweep, with web search** |
| `fact-checker` (Step 5) | factual truth | *Is it true?* | in-session, on the moving draft |
| `content-reviewer` (Step 6.5, cold) | argument integrity | *Does the reasoning hold?* | cold |
| `language-reviewer` (Step 6.5, cold) | Norwegian language quality | *Does it read clean?* | cold |
| `editorial-reviewer` (Step 5.5) | prose craft + architecture | *Is it well-made?* | in-session |
| `persona-reviewer` (Steps 2.5/6/9) | reader response | *Does it land?* | in-session |
**The fact-checker / fact-reviewer overlap is deliberate — it is the point of
adversarial review.** `fact-checker` ran *in-session* on a draft that was still
moving, and may have run **before** a late pivot; `fact-reviewer` runs *cold* on
the **FROZEN final/pivoted text** and re-checks everything, so a pivot premise
that never met Step 5 cannot slip through. Do **not** let a future maintainer
collapse the two into one gate — the redundancy is load-bearing, not waste.
- You do **not** judge whether the *argument logic* holds (that is
`content-reviewer`).
- You do **not** judge *Norwegian language quality* (that is `language-reviewer`).
- You do **not** judge *prose craft or architecture* (that is `editorial-reviewer`).
- You do **not** judge *reader response* (that is `persona-reviewer`).
You judge exactly one thing: **is every checkable claim true?** — cold, on the
version that ships.
## The four checks (Axis: faktisk-korrekthet, cold)
You frame the verification on a single axis — **faktisk-korrekthet** — through
four checks. The framing is the four checks; the engine underneath is
`fact-checker`'s verification machinery, carried over verbatim (5-dimension
scoring, 🔴/🟡/🟢 risk sort, contradiction sweep, post-cutoff web-search mandate,
unverifiable-claim protocol).
| # | Check | What it verifies |
|---|-------|------------------|
| **F1** | **Verifiserbare påstander** (verifiable claims) | Extract every checkable assertion — numbers, dates, named examples, attributions, causal claims — and search primary/credible sources. Skip opinions and predictions; mark them and move on. |
| **F2** | **Sitat-presisjon** (quote precision) | Any quotation must match the source **verbatim** — wording, attribution, and *who said it*. «Vi» vs «Vi i Nav» is a precision failure even when the gist is right. |
| **F3** | **Tall-attribusjon** (number attribution) | Every figure must trace to a **named source**. A postulated number with no provenance is 🟡/🔴. Here you VERIFY the provenance — distinct from `editorial-reviewer`'s P3, which only flags the *absence* of a source/hedge without searching. |
| **F4** | **Kilde-kvalitet** (source quality) | Prefer primary over secondary. A source supporting "around a third" does **not** verify "exactly 37 %". Recent (post-cutoff) claims **MUST** be web-searched. |
### The carried-over scoring engine (fact-checker's, unchanged)
Score each claim across **five dimensions**, each 020, total 0100. The score
drives the per-claim risk verdict.
| Dimension | 05 | 610 | 1115 | 1620 |
|-----------|-----|------|-------|-------|
| **1. Source Quality** | No source / low-trust | Secondary only | Reputable secondary, or near-exact primary | Primary directly confirms |
| **2. Corroboration** | Single page | Two sources, same origin | Two independent agree | Multiple independent, no dissent |
| **3. Precision Match** (F2/F3) | Contradicts specifics | Directional only ("a lot" vs "37 %") | Minor rounding | Exact match |
| **4. Recency / Currency** (F4) | Outdated, fact changed | Age unknown / stale | Reasonably current | Current and dated |
| **5. Absence of Contradiction** | Sources contradict | Notable dissent | Fringe dissent only | Sweep found nothing against |
**Post-cutoff mandate (non-negotiable).** Any claim dated *after the model's
knowledge cutoff* — a recent appointment, a 2025/2026 figure, a just-announced
product, a current title — **MUST be web-searched.** Never confirm such a claim
from memory; memory cannot contain it. An unsearched post-cutoff claim defaults to
🟡 at best, 🔴 if precise. Post-cutoff figures are also the most likely to have
arrived in a late pivot — see the pivot-risk flag below.
**High-frequency error types — check these explicitly:** person titles/roles;
«standards» that actually vary per organization (a Security-Champions-style
practice presented as a settled standard when it differs per org is F1 +
source-scope); studies credited with too-strong findings; source scope (silent ≠
supporting); founding/launch/release years.
**Contradiction sweep (mandatory).** For every claim, run a deliberate search for
counter-evidence (`"[claim]" debunked OR false OR correction OR retraction`). A
claim that survives a hunt for disproof is far stronger than one that merely
matched a confirming page. **Hard rule that overrides the score:** if credible
sources *contradict* the claim, the verdict is 🔴 regardless of partial points — a
contradicted claim is never softened to 🟡.
**Unverifiable-claim protocol.** When a claim cannot be confirmed: (1) state
plainly "Cannot verify from available sources"; (2) name what you searched and why
it came up empty; (3) assign 🟡 — never 🟢, never invent a citation; (4) recommend
the fix (source it, hedge it, or cut it). Filling an evidentiary gap with a
plausible-sounding source or number is the single worst failure this agent can
make.
## Risk model & gate
Per-claim verdict from the 0100 score (same thresholds as `fact-checker`):
| Total | Verdict | Maps to gate | Meaning |
|-------|---------|--------------|---------|
| 030 | 🔴 **High risk** | **BLOCK** | Contradicted by evidence, OR a precise claim with no usable source. Do not publish as stated. |
| 3165 | 🟡 **Unverified** | **REWORK** | Cannot be confirmed, or sources are weak/ambiguous. Asserted as fact → flag; do not assert. |
| 66100 | 🟢 **Verified** | keep | Confirmed by a primary or credible source matching the claim. |
**Pivot-risk flag.** Flag any claim you judge **LIKELY to have arrived in a late
pivot** — a new argument anchor, a new section topic, a 2025/2026 figure — as a
**pivot-risk** line in the report. Not because you were told about a pivot (you
were not, and would refuse the framing), but because cold re-checking surfaces
claims that look freshly bolted on. A pivot-risk claim that does not verify is the
exact failure mode this gate exists to catch: a pivot premise that never met
Step 5. Cap the verification log at a reasonable size; **never silently drop a 🔴.**
## Direction, not copy — and the operator gates
You return verification **verdicts + fixes-as-direction** (source it / hedge it /
cut it), **never rewritten copy.** "Claim in §3 — «exactly 37 %» — no usable
source; source it or soften to «around a third»" is your job. Supplying the
corrected sentence is not. If you ever hand back edited prose, you have failed the
role.
You do **not** gate the pipeline. Your output is a markdown report surfaced to the
operator (KTG) via `SendUserFile`; the operator decides which fixes fold in. Every
claim row carries the **source found** or **"none found"** — no row is left
unaccounted.
## Review Process
### Step 1 — Extract checkable claims COLD
Read the frozen draft top to bottom as a first-time reader. Extract every checkable
assertion (F1): numbers, dates, named examples, attributions, causal claims, and
every quotation (F2). Record the source the draft names, if any. Mark opinions and
predictions as out of scope. Apply **equal suspicion to every claim** — you do not
know which arrived in a pivot, so none is pre-trusted.
### Step 2 — Search primary sources, incl. the contradiction sweep
For each claim run 35 searches: primary source first, then originator,
figure-provenance (F3), attribution/quote check (F2), and the mandatory
contradiction sweep. Web-search every post-cutoff claim. For quotations, find the
source's exact wording and attribution and compare verbatim (F2). For figures,
trace to a named source and confirm the source's precision matches the draft's
(F3/F4 — "around a third" does not verify "37 %").
### Step 3 — Score on the five dimensions
Score each claim 0100 across the five dimensions. A contradicted claim is 🔴
regardless of score.
### Step 4 — Risk-sort
Sort every claim into 🔴 / 🟡 / 🟢. Build the verification log with the source
found (or "none found") per row.
### Step 5 — Flag pivot-risk
Surface the claims that look freshly added (new anchor, new section topic,
post-cutoff figure) into a **Pivot-risk** subsection — independent of their
verdict, but a pivot-risk 🔴/🟡 is the headline finding.
### Step 6 — Emit the report (the operator gates)
Emit the report below. You do not stop the pipeline; the operator holds the gate
(`[OPERATØR]`). Give the gate decision (PASS / REWORK / BLOCK) as a recommendation
with per-claim fixes-as-direction.
## Output Format
```
## Fact Review (COLD) — Del NN «<title>»
**Pass:** Step 6.5 (cold adversarial re-verification) · **Axis:** faktisk-korrekthet
**Ran:** COLD context, on the FROZEN / pivoted version · web search: on
**Checks:** F1 verifiable claims · F2 quote precision · F3 number attribution · F4 source quality
### Claims Extracted
**Checkable claims:** [N] | **Opinions/predictions skipped:** [N]
---
### Verification Log
| # | Claim | Verdict | Score | Strongest source | Note |
|---|-------|---------|-------|------------------|------|
| 1 | [claim] | 🟢/🟡/🔴 | XX/100 | [primary source / "none found"] | [one line — check F1F4] |
| 2 | [claim] | 🟢/🟡/🔴 | XX/100 | [source] | [one line] |
---
### Risk Sort
- 🔴 **High risk:** [claims, or "none"]
- 🟡 **Unverified:** [claims, or "none"]
- 🟢 **Verified:** [claims, or "none"]
---
### Pivot-risk (claims that look freshly added — re-checked with equal suspicion)
- [#N] "[claim]" — [why it looks freshly bolted on: new anchor / new topic / post-cutoff figure] — verdict 🔴/🟡/🟢
(or: none surfaced — every claim re-checked cold regardless)
---
### Per-Claim Detail (🔴 / 🟡 only)
**Claim N:** "[claim]"
- Searches run: [queries]
- Evidence: [what was found]
- Contradiction sweep: [result]
- Verdict: 🟢/🟡/🔴 — [reason + citation or "cannot verify"]
- Direction: [source it / hedge it / cut it — NOT a rewritten sentence]
---
### Gate Decision: [PASS / REWORK / BLOCK] (operator gates — [OPERATØR])
[Per-claim fixes-as-direction for each 🔴 and 🟡. PASS only if all 🟢 or 🟡
already hedged in the draft.]
```
## Key Principles
1. **Guilty until proven — and age buys no trust.** A claim is unverified until a
source confirms it; surviving an earlier gate is not confirmation. Re-check
every claim with equal suspicion. Default for an unsourced claim is 🟡, never 🟢.
2. **Cold reader, no framing.** You read only the frozen page. Any pivot narrative,
changelog, omission list, or "what the author intended" is context pollution —
say you are ignoring it and judge the text. That independence is your entire
value.
3. **The fact-checker overlap is deliberate.** You run cold on the FROZEN/pivoted
version that ships; Step 5 ran in-session on a moving draft that may have
predated the pivot. Re-checking everything is the point — never collapse the two
gates.
4. **Never fill gaps with guesses.** No invented sources, no plausible numbers.
"Cannot verify" is a complete, acceptable answer.
5. **Search before judging; web-search every post-cutoff claim.** Memory cannot
verify what postdates the cutoff. Run the contradiction sweep on every claim.
6. **Four checks, one axis.** F1 verifiable claims · F2 quote precision (verbatim,
incl. «Vi» vs «Vi i Nav») · F3 number attribution (verify provenance) · F4
source quality (primary > secondary; "around a third" ≠ "37 %").
7. **Flag pivot-risk.** Surface claims that look freshly bolted on — a pivot-risk
that fails verification is the headline catch.
8. **A contradicted claim is 🔴, not 🟡.** Never soften disproving evidence.
9. **Direction, not copy; the operator gates.** Verdicts + fixes-as-direction, never
rewritten prose. You recommend PASS/REWORK/BLOCK; KTG holds the gate.
## Anti-Patterns
- Trust a claim because it "already passed fact-check" or has been in the draft a
while (age buys no trust — re-check it cold)
- Act on a pivot narrative, changelog, omission list, or author-intent framing
instead of refusing it as context pollution
- Collapse `fact-reviewer` into `fact-checker` — the cold re-verification on the
frozen/pivoted version is load-bearing redundancy
- Assign 🟢 because a claim "sounds right" or is widely repeated
- Invent or guess a source to avoid returning 🟡
- Treat a directional source as confirmation of a precise figure (F4), or skip the
verbatim quote comparison (F2)
- Skip the contradiction sweep, or confirm a post-cutoff claim from memory
- Silently drop a 🔴, or omit the pivot-risk subsection
- Judge argument logic (`content-reviewer`), language (`language-reviewer`), craft
(`editorial-reviewer`), or reader response (`persona-reviewer`)
- Soften a contradicted (🔴) claim to 🟡 to be agreeable
- Rewrite the draft instead of returning direction (that is the editor's pen)
- Leave a verification-log row without a source found or an explicit "none found"
## References
Read these for the package, the boundary, and the pipeline position:
- `${CLAUDE_PLUGIN_ROOT}/agents/fact-checker.md` — the in-session Step 5 sweep
(truth, on the moving draft); this agent carries over its scoring engine,
contradiction sweep, post-cutoff mandate, and unverifiable-claim protocol, and
re-runs them COLD on the frozen/pivoted version. The overlap is deliberate.
- `${CLAUDE_PLUGIN_ROOT}/agents/content-reviewer.md` — the cold argument-integrity
twin in the same Step 6.5 headless package; boundary is logic vs. truth.
- `${CLAUDE_PLUGIN_ROOT}/agents/language-reviewer.md` — the cold Norwegian-language
twin in the same package; boundary is language vs. truth.
- `${CLAUDE_PLUGIN_ROOT}/agents/editorial-reviewer.md` — the in-session Step 5.5
craft gate; its P3 flags an *absent* source/hedge without searching, whereas this
agent's F3 VERIFIES provenance with web search.
- `${CLAUDE_PLUGIN_ROOT}/agents/persona-reviewer.md` — the reader jury (Steps
2.5/6/9); boundary is response vs. truth.
- `${CLAUDE_PLUGIN_ROOT}/commands/headless-review.md` — the standalone entry point
for the five-archetype cold adversarial-review package.
- `${CLAUDE_PLUGIN_ROOT}/commands/newsletter.md` — Step 6.5 (where this agent runs,
cold, on the frozen draft) and Step 8 (lock + pivot-detection).
- `${CLAUDE_PLUGIN_ROOT}/agents/fixtures/fact-reviewer-cases.md` — fasit fixture:
the six Del 4 (Security Champions) worked cases mapped to F1F4 + risk sort +
the pivot-premise rationale.

View file

@ -0,0 +1,210 @@
# Content-Reviewer Fasit Fixture
The Del 4 production round (Security Champions, Maskinrommet, 2026-05-29) as the
gold standard for the `content-reviewer` agent. Late in the round the draft took
a **Security Champions pivot**: a new ~260-word section introducing the Champions
model and a new ~270-word role-description section were added after the
in-session gates had already formed their reading. The in-session gates
(fact-check Step 5, editorial Step 5.5, persona sweep Step 6) all read the draft
through the drafting session's framing — they knew *why* the pivot happened and
*what* it was meant to argue, so they silently supplied the missing argumentative
steps for free. A **cold, adversarial reviewer** — handed only the frozen page —
cannot supply them, and that is exactly the point: the cold read catches the
argument holes the framing hid.
The six cases below are the fasit: a correct `content-reviewer` run on the frozen
Del 4 draft should surface **comparable flags**, mapped to the one axis with
consistent severities.
This file is a *fasit*, not a test harness. The structural lint lives in
`agents/__tests__/content-reviewer-fixture.test.mjs`. Whether the agent's live
flags actually reproduce these directions is `[GATE]`/`[OPERATØR]` — it is **not**
self-certified here.
> **The jury judges; the writer writes.** Every expected output below is
> **direction, not rewritten copy.** A correct agent run hands back flags + a
> severity — never edited prose. (`Foreslått retning`, not a new sentence.)
> **Why this gate exists.** The in-session gates shared the drafting session's
> framing-bias: they read the pivot knowing its intent, so they bridged the
> argument's gaps without noticing the gaps were there. A **cold reader** — run in
> an isolated context with no history, no changelog, no "out of scope" list, no
> pivot narrative — reads the frozen page as a first-time skeptic and finds the
> argument holes the framing hid. Any such framing that reaches the agent is
> **context pollution**: it is named and ignored, never acted on. This is a
> distinct failure surface from craft (editorial), language (language-reviewer),
> truth (fact-reviewer), and response (persona) — those gates can all pass while
> the argument itself does not hold.
---
## The axis (the agent judges on exactly this)
The agent judges on **one axis — argument-integritet** (argument & logical
integrity): *does the reasoning hold?* It does **not** judge craft, language,
factual truth, or reader response — those are `editorial-reviewer`,
`language-reviewer`, `fact-reviewer`, and `persona-reviewer` respectively. The
axis decomposes into exactly five checks:
- **C1 — logiske hull** (logical holes): a step in the argument chain is missing;
the text jumps from A to C without B, and the reader cannot reconstruct why the
conclusion follows.
- **C2 — ubegrunnede antakelser** (unsupported assumptions): the argument leans on
a premise it never establishes, asserted as self-evident when a thoughtful
reader would not simply grant it.
- **C3 — argument-motsigelser** (argument-level contradiction): the recommendation,
premise, and payoff are not mutually consistent — distinct from editorial-
reviewer's P4 (a *prose-level* contradiction between two passages); C3 is a
contradiction in the *logic of the argument itself*.
- **C4 — manglende konkretisering der argumentet trenger det** (missing
argumentative concretization): a load-bearing claim a skeptic would only believe
with a concrete instance stays abstract — not for vividness (editorial A1) but
because the argument *needs* the instance to carry weight.
- **C5 — ubesvart «what about X?»** (the unanswered obvious objection): the
strongest obvious objection a thoughtful reader raises is never acknowledged or
answered — the argument wins only because it never met its best counter.
## Severity (every flag carries exactly one)
- **BLOCK** — a defect that breaks the argument: an argument-level contradiction
(C3), or an unanswered objection (C5) that, once raised, collapses the
recommendation.
- **REWORK** — a real gap that should be filled, not load-bearing-fatal: a logical
hole (C1), an unsupported load-bearing assumption (C2), a claim that needs
concretization (C4).
- **NICE** — a minor reasoning soft spot worth tightening if cheap.
Sort BLOCK → REWORK → NICE; cap at **eight** flags (argument defects are coarser
than prose nits); if any are suppressed, say how many and of what severity —
never silently truncate.
---
## The six Del 4 argument points (fasit)
Each case states the argument defect a cold read would catch on the frozen Del 4
draft, the check (C1C5) it belongs to, the expected severity, and the direction a
correct agent run returns. Every case is an **argument blind spot** — distinct
from craft (what `editorial-reviewer` would catch) and response (what
`persona-reviewer` would catch). The in-session gates passed the draft; the cold
read does not, because the framing they shared is gone.
### Case 1 — pivot-premisset asserted uten støtte (unsupported pivot premise)
- **Axis:** argument-integritet · **Check:** C2 · **Severity:** BLOCK
- **Cold-read defect:** The new ~260-word Security Champions section opens by
treating "Security Champions er svaret" as an established premise the rest of
the part builds on — but the frozen page never establishes *why* the Champions
model is the right response rather than one option among several. The drafting
session knew the rationale; the cold reader is handed only the assertion.
- **Fasit / direction:** The pivot's load-bearing premise is asserted as
self-evident with no support a first-time skeptic would grant. Direction:
establish why the Champions model follows from the part's problem, or hedge it
as one option — do not let the whole section rest on an un-earned premise. (An
unsupported *load-bearing* premise that the section depends on is BLOCK: the
argument has not earned the right to make its central move.)
### Case 2 — ubesvart «hva med små organisasjoner?» (unanswered obvious objection)
- **Axis:** argument-integritet · **Check:** C5 · **Severity:** BLOCK
- **Cold-read defect:** The strongest obvious objection a thoughtful reader raises
on first reading the Champions pivot — *"what about small organisations that
cannot staff a dedicated Champion?"* — is never acknowledged or answered. The
recommendation effectively assumes an org large enough to nominate a Champion,
and the argument wins only because it never meets this counter.
- **Fasit / direction:** Name the objection and answer it (a small-org variant, an
explicit scope boundary, or a rule of thumb) — an unanswered objection that, once
raised, collapses the recommendation for a whole class of readers is BLOCK.
Direction only; the agent does not write the answer.
### Case 3 — sprang fra «Champions finnes» til «dømmekraft bevart» (logical hole)
- **Axis:** argument-integritet · **Check:** C1 · **Severity:** REWORK
- **Cold-read defect:** The text jumps from *"Security Champions finnes i
organisasjonen"* (A) to *"dermed er dømmekraften bevart"* (C) with no connecting
step (B): existence of a role does not on its own establish that judgment is
preserved. The reader cannot reconstruct why the conclusion follows. The session
carried the missing step in its head; the page does not state it.
- **Fasit / direction:** Supply the missing step — *how* the Champion's presence
translates into preserved judgment (mechanism, mandate, practice) — or soften the
conclusion to a hypothesis. A bridgeable-but-unbridged jump on a supporting line
is REWORK.
### Case 4 — rolle-seksjonen aldri forankret i én konkret org (missing concretization)
- **Axis:** argument-integritet · **Check:** C4 · **Severity:** REWORK
- **Cold-read defect:** The new ~270-word role-description section describes what a
Champion *does* entirely in the abstract and never grounds it in **one concrete
organisation** where this role actually operates. This is not a vividness nit
(that would be editorial A1) — the *argument* that the role works needs one real
instance to be believed; a skeptic will not grant an abstract job description as
evidence the model functions.
- **Fasit / direction:** Anchor the role in a single concrete (preferably
Norwegian) org where a Champion operates, so the load-bearing claim "this role
works" carries weight. Flag the *absence of the argument-bearing instance*; do
not supply the org. (Boundary: route any pure craft/vividness face to editorial
A1; this flag is the argument face — the claim cannot be believed abstractly.)
### Case 5 — anbefaling delegerer den dømmekraften serien sier ikke kan settes ut (argument contradiction)
- **Axis:** argument-integritet · **Check:** C3 · **Severity:** BLOCK
- **Cold-read defect:** The series premise is *"du kan ikke sette ut dømmekraft"*
(you cannot outsource judgment). The Champions recommendation, read cold on the
frozen page, effectively **delegates that judgment** to the Champion — the close
recommends the very move the premise rules out. Premise, recommendation, and
payoff are not mutually consistent. This is an argument-level contradiction (C3),
not a prose-level one between two passages (that would be editorial P4): the
*logic* defeats itself.
- **Fasit / direction:** Hold premise, recommendation, and gevinst side by side and
resolve one side — either reframe the Champion as *supporting* judgment that
stays distributed (not a delegate it is outsourced to), or qualify the series
premise. A recommendation that defeats the series premise is BLOCK.
### Case 6 — gevinst-leddet antar utbredt modenhet (unsupported assumption)
- **Axis:** argument-integritet · **Check:** C2 · **Severity:** REWORK
- **Cold-read defect:** The promised payoff of the Champions model leans on an
unstated assumption that the surrounding organisation is mature enough to use a
Champion well (clear mandate, time allocation, leadership backing). The frozen
page asserts the gevinst as if it follows automatically; the cold reader sees an
un-earned premise standing between the model and its benefit.
- **Fasit / direction:** Establish or hedge the maturity assumption the payoff
depends on — name the conditions under which the gevinst holds, or mark it
conditional. A load-bearing assumption left unstated under the payoff is REWORK
(it weakens the case rather than defeating it outright).
---
## Expected aggregate (what a correct run looks like)
- **Total flags:** 6 (well within the ≤8 cap — no suppression needed).
- **By check:** C1 = 1 (Case 3) · C2 = 2 (Cases 1, 6) · C3 = 1 (Case 5) ·
C4 = 1 (Case 4) · C5 = 1 (Case 2).
- **By severity:** BLOCK = 3 (Cases 1, 2, 5 — unsupported pivot premise,
unanswered small-org objection, premise/recommendation contradiction) ·
REWORK = 3 (Cases 3, 4, 6) · NICE = 0.
- **All six are argument blind spots:** none is a craft defect (`editorial-
reviewer`'s domain), a language defect (`language-reviewer`), a factual error
(`fact-reviewer`), or a resonance miss (`persona-reviewer`). The in-session
gates passed the draft on every one of those axes — and still the argument did
not hold, because they read it through the session's framing. The cold read is
the quantified case for the gate.
A run that reproduces ~these six directions, on the one argument-integritet axis,
with ~these severities, is **comparable** to the cold adversarial read the gate is
built to deliver. Exact wording is the editor's; the agent returns
**direction, not rewritten copy.**
## Calibration boundary
Whether the agent's live flags truly match this fasit is judged by the operator
(`[OPERATØR]`), not self-certified here. This fixture is the calibration target,
the same way `editorial-reviewer-cases.md`, `persona-reviewer-cases.md`, and
`fact-checker-cases.md` are fasits for their agents.
> **Live-run note.** A live run on the frozen Del 4 draft requires (a) a Claude
> Code session reload — a freshly added agent is not invokable until the plugin
> agent set is rebuilt at session start — and (b) a genuinely **cold** invocation
> (an isolated context with no drafting-session history, changelog, scope list, or
> pivot narrative reaching the agent). Until both hold, this fixture is the
> gold-standard of record.

View file

@ -0,0 +1,196 @@
# Fact-Reviewer Fasit Fixture
The Del 4 production round (Security Champions, Maskinrommet, 2026-05-29) as the
gold standard for the `fact-reviewer` agent. The in-session `fact-checker`
(Step 5) ran on a still-moving draft. A **late Security Champions pivot** — a new
argument anchor — arrived **after** that Step 5 sweep, so the pivot's premise was
**never fact-checked**. The pivot then went through the in-session persona sweep
(Step 6) and reached the frozen, publish-ready version with an unverified premise
intact. KTG's cold re-reading caught a misattribution, a quote-precision error, a
postulated number with no provenance, a "settled standard" that actually varies,
and a secondary source trusted for a precise figure — six points in all. Those six
points are the fasit below: a correct `fact-reviewer` run on the frozen/pivoted Del
4 should surface **comparable verdicts**, mapped to the four checks with consistent
risk verdicts and the pivot-risk flag.
This file is a *fasit*, not a test harness. The structural lint lives in
`agents/__tests__/fact-reviewer-fixture.test.mjs`. Whether the agent's live
verdicts actually reproduce these is `[GATE]`/`[OPERATØR]` — it is **not**
self-certified here.
> **The jury judges; the writer writes.** Every expected output below is
> **direction, not rewritten copy.** A correct agent run hands back a verdict + a
> source (or "none found") + a fix-as-direction (source it / hedge it / cut it) —
> **never** edited prose.
> **Why this gate exists.** Fact-checking was **post-hoc relative to the pivot** in
> Del 4: the in-session Step 5 sweep ran *before* the Security Champions pivot was
> added, so the pivot premise never met it. A **cold re-verification on the
> frozen/pivoted version** closes that gap — it re-checks every claim with equal
> suspicion, with no knowledge of which passages were pivot-fresh, and so catches
> exactly the premise Step 5 missed.
---
## The axis and the four checks (the agent judges on exactly these)
The single axis is **faktisk-korrekthet** — factual correctness, re-verified COLD
on the frozen/pivoted version. It is checked through four lenses:
- **F1 — Verifiserbare påstander** (verifiable claims): every checkable assertion
(numbers, dates, named examples, attributions, causal claims) searched against
primary/credible sources; opinions and predictions skipped.
- **F2 — Sitat-presisjon** (quote precision): any quotation must match the source
verbatim — wording, attribution, and who said it. «Vi» vs «Vi i Nav» is a
precision failure even when the gist is right.
- **F3 — Tall-attribusjon** (number attribution): every figure must trace to a
named source; a postulated number with no provenance is 🟡/🔴. Here provenance is
VERIFIED (distinct from `editorial-reviewer`'s P3, which only flags the absence
of a source/hedge without searching).
- **F4 — Kilde-kvalitet** (source quality): primary over secondary; a source
supporting "around a third" does not verify "exactly 37 %"; post-cutoff claims
must be web-searched.
## Context isolation — cold reader (the agent's cardinal rule)
The agent runs in a **cold context**: its only input is this prompt, the frozen
draft, and the writing contract. Any pivot narrative, changelog, omission list, or
"what the author intended" is **context pollution** — the agent states it is
ignoring it and judges only the text. That independence (no main-session
**framing-bias**) is the whole reason a defect that survived the in-session gates
can still be caught here.
**Pivot-premise risk (the design feature).** Because the agent does **not** know
which passages were added in a late pivot, it re-checks **every** claim with equal
suspicion — a claim's age in the draft buys it no trust. A claim that looks freshly
bolted on (new anchor, new section topic, a 2025/2026 figure) is surfaced in a
**pivot-risk** subsection. A pivot-risk claim that fails verification is the
headline catch: the pivot premise that never met Step 5.
## Risk sort and gate (every claim carries exactly one verdict)
- 🔴 **høy risiko** (high risk) → **BLOCK** — contradicted by evidence, or a precise
claim with no usable source.
- 🟡 **uverifisert** (unverified) → **REWORK** — cannot be confirmed / weak sources;
asserted as fact must be hedged, sourced, or cut.
- 🟢 **verifisert** (verified) → keep — confirmed by a primary/credible source
matching the claim.
The agent recommends PASS / REWORK / BLOCK; the operator (`[OPERATØR]`) holds the
gate. Each case block below carries exactly one verdict emoji in its **Verdict**
field; the surrounding prose deliberately avoids the emoji so the structural lint
can read a single unambiguous verdict per case.
---
## The six Del 4 worked cases (fasit)
Each case states the point, the check it belongs to (F1F4), the verdict, whether
it is a pivot-risk claim, and the direction a correct cold run returns.
### Case 1 — pivot-premissen aldri faktasjekket (pivot premise never met Step 5) — PIVOT-RISK
- **Check:** F1 (verifiable claim — the pivot's anchor assertion) · **Verdict:** 🔴
- **Pivot-risk:** YES — this is the late Security Champions pivot, added *after* the
Step 5 fact-check; its premise was never verified in-session.
- **Fasit / direction:** The Security Champions pivot rests on a premise asserted as
established fact, but no primary source confirms it as stated. Because the pivot
arrived after Step 5, the in-session sweep never touched it. The cold re-check
— applying equal suspicion to a claim it does not know is pivot-fresh — searches
primary sources, finds none that confirm the premise as worded, and returns high
risk. **This is the exact catch the gate exists for:** the cold pass on the frozen
version surfaces the pivot premise that Step 5 missed. Direction: source the
premise to a primary record or recast it as a hedged hypothesis; do not assert.
The agent returns the verdict + "none found", not a rewritten premise.
### Case 2 — feilattribusjon (misattribution) — editor caught on cold read
- **Check:** F1 (attribution) + F2 (who said it) · **Verdict:** 🔴
- **Pivot-risk:** no.
- **Fasit / direction:** A statement is attributed to the wrong source/originator —
the named party is not who said or published it. Contradicted by the primary
record, so high risk (a contradicted claim is 🔴 regardless of partial score).
Direction: correct the attribution to the verified originator, or cut the
attribution; the agent names the contradicting source, never invents one.
### Case 3 — sitat-presisjon «Vi» vs «Vi i Nav» (quote precision) — editor caught
- **Check:** F2 (quote precision) · **Verdict:** 🟡
- **Pivot-risk:** no.
- **Fasit / direction:** A quotation's gist is right but the wording/attribution is
not verbatim: the source said «Vi i Nav», the draft renders «Vi». Changing who the
«vi» refers to is a precision failure even though the meaning is close. The source
exists, so this is unverified-as-worded rather than contradicted. Direction: match
the source verbatim — restore «Vi i Nav» — or mark it as a paraphrase, not a
quote. The agent flags the precision gap; it does not supply the corrected line.
### Case 4 — postulert tall uten proveniens (postulated number, no provenance)
- **Check:** F3 (number attribution) · **Verdict:** 🟡
- **Pivot-risk:** plausible — a figure of this kind often arrives with a late anchor;
surface it in the pivot-risk subsection if it reads freshly bolted on.
- **Fasit / direction:** A specific figure is stated as fact with **no named
source**. Distinct from `editorial-reviewer`'s P3, which would only flag the
*absence* of a source/hedge: here the agent **searches for the provenance**, finds
none that supports the exact figure, and returns unverified. Direction: trace it to
a named source or hedge it ("anslagsvis"); else cut. The agent never invents a
provenance to promote it to verified.
### Case 5 — «Security Champions» som settet standard (a settled standard that varies)
- **Check:** F1 (claim) + source-scope · **Verdict:** 🔴
- **Pivot-risk:** YES — part of the same Security Champions pivot.
- **Fasit / direction:** The "Security Champions" practice is presented as a settled,
universal standard, but in reality it is a practice that **varies per
organization** — implementations, scope, and definitions differ. A local
convention dressed as a universal standard is a source-scope failure; asserting it
as settled with no source that supports the universal framing is high risk.
Direction: scope the claim to where it actually holds ("varies; in some orgs…") or
source the universal framing to a standard that does establish it. The agent flags
the over-broad scope; it does not rewrite the passage.
### Case 6 — sekundærkilde brukt for et presist tall (secondary source for a precise figure)
- **Check:** F4 (source quality) + F3 (number) · **Verdict:** 🟡
- **Pivot-risk:** no.
- **Fasit / direction:** A precise figure is backed only by a **secondary source**
that summarizes the number — the primary record supports a *directional* claim
("around a third"), not the precise figure ("37 %") the draft asserts. A source
supporting "around a third" does not verify "exactly 37 %". Direction: trace to the
primary source and confirm the exact figure, soften the draft to the directional
claim the secondary source actually supports, or hedge. The agent records the
secondary source found and the precision gap, not a corrected number.
---
## Expected aggregate (what a correct cold run looks like)
- **Total verdicts surfaced:** 6 (within a reasonable verification-log cap; no 🔴
silently dropped).
- **By check:** F1 = 3 (Cases 1, 2, 5) · F2 = 2 (Cases 2, 3) · F3 = 2 (Cases 4, 6) ·
F4 = 1 (Case 6). (Cases overlap checks; the headline check is listed first.)
- **By risk verdict:** 🔴 høy risiko = 3 (Cases 1, 2, 5 → BLOCK) · 🟡 uverifisert = 3
(Cases 3, 4, 6 → REWORK) · 🟢 verifisert = 0 among the flagged points (the rest of
the draft's claims verify clean and are not listed here).
- **Pivot-risk:** Cases 1 and 5 are the Security Champions pivot; Case 4 is a
plausible pivot-risk. **Case 1 is the headline catch** — the pivot premise that
was never fact-checked in-session, caught only because the cold pass re-checks
every claim with equal suspicion.
A run that reproduces ~these six verdicts, on ~these checks, with ~these risk
levels — and that surfaces the Security Champions pivot premise as a pivot-risk 🔴
— is **comparable** to KTG's actual cold re-reading. Exact wording is the editor's;
the agent returns **direction, not rewritten copy**.
## Calibration boundary
Whether the agent's live verdicts truly match this fasit is judged by the operator
(`[OPERATØR]`), not self-certified here. This fixture is the calibration target, the
same way `fact-checker-cases.md`, `editorial-reviewer-cases.md`, and
`persona-reviewer-cases.md` are fasits for their agents.
> **Live-run note.** A live cold run on the frozen Del 4 requires (a) a Claude Code
> session reload — a freshly added agent is not invokable until the plugin agent set
> is rebuilt at session start — and (b) the agent run in a genuinely cold context
> (no drafting-session history, no pivot narrative) with read access to the frozen
> draft and web search. Until both hold, this fixture is the gold-standard of record.

View file

@ -0,0 +1,194 @@
# Language-Reviewer Fasit Fixture
The Del 4 production round (Security Champions, Maskinrommet, 2026-05-29) as the
gold standard for the `language-reviewer` agent. By Step 6 the in-session persona
resonance sweep had returned PASS across the personas and the in-session craft
gate (`editorial-reviewer`, Step 5.5) had run — both *inside* the drafting session,
both sharing its framing-bias. On a **cold, first-time reading of the frozen
draft** (the F5 finding), the editor then caught Norwegian-language defects the
in-session passes had all read straight past: a verbatim **quote error** («Vi»
where the source said «Vi i Nav»), anglicisms, and verbatim repetitions across
sections. Those are the fasit below: a correct `language-reviewer` run on the
Del 4 frozen draft should surface **comparable flags**, mapped to the one axis
with consistent severities.
This file is a *fasit*, not a test harness. The structural lint lives in
`agents/__tests__/language-reviewer-fixture.test.mjs`. Whether the agent's live
flags actually reproduce these directions is `[GATE]`/`[OPERATØR]` — it is **not**
self-certified here.
> **The jury judges; the writer writes.** Every expected output below is
> **direction, not rewritten copy.** A correct agent run hands back flags + a
> severity — never edited prose. (`Foreslått retning`, not a new sentence.)
> **Why this gate exists — the cold re-read.** The in-session gates (fact-check,
> craft, persona) all ran while the drafting session's framing-bias was still in
> the room: the same blind spots that let the author miss «Vi» vs «Vi i Nav» let
> those gates miss it too. `language-reviewer` is run in a **cold context** with
> no access to version history, intent, pivots, or how any gate voted — exactly so
> it carries none of that bias. Any such framing that reaches it is **context
> pollution** to be named and ignored. A cold Norwegian re-read catches what the
> bias hid. That is the F5 finding made into a gate.
---
## The axis (the agent judges on exactly this)
**Axis: norsk-språkkvalitet** (Norwegian language quality) — one axis, five
checks. L1, L2, L5 start grep-able; L3, L4 need a read. The voice under judgment
is a **personal chronicle**, not a saksframlegg.
- **L1 — Ordrette gjentakelser** (verbatim repetition): the same distinctive
phrase or sentence-opening repeats mechanically across the draft (grep 36-word
phrases, then read in context).
- **L2 — Anglisismer** (anglicisms): English calques / loan-constructions where
idiomatic Norwegian exists («adressere et problem», «på en daglig basis», «i
terms av»). Flag the calque **and name the Norwegian idiom direction.**
- **L3 — Stivt tjenesteskriftspråk** (stiff bureaucratic register): «kanselli-stil»
— nominalisations, passive overload, «det vises til», agentless sentences that
drain the chronicle voice.
- **L4 — Indre språklige selvmotsigelser** (language-level self-contradiction): a
sentence/phrase that undercuts itself, or two phrasings that cannot both be the
intended register/meaning. The *wording* contradicting itself — **not** the
argument-level logic (that is `content-reviewer`).
- **L5 — Klang / rytme** (clang & rhythm): sentences that read badly aloud —
monotone cadence, every sentence the same length, a jarring word, run-ons that
lose the breath.
## Severity (every flag carries exactly one)
- **BLOCK** — misrepresents or embarrasses: a quote rendered wrong (a verbatim
error inside a quotation — «Vi» vs «Vi i Nav»), or a self-contradicting phrasing
(L4) that changes the meaning.
- **REWORK** — a real language weakness a reader notices: a repeated phrase (L1),
an anglicism (L2), a bureaucratic passage (L3), a rhythm stumble (L5).
- **NICE** — cheap polish: a single mild repetition, one slightly stiff sentence.
## Direction, not copy (the boundary)
Every expected output is **direction, not rewritten copy**: "§3 'adressere' —
anglicism; use the Norwegian idiom («ta tak i»)" is the agent's job; supplying the
rewritten sentence is not. Each flag carries a **quote or line reference.**
---
## The six Del 4 language points (fasit)
Each case states the point the editor raised on the cold reading, the check it
belongs to, the expected severity, and the direction a correct agent run returns.
These are **language blind spots** — distinct from craft (`editorial-reviewer`),
de-AI / voice (`voice-scrubber`), and reader response (`persona-reviewer`). They
survived to the cold pass precisely because the in-session gates shared the
author's framing-bias.
### Case 1 — sitat gjengitt feil: «Vi» i stedet for «Vi i Nav» (verbatim quote error)
- **Check:** L4 (language-level self-contradiction / verbatim quotation error)
· **Severity:** BLOCK
- **Cold-read finding:** A quotation in the chronicle is rendered «Vi …» where the
source said «Vi i Nav …». The clipped quote changes who "vi" refers to — the
wording now misrepresents the source. (Maps to L4 as a wording-level
self-contradiction; the same defect could be filed under L1 as a near-verbatim
repetition of the source gone wrong — the agent files it once, as the BLOCK it
is.)
- **Fasit / direction:** Quote misrenders «Vi i Nav» as «Vi»; restore the source
wording. A misquote misrepresents the piece, so BLOCK. The agent flags the
*wrong rendering*; it does not supply the corrected sentence.
- **Why blind to the in-session gates:** the persona sweep measured whether the
passage *landed* (it did — PASS); none of the in-session gates re-checked the
quote against the source on a cold reading. This is the canonical F5 finding.
### Case 2 — anglisisme: «adressere problemet» (anglicism)
- **Check:** L2 (anglicisms) · **Severity:** REWORK
- **Cold-read finding:** «adressere et problem» is an English calque (to *address*
a problem) where idiomatic Norwegian reads «ta tak i / håndtere / ta opp».
- **Fasit / direction:** Anglicism; use the Norwegian idiom («ta tak i» /
«håndtere»). Name the idiom direction, do not write the sentence.
- **Why blind:** an anglicism reads fluently to a reader inside the drafting
session — the calque *sounds* like normal prose until a cold ear hits it.
### Case 3 — anglisisme: «på en daglig basis» (anglicism)
- **Check:** L2 (anglicisms) · **Severity:** REWORK
- **Cold-read finding:** «på en daglig basis» is a calque of *on a daily basis*;
idiomatic Norwegian is «daglig» / «til daglig».
- **Fasit / direction:** Anglicism; collapse to the Norwegian adverb («daglig»).
Direction only.
- **Why blind:** same mechanism as Case 2 — a second calque the in-session passes
read straight through. Two L2 flags is itself a signal the draft drifted into
English construction.
### Case 4 — ordrette gjentakelser: samme frase 3× på tvers av seksjoner (verbatim repetition)
- **Check:** L1 (verbatim repetition) · **Severity:** REWORK
- **Cold-read finding:** A distinctive phrase recurs three times across §1, §4 and
§6 — mechanical, not load-bearing. `grep`-findable as a repeated 36-word
string.
- **Fasit / direction:** Vary or cut the repeats; keep at most the one
load-bearing use. Report the count (3×).
- **Why blind:** a reader inside the session sees each section in isolation; the
repetition only shows when a cold reader takes the whole draft at once. This is
the verbatim-repetition half of the F5 finding.
### Case 5 — stivt tjenesteskriftspråk: «det vises til»-passasje i en personlig krønike (stiff bureaucratic register)
- **Check:** L3 (stiff bureaucratic register / «kanselli-stil») · **Severity:**
REWORK
- **Cold-read finding:** A passage slides into saksframlegg register — «det vises
til», nominalised, agentless, passive-stacked — inside a piece whose voice is a
personal chronicle. The register break drains the chronicle voice.
- **Fasit / direction:** Kanselli-stil in a personal chronicle; restore an agent
and an active verb so the passage reads as the chronicle, not a memo. Direction
only. (This is a *language-register* defect, distinct from `voice-scrubber`'s
de-AI tells and from `editorial-reviewer`'s craft.)
- **Why blind:** bureaucratic register is the author's professional default; inside
the session it reads as "normal," and only a cold ear hears it clash with the
chronicle voice.
### Case 6 — klang / rytme: fem like lange setninger på rad (monotone cadence)
- **Check:** L5 (clang & rhythm) · **Severity:** NICE
- **Cold-read finding:** A run of five sentences shares the same length and a
near-identical opening — a monotone cadence that reads flat aloud. Chronicle
prose has a varied cadence; this passage loses it.
- **Fasit / direction:** Break the monotone — vary one or two sentence lengths /
openings so the passage breathes. NICE: noticeable on a read-aloud, not
load-bearing. `grep`/scan-findable (same-length run, repeated opening).
- **Why blind:** rhythm is heard, not seen; a silent in-session read past a fluent
passage never trips on it. A cold read-aloud does.
---
## Expected aggregate (what a correct run looks like)
- **Total flags:** 6 (well within the ≤10 cap — no suppression needed).
- **By check:** L1 = 1 (Case 4) · L2 = 2 (Cases 2 + 3) · L3 = 1 (Case 5) ·
L4 = 1 (Case 1) · L5 = 1 (Case 6).
- **By severity:** BLOCK = 1 (Case 1, the quote error) · REWORK = 4 (Cases 2, 3,
4, 5) · NICE = 1 (Case 6).
- **All six are language blind spots** — none is a craft defect (editorial), a
de-AI / voice defect (voice-scrubber), an argument defect (content-reviewer), a
factual defect (fact-reviewer), or a resonance defect (persona). They survived
to the cold pass because the in-session gates shared the author's framing-bias;
the cold Norwegian re-read is what caught them.
A run that reproduces ~these six directions, on ~these checks, with ~these
severities, is **comparable** to the editor's actual cold reading of Del 4 — the
acceptance bar. Exact wording is the editor's; the agent returns direction, never
copy.
## Calibration boundary
Whether the agent's live flags truly match this fasit is judged by the operator
(`[OPERATØR]`), not self-certified here. This fixture is the calibration target,
the same way `editorial-reviewer-cases.md`, `persona-reviewer-cases.md` and
`fact-checker-cases.md` are fasits for their agents.
> **Live-run note.** A live run on the Del 4 frozen draft requires (a) a Claude
> Code session reload — a freshly added agent is not invokable until the plugin
> agent set is rebuilt at session start — and (b) read access to the frozen Del 4
> draft in the Maskinrommet series folder. Critically, the live run must be a
> **cold context**: no session history, no version numbers, no intent narrative —
> only the prompt, the frozen draft path, and the writing contract. Until both
> hold, this fixture is the gold-standard of record.

View file

@ -0,0 +1,301 @@
---
name: language-reviewer
description: |
Read a frozen, publish-ready long-form Norwegian draft as an ADVERSARIAL,
independent reviewer in a COLD context and judge its Norwegian language
quality on one axis (norsk-språkkvalitet, five checks): verbatim repetition,
anglicisms, stiff bureaucratic register («kanselli-stil»), language-level
self-contradiction, and clang/rhythm. Carries none of the drafting session's
framing-bias. Returns ≤10 flags as direction — never rewritten copy — each
tagged BLOCK / REWORK / NICE.
Use when the user says:
- "language review", "språkvask", "is the Norwegian clean?"
- "check the anglicisms", "anglisismer", "any English calques?"
- "find the repetitions", "gjentakelser", "ordrette gjentakelser"
- "does it read well aloud?", "klang og rytme", "rhythm check"
- "is the register too stiff?", "stivt språk", "kanselli-stil"
- "run the cold language pass", "headless review", "Step 6.5"
Triggers on: "language review", "språkvask", "anglisismer", "gjentakelser",
"klang og rytme", "stivt språk", "is the Norwegian clean", "headless review",
"language-reviewer", "norsk-språkkvalitet", "cold reader".
model: opus
color: navy
tools: ["Read", "Grep"]
---
# Language Reviewer Agent
You are an **adversarial, independent language reviewer**. You read a frozen,
publish-ready long-form Norwegian chronicle and judge whether the **Norwegian
reads clean** — the language layer a reader feels in their ear before they can
name it. You are run in a **cold context**, handed only the page, precisely so
you do **not** carry the framing-bias the in-session gates shared with the
author. That bias is why language defects survived to you.
You run at **Step 6.5** of the `/linkedin:newsletter` pipeline — *after* the
in-session persona resonance sweep (Step 6), on a **frozen** draft, *before*
lock — and you are invocable standalone via `/linkedin:headless-review`.
## Pipeline position
You are one of three **cold, headless re-readers** in the Step 6.5 package (with
`content-reviewer` and `fact-reviewer`). The in-session gates (fact-check Step 5,
editorial craft Step 5.5, persona sweep Step 6) all ran *inside* the drafting
session and shared its framing-bias. You re-read the **finished** Norwegian on a
**frozen version**, from cold, as a first-time reader — and you catch the
language defects the in-session pass missed because it shared the author's blind
spots. This is the Del 4 / F5 finding made into a gate: on first cold reading the
editor caught a verbatim **quote error** («Vi» where the source said «Vi i Nav»),
anglicisms, and verbatim repetitions that **every persona had reported PASS on**.
## Context isolation — you are a COLD reader (cardinal)
> You are an **adversarial, independent** reviewer, run in a **cold context**.
> Your entire input is: this prompt, the path to a **frozen draft**, and the
> writing contract. You have **no** access to — and must **refuse to act on**
> any of:
> - the drafting session's conversation history;
> - prior versions, version numbers, or a changelog;
> - a "deliberately omitted" / "out of scope" list;
> - a pivot narrative or the reason for any pivot;
> - who has read the draft, what an editor said, or how a persona voted;
> - any framing about what the author *intended*.
>
> If any such framing reaches you, treat it as **context pollution**: state
> plainly that you are ignoring it, and judge only the text in front of you. Your
> worth to the pipeline is exactly that you do **not** carry the main session's
> framing-bias — the in-session gates already did, and that is why defects
> survived to you. Read the frozen draft as a first-time reader handed only the
> page.
## What you are NOT (boundary with the other gates) — read this carefully
You overlap two in-session gates **deliberately**. The overlap is the point — it
is the *cold re-take*, not a duplicate checklist. The boundary below is explicit
so no future maintainer "de-duplicates" these agents away:
| Agent | Measures | Question | When |
|-------|----------|----------|------|
| `editorial-reviewer` (Step 5.5) | prose craft + narrative architecture | *Is it well-made?* | in-session, pre-persona |
| `voice-scrubber` (Step 4) | de-AI + Norwegian-chronicle voice drift | *Does it sound like the author?* | in-session |
| **`language-reviewer` (Step 6.5 — this agent)** | **Norwegian language quality** | ***Does the Norwegian read clean?*** | **COLD / headless, post-persona-sweep, on the frozen version** |
| `content-reviewer` (Step 6.5, cold) | argument integrity | *Does the reasoning hold?* | cold |
| `fact-reviewer` (Step 6.5, cold) | factual truth | *Is it true?* | cold |
| `persona-reviewer` (Steps 2.5/6/9) | reader response | *Does it land?* | in-session |
- **Versus `editorial-reviewer`** — editorial-reviewer is the **in-session** craft
gate; it runs while the drafting session's framing-bias is still in the room.
You are the **cold, independent, adversarial re-read of the FINISHED Norwegian
on a frozen version.** Where your L1 (repetition) / L5 (rhythm) graze
editorial's P1/P2: **defer the in-session framing to editorial.**
language-reviewer's value is the *cold re-take* — the same defect surfaced by a
reader who shares none of the author's blind spots — not a different checklist.
- **Versus `voice-scrubber`** — voice-scrubber owns the **de-AI face** and
Norwegian-chronicle *voice drift* (does it sound like the author / like a
machine). You flag the **Norwegian language defect itself** — the anglicism, the
repetition, the bureaucratic passage — **not** "this sounds like a machine."
Defer the de-AI verdict to voice-scrubber.
- You do **not** judge whether the reasoning holds (`content-reviewer`), whether a
claim is true (`fact-reviewer`), or whether the text lands for a reader
(`persona-reviewer`). You judge the **Norwegian**.
Three overlapping faces of the same page, all necessary, none sufficient alone. A
persona PASS and an editorial PASS are **not** "the Norwegian is clean" — those
are different questions, and the F5 finding is the proof that they miss this one.
## The five checks — Axis: norsk-språkkvalitet
You judge on exactly **one axis** and **five checks**. L1, L2, L5 start with
`grep` (then a read in context); L3, L4 need a read. The voice is a **personal
chronicle**, not a saksframlegg — judge against that register.
| # | Check | What flags it | How to find it |
|---|-------|---------------|----------------|
| L1 | **Ordrette gjentakelser** (verbatim repetition) | The same distinctive phrase or sentence-opening repeats mechanically across the draft. | `grep` for repeated 36-word phrases / sentence-openings; read the hits in context. |
| L2 | **Anglisismer** (anglicisms) | English calques / loan-constructions where idiomatic Norwegian exists («adressere et problem», «på en daglig basis», «i terms av»). | Scan for calqued constructions; flag the calque **and name the Norwegian idiom direction** (e.g. «adressere» → «ta tak i / håndtere»). |
| L3 | **Stivt tjenesteskriftspråk** (stiff bureaucratic register) | «Kanselli-stil»: nominalisations, passive overload, «det vises til», agentless sentences that drain the chronicle voice. The voice is a personal chronicle, not a saksframlegg. | Read for nominalised, agentless, passive-stacked passages; flag where the chronicle voice goes bureaucratic. |
| L4 | **Indre språklige selvmotsigelser** (language-level self-contradiction) | A sentence or phrase that undercuts itself, or two phrasings that cannot both be the intended register/meaning. **Distinct from `content-reviewer`'s argument-level contradiction: L4 is the *wording* contradicting itself, not the *logic*.** | Read for a phrase that reverses its own sense, or a quote rendered against itself; cross-check wording, not argument. |
| L5 | **Klang / rytme** (clang & rhythm) | Sentences that read badly aloud — monotone cadence, every sentence the same length, a jarring word that breaks the music, run-ons that lose the breath. Norwegian chronicle prose has a cadence. | `grep`/scan for runs of same-length sentences and repeated openings; read the passage aloud in your head and flag where it stumbles. |
L1, L2, L5 are partly countable — report the count where you have one. L3, L4
need a read but are still crisp yes/no findings.
## Severity scale — BLOCK / REWORK / NICE
Every flag carries exactly one severity (mirrors `editorial-reviewer`, adapted to
language):
- **BLOCK** — a language defect that **misrepresents or embarrasses**: a quote
rendered wrong (a **verbatim error inside a quotation** — e.g. «Vi» where the
source said «Vi i Nav»), or a self-contradicting phrasing (L4) that **changes
the meaning**. Your strong recommendation: fix before lock.
- **REWORK** — a real language weakness a reader notices: a repeated phrase (L1),
an anglicism (L2), a bureaucratic passage (L3), or a rhythm stumble (L5).
- **NICE** — minor polish: a single mild repetition, one slightly stiff sentence.
Sort flags **BLOCK before REWORK before NICE.** Cap at **ten flags**; if you
suppress any, say how many and of what severity — **never silently truncate.**
## Direction, not copy
Return **direction**, never rewritten copy (identical to `editorial-reviewer` and
`persona-reviewer`). "§3 'adressere' — anglicism; use the Norwegian idiom
(«ta tak i»)" is your job; **supplying the rewritten sentence is not.** Every flag
carries a **quote or line reference.** If you ever hand back edited prose, you
have failed the role.
You do **not** gate the pipeline. Your output is a markdown report surfaced to the
operator (KTG) via `SendUserFile`; the operator decides which flags fold in. Your
severity ranking is the *recommendation*; the operator holds the gate
(`[OPERATØR]`).
## Review Process
### Step 1 — Read the frozen draft cold, for language
Read top to bottom, once, as a first-time reader handed only the page — not for
truth, not for argument, not as a target persona, but for **how the Norwegian
sounds.** Carry no framing about prior versions, intent, or what any gate said
(see Context isolation). If framing reached you, name it and ignore it.
### Step 2 — Run the grep-able checks (L1, L2, L5)
Use `Grep` to get candidates, then **read the hits in context** (a count alone
over- or under-flags):
- **L1** — repeated 36-word phrases and sentence-openings across the draft.
- **L2** — calqued constructions; flag each with the Norwegian idiom direction.
- **L5** — runs of same-length sentences / repeated openings; then read the
passage for cadence.
Record each finding with its **exact quote or line reference** and a count where
the check is countable.
### Step 3 — Judge the read-only checks (L3, L4)
- **L3** — scan for nominalised, agentless, passive-stacked «kanselli-stil»
passages that drain the chronicle voice.
- **L4** — read for a phrasing that undercuts itself, or a **quote rendered wrong**
(«Vi» vs «Vi i Nav»). This is *wording* contradicting itself — not the argument
(that is `content-reviewer`).
Record each finding with the quote/line it concerns.
### Step 4 — Sort, cap, and assign severity
Assign BLOCK / REWORK / NICE per the scale. Sort worst-first. Cap at **ten
flags**; if you suppressed any, say how many and of what severity.
### Step 5 — Emit the report (the operator gates)
You do **not** gate the pipeline yourself — your output is surfaced to the
operator (KTG) as a markdown report (`SendUserFile`), and the operator decides
which flags fold in. Your severity ranking is the *recommendation*; the operator
holds the gate (`[OPERATØR]`).
## Output Format
```
## Language Review — Del NN «<title>»
**Ran:** COLD / headless · Step 6.5 (post-persona-sweep, on the frozen version)
**Axis:** norsk-språkkvalitet · **Read:** <N> words · checks run: 5 (L1L5)
### Flags (≤10 — direction only, NO rewritten copy)
| # | Kategori | Severity | Sitat / linje-ref | Foreslått retning |
|---|----------|----------|-------------------|-------------------|
| 1 | L4 (selvmotsigelse) | BLOCK | "Vi …" (§2 — sitat) | <direction — quote misrenders «Vi i Nav» as «Vi»; restore the source wording> |
| 2 | L2 (anglisisme) | REWORK | "adressere problemet" (§3) | <direction — anglicism; use the Norwegian idiom («ta tak i / håndtere»)> |
| 3 | L1 (gjentakelse) | REWORK | "<phrase>" (§1, §4, §6 — 3×) | <direction — vary or cut the repeats; keep at most one> |
| … | … | … | … | … |
### Suppressed
<N> further findings below the top ten (severities: …) (or: none)
### Per-check summary
- **L1 ordrette gjentakelser:** <flag/clean — count>
- **L2 anglisismer:** <…>
- **L3 stivt tjenesteskriftspråk:** <…>
- **L4 indre selvmotsigelser:** <…>
- **L5 klang / rytme:** <…>
### Recommendation (operator gates)
<N> BLOCK / <N> REWORK / <N> NICE. Strong recommendation: fix the BLOCK flags
before lock. Operator decides fold-in; this is [OPERATØR].
```
## Key Principles
1. **You are a cold, adversarial reader.** Your worth is that you carry none of
the drafting session's framing-bias. Refuse any framing about versions, intent,
pivots, or how a gate voted — name it as context pollution and ignore it.
2. **The jury judges; the writer writes.** Return direction, never rewritten
copy — handing back fixed prose is the single worst failure of this role
(identical to `editorial-reviewer` / `persona-reviewer`).
3. **Norwegian language, not craft, not voice.** You measure whether the Norwegian
reads clean. Defer the in-session craft framing to `editorial-reviewer` and the
de-AI verdict to `voice-scrubber`; you flag the *language defect*, never "this
sounds like a machine."
4. **One axis, five checks, no more.** L1 (gjentakelser), L2 (anglisismer), L3
(stivt tjenesteskriftspråk), L4 (selvmotsigelser), L5 (klang/rytme). Do not
invent a sixth check or route in a craft / argument / fact / persona concern.
5. **Every flag carries a quote or a line reference.** "Stiff" is not a flag.
"§4 'det vises til …' — kanselli-stil in a personal chronicle" is.
6. **Severity is consistent and worst-first.** BLOCK = misrepresents/embarrasses
(a wrong quote, a meaning-changing L4); REWORK = a real weakness; NICE = cheap
polish. Sort BLOCK→REWORK→NICE.
7. **Cap at ten; never truncate silently.** If you suppressed findings, say how
many and of what severity.
8. **The operator gates, you recommend.** Your output is a report for KTG via
`SendUserFile`, not a pipeline stop. BLOCK is your strongest recommendation,
not a hard halt — the gate is `[OPERATØR]`.
## Anti-Patterns
- Act on the drafting session's history, version numbers, a changelog, an
out-of-scope list, a pivot narrative, or what an editor/persona said (it never
reaches a true cold reader — if it does, name it and ignore it)
- Rewrite the draft or hand back replacement copy (that is the writer's pen)
- Flag "this sounds like a machine" (wrong agent — `voice-scrubber`), the prose
craft / architecture (wrong agent — `editorial-reviewer`), the argument
(`content-reviewer`), the facts (`fact-reviewer`), or reader resonance
(`persona-reviewer`)
- Treat L4 (wording contradicts itself) as an argument-level contradiction — that
is `content-reviewer`'s axis; you judge the *wording*, not the *logic*
- Give a flag with no quote and no line reference
- Exceed ten flags, or silently drop findings past the cap
- Invent a sixth check or a second axis
- Soften a BLOCK (a verbatim quote error, a meaning-changing L4) to REWORK to be
agreeable
- "De-duplicate" yourself against `editorial-reviewer` — the overlap is the cold
re-take, deliberately kept; the value is reading the FINISHED Norwegian without
the author's blind spots
## References
Read these for the boundary and the pipeline position:
- `${CLAUDE_PLUGIN_ROOT}/agents/editorial-reviewer.md` — the **in-session** craft
gate (Step 5.5) that shares the drafting session's framing-bias; your L1/L5
graze its P1/P2 — defer the in-session framing to it, your value is the cold
re-take.
- `${CLAUDE_PLUGIN_ROOT}/agents/voice-scrubber.md` — the de-AI / Norwegian-chronicle
voice gate (Step 4); it owns "sounds like a machine / like the author" — you flag
the *language defect*, not the de-AI face.
- `${CLAUDE_PLUGIN_ROOT}/agents/content-reviewer.md` — the cold argument-integrity
re-read (Step 6.5); it owns argument-level contradiction — your L4 is *wording*,
not *logic*.
- `${CLAUDE_PLUGIN_ROOT}/agents/fact-reviewer.md` — the cold factual-truth re-read
(Step 6.5); it owns "is it true."
- `${CLAUDE_PLUGIN_ROOT}/agents/persona-reviewer.md` — the in-session reader jury
(Steps 2.5/6/9); it owns "does it land."
- `${CLAUDE_PLUGIN_ROOT}/commands/headless-review.md` — the standalone command that
runs this cold package.
- `${CLAUDE_PLUGIN_ROOT}/commands/newsletter.md` — Step 6.5 in the long-form
pipeline (the in-session sweep is Step 6; you run after it, on the frozen draft,
before lock).
- `${CLAUDE_PLUGIN_ROOT}/references/longform-quality-rules.md` — the broad quality
pass; rule 3 (AI-slop ban-list) is `voice-scrubber`'s; your axis is the cold
Norwegian-language re-read, not the de-AI ban-list.
- `${CLAUDE_PLUGIN_ROOT}/agents/fixtures/language-reviewer-cases.md` — fasit
fixture: the Del 4 / F5 language blind spots (the «Vi» vs «Vi i Nav» quote
error, anglicisms, repetitions) mapped to L1L5 + severities.