Kjell Tore Guttormsen 9df3de795c feat(linkedin): v2.4.0 — editorial-reviewer agent + Step 5.5 craft gate in /linkedin:newsletter

Endring 8 from the change spec (Del 4 production, Maskinrommet). The persona
resonance sweep measures reader-response (does it land?); nothing measured prose
craft or narrative architecture (is it well-made?). In Del 4 every persona
reported PASS, yet the editor found 8 fresh editorial points on first reading —
~6/8 craft/architecture blind spots no agent could see. v2.4.0 adds the missing
editor role.

New Step 5.5 (editorial-review) runs between fact-check (Step 5) and the persona
sweep (Step 6): a new editorial-reviewer agent (Opus) judges two axes —
prosa-handverk (em-dash density, verbatim repetition, postulated numbers,
contradictions, versal-tic) + narrativ-arkitektur (concrete instantiation,
theory-anchored hypotheses, series-title symmetry, equal action per addressee,
un-overloaded conclusion). Returns <=10 flags as direction (never copy), each
BLOCK/REWORK/NICE, operator-gated via SendUserFile. Runs before the persona
sweep so the personas measure resonance instead of stumbling on craft noise.
Mirrors the Maskinrommet writing-contract section C2 (bidirectional mirror rule).

- agents/editorial-reviewer.md (NEW, Opus, orange) + fasit fixture
  (editorial-reviewer-cases.md: Del 4 v5 gold standard, 8 points -> 2 axes +
  severities, 3 BLOCK / 5 REWORK, 6/8 blind spots) + structural lint (7 tests).
- Step 5.5 wired into commands/newsletter.md; pipeline 14 -> 15 phases.
- editorial-review phase + additive editorialReview state in
  config/edition-state.template.json; resumption: factcheck-sweep -> Step 5.5,
  editorial-review -> Step 6 (spec said fact-check; canonical key is
  factcheck-sweep).
- persona-reviewer contract unchanged: editorial-reviewer is supplementary
  (one measures craft, one measures response).
- All doc levels synced (plugin + root README/CLAUDE.md, CHANGELOG, plugin.json
  2.3.0 -> 2.4.0; agents 15 -> 16). 94 tests green.

Acceptance-criterion #8 (live run on Del 4 v5) delivered as fasit fixture:
a live run needs a session reload (new agent not invokable until then) + read
access to the Del 4 v5 draft in Maskinrommet.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

2026-05-29 06:17:50 +02:00

7.8 KiB


Editorial-Reviewer Fasit Fixture

The Del 4 production round (Maskinrommet, 2026-05-28) serves as the gold standard for the editorial-reviewer agent. The persona resonance sweep returned 15 flags across three personas (Linjeleder C primary, KI-seksjon B secondary, IT-direktør A secondary), and every persona reported PASS / ready-to-publish. KTG, the editor, then found eight editorial points on first reading, and only two of them (~25%) overlapped anything the personas had touched. The other six were craft/architecture blind spots. Those eight points are the fasit (answer key) below: a correct editorial-reviewer run on Del 4 v5 should surface comparable flags, mapped to the two axes with consistent severities.

This file is a fasit, not a test harness. The structural lint lives in agents/__tests__/editorial-reviewer-fixture.test.mjs. Whether the agent's live flags actually reproduce these directions is a [GATE]/[OPERATØR] call; it is not self-certified here.

The jury judges; the writer writes. Every expected output below is direction, not rewritten copy. A correct agent run hands back flags, each with a severity, and never edited prose. (Foreslått retning, i.e. a suggested direction, not a new sentence.)

Why this gate exists. A persona PASS measures reader response, not craft. Six of these eight points were invisible to the persona sweep because they are architecture and prose-craft defects, not resonance defects. The numbers tell the story: persona overlap ≈ 2/8; editorial-only ≈ 6/8.

The two axes (the agent judges on exactly these)

Both axes are the operationalized mirror of the Maskinrommet skrivekontrakt §C2 — §C2 is the source of truth; this fixture and the agent's checklist follow it. (§C2 is the craft half of the writing contract; §A — skeleton-before-prose — is codified in references/longform-quality-rules.md rule 8.)

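Prosa-håndverk (prose craft; mechanical, grep-able): P1 tankestrek-tetthet (em-dash density) · P2 ordrette gjentakelser (verbatim repetition) · P3 postulerte tall uten kilde/hedge (postulated numbers without source or hedge) · P4 indre selvmotsigelse (internal contradiction) · P5 versal-tic (capital-letter tic).
Narrativ-arkitektur (narrative architecture; evaluative): A1 konkret instansiering (concrete instantiation) · A2 teori-anker for hypoteser (theory anchor for hypotheses) · A3 serietittel-symmetri (series-title symmetry) · A4 like-brukbar handling per adressat (equally usable action per addressee) · A5 konklusjon ikke overlastet (conclusion not overloaded).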

Severity (every flag carries exactly one)

BLOCK — misrepresents the piece or loses the takeaway.
REWORK — a real craft/architecture weakness, not load-bearing-fatal.
NICE — cheap polish, fold in if convenient.
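The axis codes and severities above lend themselves to a small well-formedness check on a run's output. The following is a minimal sketch, not the agent's actual output schema: the `Flag` record, its field names, and `validate_flags` are hypothetical names introduced for illustration. Only the ten check codes, the three severities, and the cap of ten flags come from this fixture.

```python
from dataclasses import dataclass

# Check codes and severities as listed in this fixture.
PROSE_CHECKS = {"P1", "P2", "P3", "P4", "P5"}
ARCHITECTURE_CHECKS = {"A1", "A2", "A3", "A4", "A5"}
SEVERITIES = {"BLOCK", "REWORK", "NICE"}
MAX_FLAGS = 10  # the <=10 cap on flags per run


@dataclass(frozen=True)
class Flag:
    """Hypothetical flag record; the field names are illustrative only."""
    check: str      # e.g. "A2"
    severity: str   # exactly one of SEVERITIES
    direction: str  # direction for the writer, never rewritten copy


def validate_flags(flags):
    """Return a list of problems; an empty list means the run is well-formed."""
    problems = []
    if len(flags) > MAX_FLAGS:
        problems.append(f"too many flags: {len(flags)} > {MAX_FLAGS}")
    for i, flag in enumerate(flags):
        if flag.check not in PROSE_CHECKS | ARCHITECTURE_CHECKS:
            problems.append(f"flag {i}: unknown check {flag.check!r}")
        if flag.severity not in SEVERITIES:
            problems.append(f"flag {i}: unknown severity {flag.severity!r}")
        if not flag.direction.strip():
            problems.append(f"flag {i}: empty direction")
    return problems
```

A check like this only confirms the shape of a run. Whether the directions themselves are sound remains an operator judgment.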

The eight Del 4 editorial points (fasit)

Each case states the point KTG raised, the axis it belongs to, the expected severity, and the direction a correct agent run returns. The persona-overlap entry records whether the persona sweep had already touched the point, even partially (delvis). The cases marked BLINDSONE (blind spot) are the reason the gate is needed.

Case 1 — manglende konkret eksempel (abstract figure never instantiated)

Axis: A1 (arkitektur) · Severity: REWORK
Persona overlap: delvis (KI-seksjon B brushed it) — not a pure blind spot.
Fasit / direction: An abstract figure carries the section but never lands on one concrete case the reader can picture. Direction: instantiate with a single verifiable (preferably Norwegian) case — do not list several. The agent flags the absence of instantiation; it does not supply the case.

Case 2 — postulert tall uten kilde eller hedge (postulated number)

Axis: P3 (prosa) · Severity: REWORK
Persona overlap: delvis (IT-direktør A brushed it).
Fasit / direction: A specific figure is stated as fact with neither an inline source marker nor a hedge. This is distinct from a fact-check finding: Step 5 verifies numbers that have a provenance; here the provenance is simply absent. Direction: source it or hedge it (e.g. "anslagsvis", roughly "an estimated"); otherwise cut it.

Case 3 — manglende SDT-anker for tillit-effekt (unanchored hypothesis) — BLINDSONE

Axis: A2 (arkitektur) · Severity: BLOCK
Persona overlap: none — pure blind spot.
Fasit / direction: A psychological hypothesis about a trust effect is asserted as if established, with no named theory anchor (e.g. Self-Determination Theory) and no explicit hedge. A hypothesis dressed as a finding is an architecture defect. Direction: anchor in a named model OR mark it explicitly as hypothesis.

Case 4 — brutt serietittel-kobling (broken series-title symmetry) — BLINDSONE

Axis: A3 (arkitektur) · Severity: REWORK
Persona overlap: none — pure blind spot.
Fasit / direction: The part does not bind back to the series premise / its own title — it floats free of the whole. Direction: tie the part's argument back to the series title so the reader feels the part-of-a-whole. (N/A only for a standalone edition; Del 4 is part of a series, so it applies.)

Case 5 — manglende småbedrifts-tommelfingerregel (stranded addressee) — BLINDSONE

Axis: A4 (arkitektur) · Severity: BLOCK
Persona overlap: none — pure blind spot.
Fasit / direction: The text addresses more than one reader but the actionable takeaway only serves one; the small-business reader leaves with nothing they can do. Direction: add a small-business rule of thumb so each addressee gets an equally-usable action. (Stranding an addressee = BLOCK: that reader has no takeaway at all.)

Case 6 — ordrette gjentakelser (verbatim repetition) — BLINDSONE

Axis: P2 (prosa) · Severity: REWORK
Persona overlap: none — pure blind spot.
Fasit / direction: A distinctive phrase recurs more than twice. Direction: vary or cut the repeats; keep at most the one load-bearing use. grep-findable.
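Because this check is described as grep-findable, a rough mechanical pass can be sketched in a few lines. This is an illustrative approximation under stated assumptions, not the agent's method: the function name `repeated_phrases`, the three-word window, and the lowercase normalisation are all assumptions. It counts every n-word sequence and cannot tell a distinctive phrase from a common one, so its output is a candidate list for a human or agent to judge.

```python
import re
from collections import Counter


def repeated_phrases(text, n=3, max_uses=2):
    """Return {phrase: count} for n-word phrases used more than max_uses times.

    Assumption: a "phrase" is approximated as any run of n consecutive words,
    compared case-insensitively and ignoring punctuation.
    """
    words = re.findall(r"\w+", text.lower())
    grams = (" ".join(words[i:i + n]) for i in range(len(words) - n + 1))
    counts = Counter(grams)
    return {phrase: c for phrase, c in counts.items() if c > max_uses}
```

The default `max_uses=2` mirrors the "recurs more than twice" wording of the case. Raising `n` narrows the results toward longer, more distinctive phrases.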

Case 7 — tankestrek-tetthet (em-dash over-density) — BLINDSONE

Axis: P1 (prosa) · Severity: REWORK
Persona overlap: none — pure blind spot.
Fasit / direction: Em-dashes run above ~1 per 50 words (clusters within paragraphs). Direction: thin them to the local target; the em-dash is a tool, not a tic. Report the count. grep-findable.
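The density threshold in this case can be checked mechanically. The sketch below assumes the dash in question is the Unicode em-dash (U+2014) and that a word is any run of word characters. The function names are hypothetical, and the paragraph split on blank lines is an assumption about how the draft is laid out.

```python
import re

EM_DASH = "\u2014"
WORDS_PER_DASH = 50  # the "~1 per 50 words" threshold from this case


def em_dash_density(text):
    """Return (dash_count, word_count, over_threshold) for one text."""
    dashes = text.count(EM_DASH)
    words = len(re.findall(r"\w+", text))
    # Over the threshold when there is more than one dash per 50 words.
    over = dashes * WORDS_PER_DASH > words
    return dashes, words, over


def flag_paragraphs(text):
    """Yield (index, dashes, words) for each paragraph above the threshold."""
    for i, para in enumerate(re.split(r"\n\s*\n", text)):
        dashes, words, over = em_dash_density(para)
        if over:
            yield i, dashes, words
```

Reporting per paragraph rather than per document is a design choice that matches the "clusters within paragraphs" wording: a draft can sit under the overall average while a single paragraph is saturated.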

Case 8 — indre selvmotsigelse (internal contradiction) — BLINDSONE

Axis: P4 (prosa) · Severity: BLOCK
Persona overlap: none — pure blind spot.
Fasit / direction: Two passages cannot both be true (an assertion the conclusion silently reverses). Direction: name the contradiction and resolve one side — a contradiction misrepresents the piece, so BLOCK.

Expected aggregate (what a correct run looks like)

Total flags: 8 (well within the ≤10 cap — no suppression needed).
By axis: prosa-håndverk = 4 (P1, P2, P3, P4) · narrativ-arkitektur = 4 (A1, A2, A3, A4). A5 (overloaded conclusion) and P5 (versal-tic) did not flag on Del 4 v5 — record them clean.
By severity: BLOCK = 3 (A2, A4, P4) · REWORK = 5 (A1, P3, A3, P2, P1) · NICE = 0.
Persona overlap: 2/8 (Cases 1 + 2, both delvis) · editorial-only blind spots: 6/8 (Cases 3–8). This 6/8 is the quantified case for the gate.
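The aggregate above follows directly from the eight cases, and the arithmetic can be re-derived with a short tally. The tuples below are transcribed from the case entries in this fixture. The `FASIT` table layout and the `aggregate` function are hypothetical conveniences for illustration, not part of the repository's lint.

```python
from collections import Counter

# (case number, check, severity, persona overlap), transcribed from the cases.
FASIT = [
    (1, "A1", "REWORK", "delvis"),
    (2, "P3", "REWORK", "delvis"),
    (3, "A2", "BLOCK", "none"),
    (4, "A3", "REWORK", "none"),
    (5, "A4", "BLOCK", "none"),
    (6, "P2", "REWORK", "none"),
    (7, "P1", "REWORK", "none"),
    (8, "P4", "BLOCK", "none"),
]


def aggregate(cases):
    """Tally cases by axis and severity and list the pure blind spots."""
    by_axis = Counter(
        "prosa" if check.startswith("P") else "arkitektur"
        for _, check, _, _ in cases
    )
    by_severity = Counter(severity for _, _, severity, _ in cases)
    blind_spots = [num for num, _, _, overlap in cases if overlap == "none"]
    return {
        "total": len(cases),
        "by_axis": dict(by_axis),
        "by_severity": dict(by_severity),
        "blind_spots": blind_spots,
    }
```

Running `aggregate(FASIT)` gives 8 flags in total, 4 per axis, 3 BLOCK and 5 REWORK, with Cases 3 through 8 as blind spots, which agrees with the figures listed above.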

A run that reproduces ~these eight directions, on ~these axes, with ~these severities, is comparable to KTG's actual editorial round — the bar acceptance-criterion #8 sets. Exact wording is the editor's; the agent returns direction, never copy.

Calibration boundary

Whether the agent's live flags truly match this fasit is judged by the operator ([OPERATØR]), not self-certified here. This fixture is the calibration target, the same way persona-reviewer-cases.md and fact-checker-cases.md are fasits for their agents.

Live-run note. A live run on Del 4 v5 requires (a) a Claude Code session reload, because a freshly added agent is not invokable until the plugin agent set is rebuilt at session start, and (b) read access to the Del 4 v5 draft in the Maskinrommet series folder. Until both hold, this fixture is the gold standard of record.
