ktg-plugin-marketplace/plugins/linkedin-studio/agents/fixtures/language-reviewer-cases.md
Kjell Tore Guttormsen e69ea1f4c9 feat(linkedin-studio): v3.1.0 — Endring 9 adversarial review-pakke + per-artefakt personas
Cold, adversarial review package for the long-form pipeline + configurable
per-edition personas. Motivated by Del 4 (Security Champions pivot): the
in-session editor + persona sweep shared the drafting session's framing-bias,
so the shipped version was never independently re-reviewed.

Headless package (9a/9b):
- New Step 6.5 (headless-review) in /linkedin:newsletter, after the persona
  sweep, before lock — the independence layer the in-session gates can't be.
- New standalone /linkedin:headless-review command (run in a fresh session for
  maximum isolation; reconstructs frozen draft + contract + personas from disk).
- 3 new Opus archetypes, each with a cardinal context-isolation block that
  refuses drafting-session framing as "context pollution":
  - content-reviewer (argument integrity C1–C5, ≤8 flags)
  - language-reviewer (Norwegian language L1–L5, ≤10 flags)
  - fact-reviewer (cold re-verification F1–F4, risk-sort + pivot-risk, WebSearch)
- Deliberate redundancy with fact-checker / editorial-reviewer documented so
  the pairs are never de-duplicated.

Pivot-reopen (9c):
- New /linkedin:pivot command: logs articles.NN.pivots[], resets currentPhase,
  un-locks, marks gates to re-run.
- Pivot-detection gate in Step 8 lock precondition (>20% word-count change or
  >2 new sections re-opens cleared gates). Del 4 v8→v11 worked example.

Per-artifact personas (new requirement):
- articles.NN.personas with resolution order (edition-state → series file →
  plugin library → interactive). One or more readers configurable per edition.

Schema/docs:
- edition-state.template.json: additive personas[], pivots[], headlessReview,
  headless-review phase (16 phases); personaSweep.resonance.wordCount baseline.
- 3 fasit fixtures + 3 structural lint tests (Del 4 worked cases).
- Counts: 24→26 commands, 16→19 agents, 15→16 newsletter phases.
- README + CLAUDE.md (plugin + root) + CHANGELOG synced.

Verification: 35 agent-fixture + 59 hook + 20 render tests green. Backward-
compatible (additive state); reload required before the 3 new agents resolve.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-29 13:01:24 +02:00


Language-Reviewer Fasit Fixture

This fixture records the Del 4 production round (Security Champions, Maskinrommet, 2026-05-29) as the gold standard for the language-reviewer agent. By Step 6 the in-session persona resonance sweep had returned PASS across the personas and the in-session craft gate (editorial-reviewer, Step 5.5) had run. Both ran inside the drafting session, and both shared its framing-bias. On a cold, first-time reading of the frozen draft (the F5 finding), the editor then caught Norwegian-language defects that the in-session passes had all read straight past: a verbatim quote error («Vi» where the source said «Vi i Nav»), anglicisms, and verbatim repetitions across sections. Those defects are the fasit (answer key) below: a correct language-reviewer run on the Del 4 frozen draft should surface comparable flags, mapped to the single axis with consistent severities.

This file is a fasit, not a test harness. The structural lint lives in agents/__tests__/language-reviewer-fixture.test.mjs. Whether the agent's live flags actually reproduce these directions is [GATE]/[OPERATØR] — it is not self-certified here.

The jury judges; the writer writes. Every expected output below is direction, not rewritten copy. A correct agent run hands back flags, each with a severity, and never edited prose. (Foreslått retning, i.e. a suggested direction, not a new sentence.)

Why this gate exists — the cold re-read. The in-session gates (fact-check, craft, persona) all ran while the drafting session's framing-bias was still in the room: the same blind spots that let the author miss «Vi» vs «Vi i Nav» let those gates miss it too. language-reviewer is run in a cold context with no access to version history, intent, pivots, or how any gate voted — exactly so it carries none of that bias. Any such framing that reaches it is context pollution to be named and ignored. A cold Norwegian re-read catches what the bias hid. That is the F5 finding made into a gate.


The axis (the agent judges on exactly this)

Axis: norsk-språkkvalitet (Norwegian language quality) — one axis, five checks. L1, L2, L5 start grep-able; L3, L4 need a read. The voice under judgment is a personal chronicle, not a saksframlegg.

  • L1 — Ordrette gjentakelser (verbatim repetition): the same distinctive phrase or sentence-opening repeats mechanically across the draft (grep for repeated phrases of 3 to 6 words, then read each hit in context).
  • L2 — Anglisismer (anglicisms): English calques / loan-constructions where idiomatic Norwegian exists («adressere et problem», «på en daglig basis», «i terms av»). Flag the calque and name the Norwegian idiom direction.
  • L3 — Stivt tjenesteskriftspråk (stiff bureaucratic register): «kanselli-stil» — nominalisations, passive overload, «det vises til», agentless sentences that drain the chronicle voice.
  • L4 — Indre språklige selvmotsigelser (language-level self-contradiction): a sentence/phrase that undercuts itself, or two phrasings that cannot both be the intended register/meaning. The wording contradicting itself — not the argument-level logic (that is content-reviewer).
  • L5 — Klang / rytme (clang & rhythm): sentences that read badly aloud — monotone cadence, every sentence the same length, a jarring word, run-ons that lose the breath.
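
The grep-able start of L1 can be sketched as a word n-gram count. This is a minimal illustration, not the agent's actual implementation: the function name `repeated_phrases`, the 3 to 6 word window and the `min_count` threshold are all assumptions made for the example.

```python
import re
from collections import defaultdict


def repeated_phrases(text, min_n=3, max_n=6, min_count=2):
    """Return word n-grams (min_n..max_n words) occurring at least min_count times.

    Hypothetical helper: the window sizes and threshold are assumptions,
    not values taken from the language-reviewer agent.
    """
    # \w is Unicode-aware for str patterns, so æ, ø and å count as word characters.
    words = re.findall(r"\w+", text.lower())
    counts = defaultdict(int)
    for n in range(min_n, max_n + 1):
        for i in range(len(words) - n + 1):
            counts[" ".join(words[i:i + n])] += 1
    return {phrase: c for phrase, c in counts.items() if c >= min_count}


if __name__ == "__main__":
    sample = (
        "Vi må ta tak i dette nå. Senere: vi må ta tak i saken. "
        "Til slutt, vi må ta tak i alt."
    )
    for phrase, count in sorted(repeated_phrases(sample).items()):
        print(count, phrase)
```

The output is only a candidate list. Shorter n-grams nested inside a longer repeat are reported too, and whether a repeat is mechanical or load-bearing still needs the read in context that the check calls for.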

Severity (every flag carries exactly one)

  • BLOCK — misrepresents or embarrasses: a quote rendered wrong (a verbatim error inside a quotation — «Vi» vs «Vi i Nav»), or a self-contradicting phrasing (L4) that changes the meaning.
  • REWORK — a real language weakness a reader notices: a repeated phrase (L1), an anglicism (L2), a bureaucratic passage (L3), a rhythm stumble (L5).
  • NICE — cheap polish: a single mild repetition, one slightly stiff sentence.

Direction, not copy (the boundary)

Every expected output is direction, not rewritten copy: "§3 'adressere' — anglicism; use the Norwegian idiom («ta tak i»)" is the agent's job; supplying the rewritten sentence is not. Each flag carries a quote or line reference.
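
One way to picture the contract (one check, exactly one severity, a quote or line reference, and a direction rather than copy) is a small record type. The sketch below is an assumption about shape only: the class name `Flag` and its field names are hypothetical, not the agent's real output schema.

```python
from dataclasses import dataclass

CHECKS = ("L1", "L2", "L3", "L4", "L5")
SEVERITIES = ("BLOCK", "REWORK", "NICE")


@dataclass(frozen=True)
class Flag:
    """Hypothetical flag record; field names are illustrative assumptions."""

    check: str       # one of L1..L5
    severity: str    # exactly one of BLOCK / REWORK / NICE
    reference: str   # a quote or a line reference
    direction: str   # what to fix, never the rewritten sentence

    def __post_init__(self):
        if self.check not in CHECKS:
            raise ValueError(f"unknown check: {self.check!r}")
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity!r}")
        if not self.reference.strip():
            raise ValueError("a flag must carry a quote or line reference")


if __name__ == "__main__":
    flag = Flag(
        check="L2",
        severity="REWORK",
        reference="§3 «adressere»",
        direction="Anglicism; use the Norwegian idiom («ta tak i»)",
    )
    print(flag)
```

The type can enforce that a reference exists, but not that `direction` stays direction. That boundary remains a judgment for the operator.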


The six Del 4 language points (fasit)

Each case states the point the editor raised on the cold reading, the check it belongs to, the expected severity, and the direction a correct agent run returns. These are language blind spots — distinct from craft (editorial-reviewer), de-AI / voice (voice-scrubber), and reader response (persona-reviewer). They survived to the cold pass precisely because the in-session gates shared the author's framing-bias.

Case 1 — sitat gjengitt feil: «Vi» i stedet for «Vi i Nav» (verbatim quote error)

  • Check: L4 (language-level self-contradiction / verbatim quotation error) · Severity: BLOCK
  • Cold-read finding: A quotation in the chronicle is rendered «Vi …» where the source said «Vi i Nav …». The clipped quote changes who "vi" refers to — the wording now misrepresents the source. (Maps to L4 as a wording-level self-contradiction; the same defect could be filed under L1 as a near-verbatim repetition of the source gone wrong — the agent files it once, as the BLOCK it is.)
  • Fasit / direction: Quote misrenders «Vi i Nav» as «Vi»; restore the source wording. A misquote misrepresents the piece, so BLOCK. The agent flags the wrong rendering; it does not supply the corrected sentence.
  • Why blind to the in-session gates: the persona sweep measured whether the passage landed (it did — PASS); none of the in-session gates re-checked the quote against the source on a cold reading. This is the canonical F5 finding.

Case 2 — anglisisme: «adressere problemet» (anglicism)

  • Check: L2 (anglicisms) · Severity: REWORK
  • Cold-read finding: «adressere et problem» is an English calque (to address a problem) where idiomatic Norwegian reads «ta tak i / håndtere / ta opp».
  • Fasit / direction: Anglicism; use the Norwegian idiom («ta tak i» / «håndtere»). Name the idiom direction, do not write the sentence.
  • Why blind: an anglicism reads fluently to a reader inside the drafting session — the calque sounds like normal prose until a cold ear hits it.

Case 3 — anglisisme: «på en daglig basis» (anglicism)

  • Check: L2 (anglicisms) · Severity: REWORK
  • Cold-read finding: «på en daglig basis» is a calque of on a daily basis; idiomatic Norwegian is «daglig» / «til daglig».
  • Fasit / direction: Anglicism; collapse to the Norwegian adverb («daglig»). Direction only.
  • Why blind: same mechanism as Case 2 — a second calque the in-session passes read straight through. Two L2 flags is itself a signal the draft drifted into English construction.
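
The grep-able first pass for the two L2 cases above might look like the sketch below. The table `CALQUES`, its regular expressions and the function `find_anglicisms` are assumptions made for illustration. The idiom directions are the ones named in Cases 2 and 3, and a real calque list would be longer and curated by the editor.

```python
import re

# Calque pattern -> Norwegian idiom direction (from Cases 2 and 3).
# The patterns themselves are illustrative assumptions.
CALQUES = {
    r"\badresser\w*\s+(?:et\s+)?problem\w*": "ta tak i / håndtere",
    r"\bpå\s+en\s+daglig\s+basis\b": "daglig / til daglig",
}


def find_anglicisms(text):
    """Return (matched text, idiom direction) pairs for known calques."""
    hits = []
    for pattern, direction in CALQUES.items():
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            hits.append((match.group(0), direction))
    return hits


if __name__ == "__main__":
    sample = "Vi må adressere problemet på en daglig basis."
    for quote, direction in find_anglicisms(sample):
        print(f"L2 REWORK: «{quote}», direction: {direction}")
```

A pattern table only catches calques someone has already listed. Anything new still depends on the cold read.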

Case 4 — ordrette gjentakelser: samme frase 3× på tvers av seksjoner (verbatim repetition)

  • Check: L1 (verbatim repetition) · Severity: REWORK
  • Cold-read finding: A distinctive phrase recurs three times across §1, §4 and §6, mechanically rather than as a load-bearing refrain. grep-findable as a repeated string of 3 to 6 words.
  • Fasit / direction: Vary or cut the repeats; keep at most the one load-bearing use. Report the count (3×).
  • Why blind: a reader inside the session sees each section in isolation; the repetition only shows when a cold reader takes the whole draft at once. This is the verbatim-repetition half of the F5 finding.

Case 5 — stivt tjenesteskriftspråk: «det vises til»-passasje i en personlig krønike (stiff bureaucratic register)

  • Check: L3 (stiff bureaucratic register / «kanselli-stil») · Severity: REWORK
  • Cold-read finding: A passage slides into saksframlegg register — «det vises til», nominalised, agentless, passive-stacked — inside a piece whose voice is a personal chronicle. The register break drains the chronicle voice.
  • Fasit / direction: Kanselli-stil in a personal chronicle; restore an agent and an active verb so the passage reads as the chronicle, not a memo. Direction only. (This is a language-register defect, distinct from voice-scrubber's de-AI tells and from editorial-reviewer's craft.)
  • Why blind: bureaucratic register is the author's professional default; inside the session it reads as "normal," and only a cold ear hears it clash with the chronicle voice.

Case 6 — klang / rytme: fem like lange setninger på rad (monotone cadence)

  • Check: L5 (clang & rhythm) · Severity: NICE
  • Cold-read finding: A run of five sentences shares the same length and a near-identical opening — a monotone cadence that reads flat aloud. Chronicle prose has a varied cadence; this passage loses it.
  • Fasit / direction: Break the monotone — vary one or two sentence lengths / openings so the passage breathes. NICE: noticeable on a read-aloud, not load-bearing. grep/scan-findable (same-length run, repeated opening).
  • Why blind: rhythm is heard, not seen; a silent in-session read of a fluent passage never trips on it. A cold read-aloud does.
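
The scan-findable part of Case 6 (a run of same-length sentences) can be approximated as below. This is a rough sketch under stated assumptions: the name `monotone_runs`, the naive sentence splitter, the run length of five and the one-word tolerance are all illustrative choices, and the repeated-opening signal is not covered.

```python
import re


def monotone_runs(text, run_length=5, tolerance=1):
    """Find runs of consecutive sentences with near-equal word counts.

    Returns (start_index, sentences) tuples. The thresholds are
    assumptions for illustration, not calibrated values.
    """
    # Naive split on sentence-final punctuation followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    runs = []
    for i in range(len(lengths) - run_length + 1):
        window = lengths[i:i + run_length]
        if max(window) - min(window) <= tolerance:
            runs.append((i, sentences[i:i + run_length]))
    return runs


if __name__ == "__main__":
    sample = (
        "Vi gikk til jobb. Vi satt i møte. Vi drakk litt kaffe. "
        "Vi leste noen saker. Vi gikk så hjem. "
        "Dagen etter var alt annerledes enn vi hadde trodd."
    )
    for start, run in monotone_runs(sample):
        print(f"monotone run starting at sentence {start}: {len(run)} sentences")
```

Word counts are a crude proxy for cadence. A hit is a prompt to read the passage aloud, not a verdict.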

Expected aggregate (what a correct run looks like)

  • Total flags: 6 (well within the ≤10 cap — no suppression needed).
  • By check: L1 = 1 (Case 4) · L2 = 2 (Cases 2 + 3) · L3 = 1 (Case 5) · L4 = 1 (Case 1) · L5 = 1 (Case 6).
  • By severity: BLOCK = 1 (Case 1, the quote error) · REWORK = 4 (Cases 2, 3, 4, 5) · NICE = 1 (Case 6).
  • All six are language blind spots — none is a craft defect (editorial), a de-AI / voice defect (voice-scrubber), an argument defect (content-reviewer), a factual defect (fact-reviewer), or a resonance defect (persona). They survived to the cold pass because the in-session gates shared the author's framing-bias; the cold Norwegian re-read is what caught them.

A run that reproduces ~these six directions, on ~these checks, with ~these severities, is comparable to the editor's actual cold reading of Del 4 — the acceptance bar. Exact wording is the editor's; the agent returns direction, never copy.
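
The expected aggregate can be tallied mechanically. The sketch below encodes the six cases as (check, severity) pairs and recomputes the totals. The names `EXPECTED`, `MAX_FLAGS` and `summarize` are assumptions made for the example; this is not the structural lint in the test file.

```python
from collections import Counter

# (check, severity) for Cases 1..6, as listed in this fixture.
EXPECTED = [
    ("L4", "BLOCK"),   # Case 1: quote error
    ("L2", "REWORK"),  # Case 2: «adressere problemet»
    ("L2", "REWORK"),  # Case 3: «på en daglig basis»
    ("L1", "REWORK"),  # Case 4: phrase repeated 3×
    ("L3", "REWORK"),  # Case 5: kanselli-stil passage
    ("L5", "NICE"),    # Case 6: monotone cadence
]
MAX_FLAGS = 10  # the language-reviewer flag cap


def summarize(flags):
    """Tally flags by check and by severity and test them against the cap."""
    return {
        "total": len(flags),
        "by_check": dict(Counter(check for check, _ in flags)),
        "by_severity": dict(Counter(severity for _, severity in flags)),
        "within_cap": len(flags) <= MAX_FLAGS,
    }


if __name__ == "__main__":
    print(summarize(EXPECTED))
```

Comparing a live run's summary against this one only shows that the counts line up. Whether the directions are comparable is still the operator's call.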

Calibration boundary

Whether the agent's live flags truly match this fasit is judged by the operator ([OPERATØR]), not self-certified here. This fixture is the calibration target, the same way editorial-reviewer-cases.md, persona-reviewer-cases.md and fact-checker-cases.md are fasits for their agents.

Live-run note. A live run on the Del 4 frozen draft requires (a) a Claude Code session reload, because a freshly added agent is not invokable until the plugin agent set is rebuilt at session start, and (b) read access to the frozen Del 4 draft in the Maskinrommet series folder. Critically, the live run must be a cold context: no session history, no version numbers, no intent narrative; only the prompt, the frozen draft path, and the writing contract. Until both hold, this fixture is the gold standard of record.