ktg-plugin-marketplace/plugins/linkedin-studio/agents/fixtures/language-reviewer-cases.md
Kjell Tore Guttormsen e69ea1f4c9 feat(linkedin-studio): v3.1.0 — Endring 9 adversarial review-pakke + per-artefakt personas
Cold, adversarial review package for the long-form pipeline + configurable
per-edition personas. Motivated by Del 4 (Security Champions pivot): the
in-session editor + persona sweep shared the drafting session's framing-bias,
so the shipped version was never independently re-reviewed.

Headless package (9a/9b):
- New Step 6.5 (headless-review) in /linkedin:newsletter, after the persona
  sweep, before lock — the independence layer the in-session gates can't be.
- New standalone /linkedin:headless-review command (run in a fresh session for
  maximum isolation; reconstructs frozen draft + contract + personas from disk).
- 3 new Opus archetypes, each with a cardinal context-isolation block that
  refuses drafting-session framing as "context pollution":
  - content-reviewer (argument integrity C1–C5, ≤8 flags)
  - language-reviewer (Norwegian language L1–L5, ≤10 flags)
  - fact-reviewer (cold re-verification F1–F4, risk-sort + pivot-risk, WebSearch)
- Deliberate redundancy with fact-checker / editorial-reviewer documented so
  the pairs are never de-duplicated.

Pivot-reopen (9c):
- New /linkedin:pivot command: logs articles.NN.pivots[], resets currentPhase,
  un-locks, marks gates to re-run.
- Pivot-detection gate in Step 8 lock precondition (>20% word-count change or
  >2 new sections re-opens cleared gates). Del 4 v8→v11 worked example.

Per-artifact personas (new requirement):
- articles.NN.personas with resolution order (edition-state → series file →
  plugin library → interactive). One or more readers configurable per edition.

Schema/docs:
- edition-state.template.json: additive personas[], pivots[], headlessReview,
  headless-review phase (16 phases); personaSweep.resonance.wordCount baseline.
- 3 fasit fixtures + 3 structural lint tests (Del 4 worked cases).
- Counts: 24→26 commands, 16→19 agents, 15→16 newsletter phases.
- README + CLAUDE.md (plugin + root) + CHANGELOG synced.

Verification: 35 agent-fixture + 59 hook + 20 render tests green. Backward-
compatible (additive state); reload required before the 3 new agents resolve.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-29 13:01:24 +02:00


# Language-Reviewer Fasit Fixture
This fixture takes the Del 4 production round (Security Champions, Maskinrommet, 2026-05-29) as the
gold standard for the `language-reviewer` agent. By Step 6 the in-session persona
resonance sweep had returned PASS across the personas and the in-session craft
gate (`editorial-reviewer`, Step 5.5) had run — both *inside* the drafting session,
both sharing its framing-bias. On a **cold, first-time reading of the frozen
draft** (the F5 finding), the editor then caught Norwegian-language defects the
in-session passes had all read straight past: a verbatim **quote error** («Vi»
where the source said «Vi i Nav»), anglicisms, and verbatim repetitions across
sections. Those are the fasit below: a correct `language-reviewer` run on the
Del 4 frozen draft should surface **comparable flags**, mapped to the one axis
with consistent severities.

This file is a *fasit*, not a test harness. The structural lint lives in
`agents/__tests__/language-reviewer-fixture.test.mjs`. Whether the agent's live
flags actually reproduce these directions is `[GATE]`/`[OPERATØR]` — it is **not**
self-certified here.
> **The jury judges; the writer writes.** Every expected output below is
> **direction, not rewritten copy.** A correct agent run hands back flags + a
> severity — never edited prose. (`Foreslått retning`, not a new sentence.)

> **Why this gate exists — the cold re-read.** The in-session gates (fact-check,
> craft, persona) all ran while the drafting session's framing-bias was still in
> the room: the same blind spots that let the author miss «Vi» vs «Vi i Nav» let
> those gates miss it too. `language-reviewer` is run in a **cold context** with
> no access to version history, intent, pivots, or how any gate voted — exactly so
> it carries none of that bias. Any such framing that reaches it is **context
> pollution** to be named and ignored. A cold Norwegian re-read catches what the
> bias hid. That is the F5 finding made into a gate.
---
## The axis (the agent judges on exactly this)
**Axis: norsk-språkkvalitet** (Norwegian language quality) — one axis, five
checks. L1, L2 and L5 can start from a grep; L3 and L4 need a full read. The voice under judgment
is a **personal chronicle**, not a saksframlegg.
- **L1 — Ordrette gjentakelser** (verbatim repetition): the same distinctive
phrase or sentence-opening repeats mechanically across the draft (grep 3–6-word
phrases, then read in context).
- **L2 — Anglisismer** (anglicisms): English calques / loan-constructions where
idiomatic Norwegian exists («adressere et problem», «på en daglig basis», «i
terms av»). Flag the calque **and name the Norwegian idiom direction.**
- **L3 — Stivt tjenesteskriftspråk** (stiff bureaucratic register): «kanselli-stil»
— nominalisations, passive overload, «det vises til», agentless sentences that
drain the chronicle voice.
- **L4 — Indre språklige selvmotsigelser** (language-level self-contradiction): a
sentence/phrase that undercuts itself, or two phrasings that cannot both be the
intended register/meaning. The *wording* contradicting itself — **not** the
argument-level logic (that is `content-reviewer`).
- **L5 — Klang / rytme** (clang & rhythm): sentences that read badly aloud —
monotone cadence, every sentence the same length, a jarring word, run-ons that
lose the breath.
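
The grep-first half of L1 can be sketched as a word n-gram count. This is a minimal illustration under stated assumptions, not the agent's implementation: the helper name `repeated_phrases`, the window sizes, the `min_count` threshold, and the sample text are all hypothetical.

```python
import re
from collections import Counter


def repeated_phrases(text, min_words=3, max_words=6, min_count=3):
    """Return {phrase: count} for word n-grams seen at least min_count times.

    Hypothetical helper: window sizes and threshold are illustrative assumptions.
    """
    # \w+ is Unicode-aware for str patterns, so æ, ø and å stay inside words.
    words = re.findall(r"\w+", text.lower())
    counts = Counter()
    for n in range(min_words, max_words + 1):
        for i in range(len(words) - n + 1):
            counts[" ".join(words[i:i + n])] += 1
    return {phrase: c for phrase, c in counts.items() if c >= min_count}


# Invented sample text, not taken from the Del 4 draft.
sample = (
    "Først: det er ikke så enkelt. "
    "Senere: det er ikke så enkelt. "
    "Til slutt: det er ikke så enkelt."
)
hits = repeated_phrases(sample)
```

The result also contains the shorter sub-phrases of a repeated phrase, so the list is only a set of candidates. Deciding whether a repeat is mechanical or load-bearing still takes the read in context.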
## Severity (every flag carries exactly one)
- **BLOCK** — misrepresents or embarrasses: a quote rendered wrong (a verbatim
error inside a quotation — «Vi» vs «Vi i Nav»), or a self-contradicting phrasing
(L4) that changes the meaning.
- **REWORK** — a real language weakness a reader notices: a repeated phrase (L1),
an anglicism (L2), a bureaucratic passage (L3), a rhythm stumble (L5).
- **NICE** — cheap polish: a single mild repetition, one slightly stiff sentence.
## Direction, not copy (the boundary)
Every expected output is **direction, not rewritten copy**: "§3 'adressere' —
anglicism; use the Norwegian idiom («ta tak i»)" is the agent's job; supplying the
rewritten sentence is not. Each flag carries a **quote or line reference.**

---
## The six Del 4 language points (fasit)
Each case states the point the editor raised on the cold reading, the check it
belongs to, the expected severity, and the direction a correct agent run returns.
These are **language blind spots** — distinct from craft (`editorial-reviewer`),
de-AI / voice (`voice-scrubber`), and reader response (`persona-reviewer`). They
survived to the cold pass precisely because the in-session gates shared the
author's framing-bias.
### Case 1 — sitat gjengitt feil: «Vi» i stedet for «Vi i Nav» (verbatim quote error)
- **Check:** L4 (language-level self-contradiction / verbatim quotation error)
· **Severity:** BLOCK
- **Cold-read finding:** A quotation in the chronicle is rendered «Vi …» where the
source said «Vi i Nav …». The clipped quote changes who "vi" refers to — the
wording now misrepresents the source. (Maps to L4 as a wording-level
self-contradiction; the same defect could be filed under L1 as a near-verbatim
repetition of the source gone wrong — the agent files it once, as the BLOCK it
is.)
- **Fasit / direction:** Quote misrenders «Vi i Nav» as «Vi»; restore the source
wording. A misquote misrepresents the piece, so BLOCK. The agent flags the
*wrong rendering*; it does not supply the corrected sentence.
- **Why blind to the in-session gates:** the persona sweep measured whether the
passage *landed* (it did — PASS); none of the in-session gates re-checked the
quote against the source on a cold reading. This is the canonical F5 finding.
### Case 2 — anglisisme: «adressere problemet» (anglicism)
- **Check:** L2 (anglicisms) · **Severity:** REWORK
- **Cold-read finding:** «adressere et problem» is an English calque (to *address*
a problem) where idiomatic Norwegian reads «ta tak i / håndtere / ta opp».
- **Fasit / direction:** Anglicism; use the Norwegian idiom («ta tak i» /
«håndtere»). Name the idiom direction, do not write the sentence.
- **Why blind:** an anglicism reads fluently to a reader inside the drafting
session — the calque *sounds* like normal prose until a cold ear hits it.
### Case 3 — anglisisme: «på en daglig basis» (anglicism)
- **Check:** L2 (anglicisms) · **Severity:** REWORK
- **Cold-read finding:** «på en daglig basis» is a calque of *on a daily basis*;
idiomatic Norwegian is «daglig» / «til daglig».
- **Fasit / direction:** Anglicism; collapse to the Norwegian adverb («daglig»).
Direction only.
- **Why blind:** same mechanism as Case 2 — a second calque the in-session passes
read straight through. Two L2 flags is itself a signal the draft drifted into
English construction.
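
The grep-able start for L2 in Cases 2 and 3 might look like the sketch below. It pairs each calque named in this fixture with the idiom direction given for it and reports a line reference, never a rewritten sentence. The table shape, the helper name `find_calques`, the verb-form pattern, and the sample text are assumptions made for illustration.

```python
import re

# Calques named in this fixture, each with the idiom direction stated for it.
# The verb pattern covers a few inflected forms; that choice is an assumption.
CALQUES = {
    r"\badresser(?:e|er|te|t)\b": "ta tak i / håndtere / ta opp",
    r"\bpå en daglig basis\b": "daglig / til daglig",
}


def find_calques(text):
    """Return (line_number, matched_text, idiom_direction) for each candidate."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for pattern, direction in CALQUES.items():
            for match in re.finditer(pattern, line, flags=re.IGNORECASE):
                hits.append((line_no, match.group(0), direction))
    return hits


# Invented sample text, not taken from the Del 4 draft.
sample = "Vi må adressere problemet.\nDet gjør vi på en daglig basis."
hits = find_calques(sample)
```

A match is a candidate only. «adressere et brev» is ordinary Norwegian, so each hit still needs a read before it becomes a flag.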
### Case 4 — ordrette gjentakelser: samme frase 3× på tvers av seksjoner (verbatim repetition)
- **Check:** L1 (verbatim repetition) · **Severity:** REWORK
- **Cold-read finding:** A distinctive phrase recurs three times across §1, §4 and
§6 — mechanical, not load-bearing. `grep`-findable as a repeated 3–6-word
string.
- **Fasit / direction:** Vary or cut the repeats; keep at most the one
load-bearing use. Report the count (3×).
- **Why blind:** a reader inside the session sees each section in isolation; the
repetition only shows when a cold reader takes the whole draft at once. This is
the verbatim-repetition half of the F5 finding.
### Case 5 — stivt tjenesteskriftspråk: «det vises til»-passasje i en personlig krønike (stiff bureaucratic register)
- **Check:** L3 (stiff bureaucratic register / «kanselli-stil») · **Severity:**
REWORK
- **Cold-read finding:** A passage slides into saksframlegg register — «det vises
til», nominalised, agentless, passive-stacked — inside a piece whose voice is a
personal chronicle. The register break drains the chronicle voice.
- **Fasit / direction:** Kanselli-stil in a personal chronicle; restore an agent
and an active verb so the passage reads as the chronicle, not a memo. Direction
only. (This is a *language-register* defect, distinct from `voice-scrubber`'s
de-AI tells and from `editorial-reviewer`'s craft.)
- **Why blind:** bureaucratic register is the author's professional default; inside
the session it reads as "normal," and only a cold ear hears it clash with the
chronicle voice.
### Case 6 — klang / rytme: fem like lange setninger på rad (monotone cadence)
- **Check:** L5 (clang & rhythm) · **Severity:** NICE
- **Cold-read finding:** A run of five sentences shares the same length and a
near-identical opening — a monotone cadence that reads flat aloud. Chronicle
prose has a varied cadence; this passage loses it.
- **Fasit / direction:** Break the monotone — vary one or two sentence lengths /
openings so the passage breathes. NICE: noticeable on a read-aloud, not
load-bearing. `grep`/scan-findable (same-length run, repeated opening).
- **Why blind:** rhythm is heard, not seen; a silent in-session read of a fluent
passage never trips on it. A cold read-aloud does.
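
The same-length run in Case 6 can be found with a rough scan like the one below. It is a sketch under assumptions: the helper name `monotone_runs`, the naive sentence split on end punctuation, the `tolerance` of one word, and the sample text are all hypothetical. It covers only the length half of the finding, not the repeated opening.

```python
import re


def monotone_runs(text, run_length=5, tolerance=1):
    """Return (start_index, word_counts) for each window of run_length
    consecutive sentences whose word counts differ by at most tolerance."""
    # Naive split after ., ! or ? followed by whitespace (an assumption).
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    lengths = [len(s.split()) for s in sentences]
    runs = []
    for i in range(len(lengths) - run_length + 1):
        window = lengths[i:i + run_length]
        if max(window) - min(window) <= tolerance:
            runs.append((i, window))
    return runs


# Invented sample text, not taken from the Del 4 draft.
sample = (
    "Vi kom tidlig inn. Vi satt helt stille. Vi så på skjermen. "
    "Vi sa ingen ting. Vi gikk hjem igjen. "
    "Så, etter en lang og underlig pause, begynte alt på nytt."
)
runs = monotone_runs(sample)
```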
---
## Expected aggregate (what a correct run looks like)
- **Total flags:** 6 (well within the ≤10 cap — no suppression needed).
- **By check:** L1 = 1 (Case 4) · L2 = 2 (Cases 2 + 3) · L3 = 1 (Case 5) ·
L4 = 1 (Case 1) · L5 = 1 (Case 6).
- **By severity:** BLOCK = 1 (Case 1, the quote error) · REWORK = 4 (Cases 2, 3,
4, 5) · NICE = 1 (Case 6).
- **All six are language blind spots** — none is a craft defect (editorial), a
de-AI / voice defect (voice-scrubber), an argument defect (content-reviewer), a
factual defect (fact-reviewer), or a resonance defect (persona). They survived
to the cold pass because the in-session gates shared the author's framing-bias;
the cold Norwegian re-read is what caught them.

A run that reproduces ~these six directions, on ~these checks, with ~these
severities, is **comparable** to the editor's actual cold reading of Del 4 — the
acceptance bar. Exact wording is the editor's; the agent returns direction, never
copy.
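
The expected aggregate can be written down as data and tallied, which makes the counts above easy to re-check by hand. This is an illustrative sketch only: the `ExpectedFlag` type and the `FASIT` list are hypothetical names, and it is not a copy of the structural lint in `agents/__tests__/language-reviewer-fixture.test.mjs`.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ExpectedFlag:
    case: int       # case number in this fixture
    check: str      # one of L1..L5
    severity: str   # BLOCK, REWORK or NICE


# The six Del 4 language points as stated in the cases above.
FASIT = [
    ExpectedFlag(1, "L4", "BLOCK"),
    ExpectedFlag(2, "L2", "REWORK"),
    ExpectedFlag(3, "L2", "REWORK"),
    ExpectedFlag(4, "L1", "REWORK"),
    ExpectedFlag(5, "L3", "REWORK"),
    ExpectedFlag(6, "L5", "NICE"),
]
MAX_FLAGS = 10  # the flag cap stated for language-reviewer

by_check = Counter(flag.check for flag in FASIT)
by_severity = Counter(flag.severity for flag in FASIT)
```

A tally like this checks only that the fixture is internally consistent. Whether a live run is comparable to it remains the operator's call.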
## Calibration boundary
Whether the agent's live flags truly match this fasit is judged by the operator
(`[OPERATØR]`), not self-certified here. This fixture is the calibration target,
the same way `editorial-reviewer-cases.md`, `persona-reviewer-cases.md` and
`fact-checker-cases.md` are fasits for their agents.
> **Live-run note.** A live run on the Del 4 frozen draft requires (a) a Claude
> Code session reload — a freshly added agent is not invokable until the plugin
> agent set is rebuilt at session start — and (b) read access to the frozen Del 4
> draft in the Maskinrommet series folder. Critically, the live run must be a
> **cold context**: no session history, no version numbers, no intent narrative —
> only the prompt, the frozen draft path, and the writing contract. Until both
> hold, this fixture is the gold standard of record.