Fact-Checker Fasit Fixture

Three reference claims with known ground truth, used to sanity-check the fact-checker agent. Each case states the claim, the fasit (Norwegian for answer key: the correct answer and why), and the expected risk verdict.

  • 🟢 = verified true against a primary/credible source
  • 🔴 = contradicted by evidence (false), or a high-risk claim asserted without support
  • 🟡 = unverifiable from available sources — flagged, never guessed

This file is a fasit, not a test harness. The structural lint lives in agents/__tests__/fact-checker-fixture.test.mjs. Whether the agent's live output actually reproduces these verdicts is a [GATE]/[OPERATØR] matter; it is not self-certified.

Each case block below carries exactly one verdict emoji (in its Verdict field); the prose deliberately avoids emoji so the structural lint can read a single, unambiguous verdict per case.
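
The one-emoji-per-case rule above can be illustrated with a minimal sketch. This is not the lint in agents/__tests__/fact-checker-fixture.test.mjs, whose contents are not shown here; the function names `splitCases` and `lintCase`, the heading pattern, and the inline sample text are all assumptions made for illustration.

```javascript
// Hypothetical sketch of a structural check; the real lint may work differently.
const VERDICTS = ["🟢", "🔴", "🟡"];

// Split fixture text into case blocks, starting a new block at each "Case N" heading.
function splitCases(text) {
  const cases = [];
  let current = null;
  for (const line of text.split("\n")) {
    const heading = line.match(/^\s*(?:#+\s*)?Case (\d+)\b/);
    if (heading) {
      current = { id: Number(heading[1]), lines: [] };
      cases.push(current);
    } else if (current) {
      current.lines.push(line);
    }
  }
  return cases;
}

// A block passes when it contains exactly one verdict emoji overall,
// and that emoji sits on the line carrying the Verdict field.
function lintCase(block) {
  const body = block.lines.join("\n");
  const found = VERDICTS.flatMap((v) => body.split(v).slice(1).map(() => v));
  const verdictLine = block.lines.find((l) => l.includes("Verdict:")) ?? "";
  const inField = VERDICTS.filter((v) => verdictLine.includes(v));
  return { id: block.id, ok: found.length === 1 && inField.length === 1, found };
}

// Made-up sample: case 2 leaks a second emoji into its prose and should fail.
const sample = [
  "Case 1",
  "- Claim: X",
  "- Verdict: 🟢",
  "- Fasit: True.",
  "Case 2",
  "- Claim: Y",
  "- Verdict: 🔴",
  "- Fasit: False, not 🟡.",
].join("\n");

const results = splitCases(sample).map(lintCase);
console.log(results.map((r) => `case ${r.id}: ${r.ok ? "ok" : "ambiguous"}`));
```

In this sketch the second case is rejected because an emoji appears in its prose, which is the ambiguity the fixture avoids by keeping emoji out of the Fasit text.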


Case 1 — verifiable true

  • Claim: The EU AI Act entered into force on 1 August 2024.
  • Verdict: 🟢
  • Fasit: True. Regulation (EU) 2024/1689 was published in the Official Journal on 12 July 2024 and entered into force 20 days later, on 1 August 2024. This is confirmable against the primary source (EUR-Lex) and the European Commission's own communications. A correct agent run returns the verified verdict with a primary-source citation.
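
The date arithmetic in this fasit can be checked with a few lines. This is only a calendar sketch under the figures stated above (publication on 12 July 2024, 20 days to entry into force); the helper name `addDaysUtc` is hypothetical, and the snippet does not replace checking the primary source.

```javascript
// Add whole days to a calendar date in UTC, avoiding local-timezone and DST drift.
function addDaysUtc(year, month, day, days) {
  // Date.UTC uses a zero-based month; an overflowing day rolls into the next month.
  return new Date(Date.UTC(year, month - 1, day + days));
}

// Figures taken from the fasit text: published 12 July 2024, in force 20 days later.
const published = { year: 2024, month: 7, day: 12 };
const inForce = addDaysUtc(published.year, published.month, published.day, 20);
const inForceIso = inForce.toISOString().slice(0, 10);

console.log(inForceIso); // 2024-08-01
```

July has 31 days, so day 12 plus 20 lands on 1 August, which matches the claim.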

Case 2 — verifiable false

  • Claim: GPT-4 was developed and released by Anthropic.
  • Verdict: 🔴
  • Fasit: False. GPT-4 was released by OpenAI (March 2023). Anthropic develops the Claude model family. The claim is contradicted by both vendors' primary documentation. A correct agent run returns the high-risk verdict and names the contradicting source — it must not soften a contradicted claim to the unverified tier.

Case 3 — unverifiable

  • Claim: A Norwegian public-sector agency cut its case-handling time by exactly 37% in Q3 2025 after deploying an internal AI assistant.
  • Verdict: 🟡
  • Fasit: Unverifiable. The claim names no agency and cites no published report, and no primary source exists for this precise figure; an internal operational metric of this kind is not independently confirmable from open sources. A correct agent run returns the unverified verdict and states explicitly that the claim cannot be verified. It must not fill the gap by inventing a plausible source or by promoting the claim to the verified tier.
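
The two failure modes the cases warn about, softening a contradicted claim to the unverified tier and promoting an unverifiable claim to the verified tier, could be labeled by a reviewer with a small helper. The sketch below is an assumption throughout: `compareVerdict`, the tier names, and the `observed` array are invented for illustration and are not output from any real agent run.

```javascript
// Hypothetical tier names; the fixture itself only defines the three emoji.
const TIER = new Map([
  ["🟢", "verified"],
  ["🔴", "high-risk"],
  ["🟡", "unverified"],
]);

// Label how an observed verdict relates to the expected one from the fasit.
function compareVerdict(expected, actual) {
  if (!TIER.has(expected) || !TIER.has(actual)) return "invalid";
  if (expected === actual) return "match";
  if (expected === "🔴" && actual === "🟡") return "softened"; // the case 2 failure mode
  if (expected === "🟡" && actual === "🟢") return "promoted"; // the case 3 failure mode
  return "mismatch";
}

// Expected verdicts for cases 1 to 3, as listed in this fixture.
const expected = ["🟢", "🔴", "🟡"];
// Made-up observations, chosen only to exercise both failure modes.
const observed = ["🟢", "🟡", "🟢"];

const labels = expected.map((e, i) => compareVerdict(e, observed[i]));
console.log(labels); // [ 'match', 'softened', 'promoted' ]
```

A helper like this only labels a comparison someone has already made; it does not certify the agent's output, which remains a [GATE]/[OPERATØR] matter.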