
# Expected findings — pre-compact-poisoning

This file is the testable contract enforced by
`run-pre-compact-poisoning.mjs`: nine independent assertions
across four scenarios. Any deviation indicates either a hook
regression or fixture rot.

## Required assertions (9 / 9 must pass)

### Scenario A — block mode + poisoned transcript

The poisoned transcript embeds two distinct triggers:

- An "ignore all previous instructions" phrase inside a synthetic
  `tool_result` block (matches `CRITICAL_PATTERNS` in
  `injection-patterns.mjs`).
- An AWS-shaped key built at runtime via string concatenation
  (matches `SECRET_PATTERNS` regex `/AKIA[0-9A-Z]{16}/`).
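
The second trigger can be illustrated with a minimal sketch. The regex is the one quoted above; the helper name `buildFakeAwsKey` and the dummy key body are made up for illustration, and the real fixture may assemble its key differently. Building the value at runtime presumably keeps a complete key-shaped literal out of the fixture source.

```javascript
// Regex as quoted in the contract above.
const AWS_KEY_REGEX = /AKIA[0-9A-Z]{16}/;

// Hypothetical helper: joins the prefix and a dummy 16-character body at
// runtime so the full key never appears as a literal in the source.
function buildFakeAwsKey() {
  const prefix = ["AK", "IA"].join("");
  const body = "EXAMPLE".padEnd(16, "0"); // uppercase letters padded with zeros
  return prefix + body;
}

const fakeKey = buildFakeAwsKey();
console.log(AWS_KEY_REGEX.test(fakeKey)); // true
```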

- A.1 Hook exits with code `2`.
- A.2 Stdout is JSON `{ "decision": "block", "reason": "..." }`.
- A.3 The `reason` string mentions both:
  - an injection label (`/ignore previous|override/i`), AND
  - the AWS key label (`/AWS Access Key/i`).

If A.3 fails, either the injection-patterns regex set or the
`SECRET_PATTERNS` table changed in a way that dropped one of these
labels.
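
The three Scenario A assertions could be checked along the following lines. This is a sketch, not the runner's actual code: `checkScenarioA` is a hypothetical helper, and it assumes the runner has already captured the hook's exit code and stdout into a `{ exitCode, stdout }` object.

```javascript
// Hypothetical helper: returns a list of failure messages (empty = pass).
// `result` is assumed to be { exitCode, stdout } from one hook run.
function checkScenarioA(result) {
  const failures = [];

  // A.1: block mode must exit with code 2.
  if (result.exitCode !== 2) failures.push("A.1: expected exit code 2");

  // A.2: stdout must be a JSON block decision with a string reason.
  let payload;
  try {
    payload = JSON.parse(result.stdout);
  } catch {
    payload = undefined;
  }
  if (!payload || payload.decision !== "block" || typeof payload.reason !== "string") {
    failures.push("A.2: expected a JSON block decision with a reason");
    return failures; // A.3 cannot be evaluated without a reason string
  }

  // A.3: both labels must appear somewhere in the reason.
  if (!/ignore previous|override/i.test(payload.reason)) {
    failures.push("A.3: injection label missing");
  }
  if (!/AWS Access Key/i.test(payload.reason)) {
    failures.push("A.3: AWS key label missing");
  }
  return failures;
}
```

Returning all failures instead of throwing on the first one makes it easier to tell whether a regression touched the exit code, the payload shape, or only one of the labels.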

### Scenario B — warn mode + poisoned transcript

- B.1 Hook exits with code `0` (advisory, not block).
- B.2 Stdout is JSON `{ "systemMessage": "..." }` with no
  `decision` field. The `systemMessage` summary is the same as
  the block-mode `reason` text.

### Scenario C — off mode + poisoned transcript

- C.1 Hook exits with code `0`.
- C.2 Stdout is empty (no JSON). The `off` branch returns at the
  top of the script before reading the transcript at all,
  which is the documented "fully disabled" semantic.
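
Taken together, Scenarios A through C pin down a mode-to-output mapping. The sketch below restates that mapping as code; it is a reading of the contract, not the hook's source. `decideOutcome` and the `scan` callback are hypothetical names, and the clean-transcript branch (empty stdout) is an assumption.

```javascript
// Sketch of the contract only; not the hook's actual implementation.
// `mode` is "block" | "warn" | "off". `scan` is a hypothetical callback that
// reads the transcript and returns a summary string, or null when clean.
function decideOutcome(mode, scan) {
  if (mode === "off") {
    // Fully disabled: return before the transcript is read at all.
    return { exitCode: 0, stdout: "" };
  }

  const summary = scan();
  if (summary === null) {
    // Clean transcript: assumed here to mean exit 0 with empty stdout.
    return { exitCode: 0, stdout: "" };
  }

  if (mode === "block") {
    return {
      exitCode: 2,
      stdout: JSON.stringify({ decision: "block", reason: summary }),
    };
  }

  // warn: advisory only, same summary text, no `decision` field.
  return { exitCode: 0, stdout: JSON.stringify({ systemMessage: summary }) };
}
```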

### Scenario D — block mode + benign transcript

This is the brick-wall control: it proves the hook does not
reflexively block all compactions.

- D.1 Hook exits with code `0`.
- D.2 Stdout has no `decision: "block"` JSON. (Either no JSON or
  a non-block payload; the assertion fails only on a literal
  block decision, which would indicate a false positive.)
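
The permissive shape of D.2 can be expressed as a predicate that is true only for a literal block decision. `isFalsePositiveBlock` is a hypothetical name for this sketch; it assumes stdout has been captured as a string.

```javascript
// Hypothetical predicate for D.2: true only when stdout carries a literal
// block decision. Empty output, non-JSON output, and non-block payloads
// are all acceptable for the benign transcript.
function isFalsePositiveBlock(stdout) {
  if (stdout.trim() === "") return false; // no JSON at all

  let payload;
  try {
    payload = JSON.parse(stdout);
  } catch {
    return false; // not JSON, so not a block decision
  }
  return payload !== null && typeof payload === "object" && payload.decision === "block";
}

console.log(isFalsePositiveBlock("")); // false
console.log(isFalsePositiveBlock('{"decision":"block","reason":"x"}')); // true
```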

## Total finding shape (block mode)

With the current fixture, the block-mode `reason` string is expected to look like this:

```
pre-compact-scan (auto): 3 finding(s) in transcript. Compaction
may preserve poisoned content in condensed form. Top: override:
ignore previous instructions, indirect: instruction addressed
to AI/assistant, AWS Access Key ID.
```

The "3 finding(s)" count covers:

1. CRITICAL — `override: ignore previous instructions`
2. MEDIUM  — `indirect: instruction addressed to AI/assistant`
   (the synthetic tool-result text frames the injection as a
   "Note to assistant", which trips the indirect-address pattern)
3. SECRET  — `AWS Access Key ID`

If `injection-patterns.mjs` adds new MEDIUM rules that match the
fixture text, the count and `Top: ...` ordering may shift. The
contract only asserts the *labels* in the reason string, not the
finding count or order — that flexibility is intentional.
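
To make that flexibility concrete, here is a small sketch. `formatSummary` is a hypothetical reconstruction of the summary layout shown above (assuming the wrapped lines form a single string), and `hasRequiredLabels` applies only the two label regexes from A.3, so a reordered or longer finding list still passes.

```javascript
// Hypothetical reconstruction of the summary layout; the hook's real
// formatting code may differ.
function formatSummary(trigger, labels) {
  return (
    `pre-compact-scan (${trigger}): ${labels.length} finding(s) in transcript. ` +
    "Compaction may preserve poisoned content in condensed form. " +
    `Top: ${labels.join(", ")}.`
  );
}

// Label-only check: ignores the count and the order of the findings.
function hasRequiredLabels(reason) {
  return /ignore previous|override/i.test(reason) && /AWS Access Key/i.test(reason);
}

// A reordered list with an extra, made-up MEDIUM label still satisfies the check.
const shifted = formatSummary("auto", [
  "AWS Access Key ID",
  "example: some new MEDIUM rule",
  "indirect: instruction addressed to AI/assistant",
  "override: ignore previous instructions",
]);
console.log(hasRequiredLabels(shifted)); // true
```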

## Out of scope (intentionally)

- The other secret labels in `SECRET_PATTERNS`
  (GitHub / npm / PEM / bearer / generic). Demonstrating those
  would require either growing the fixture or building each at
  runtime; the AWS key alone is sufficient to prove the
  credential-finding path activates.
- The 512 KB tail cap (`LLM_SECURITY_PRECOMPACT_MAX_BYTES`) — not
  exercised because the synthetic transcript is small.
- The leetspeak / homoglyph / multi-language MEDIUM patterns —
  exercised by `examples/prompt-injection-showcase/`.
- The `compaction_trigger` legacy field name (the hook reads
  both `trigger` and `compaction_trigger`) — only `trigger` is
  exercised here.
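
For orientation, the two untested hook behaviours named above (the tail cap and the legacy field name) might look roughly like this. Everything here is an assumption made for illustration: that `LLM_SECURITY_PRECOMPACT_MAX_BYTES` is read from the environment, that 512 KB is its default, that `trigger` takes precedence over `compaction_trigger`, and all helper names.

```javascript
// Assumed default of 512 KB, matching the cap mentioned above.
const DEFAULT_MAX_BYTES = 512 * 1024;

// Hypothetical: read the cap from an env-like object, falling back to the
// default when the variable is unset or not a positive integer.
function resolveMaxBytes(env) {
  const parsed = Number.parseInt(env.LLM_SECURITY_PRECOMPACT_MAX_BYTES ?? "", 10);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_MAX_BYTES;
}

// Keep only the last `maxBytes` bytes of the transcript buffer.
function tailOf(buffer, maxBytes) {
  return buffer.length > maxBytes ? buffer.subarray(buffer.length - maxBytes) : buffer;
}

// Assumed precedence: prefer `trigger`, then the legacy field name.
// The "unknown" fallback is also an assumption.
function resolveTrigger(input) {
  return input.trigger ?? input.compaction_trigger ?? "unknown";
}

console.log(tailOf(Buffer.from("abcdef"), 4).toString()); // "cdef"
console.log(resolveTrigger({ compaction_trigger: "auto" })); // "auto"
```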