Runnable demonstration of hooks/scripts/pre-compact-scan.mjs (the only PreCompact hook in the plugin) detecting both a CRITICAL injection pattern and an AWS-shaped credential inside a synthetic JSONL transcript, exercised across all three values of LLM_SECURITY_PRECOMPACT_MODE plus a benign-transcript control case in block mode that proves the gate is not a brick wall. The transcript is generated at runtime in a per-invocation tempdir under os.tmpdir() and the directory is removed in a finally block, so the user's real ~/.claude/projects/.../transcripts/ are never touched. The AWS-shaped key uses the same 'AK' + 'IA' + ... fragmentation idiom as tests/e2e/attack-chain.test.mjs so this source contains no literal credentials and pre-edit-secrets does not block writes during development. Nine independent assertions (9/9 must pass): - block mode + poisoned: exit 2, decision=block JSON, reason text covers both injection and AWS labels (3 assertions) - warn mode + poisoned: exit 0, systemMessage JSON, no decision field (2 assertions) - off mode + poisoned: exit 0, no JSON on stdout (2 assertions) - block mode + benign: exit 0, no decision=block JSON (2 assertions) OWASP / framework mapping: LLM01, LLM02, ASI01, AT-1, AT-3. Docs updated: plugin README "Other runnable examples", plugin CLAUDE.md "Examples" tabellen, CHANGELOG [Unreleased] Added. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
3.4 KiB
Expected findings — pre-compact-poisoning
This is the testable contract enforced by
run-pre-compact-poisoning.mjs. Nine independent assertions
across four scenarios. Any drift = hook regression or fixture rot.
Required assertions (9 / 9 must pass)
Scenario A — block mode + poisoned transcript
The poisoned transcript embeds two distinct triggers:
- An "ignore all previous instructions" phrase inside a synthetic
tool_resultblock (matchesCRITICAL_PATTERNSininjection-patterns.mjs). - An AWS-shaped key built at runtime via string concatenation
(matches
SECRET_PATTERNSregex/AKIA[0-9A-Z]{16}/).
A.1 Hook exits with code 2.
A.2 Stdout is JSON { "decision": "block", "reason": "..." }.
A.3 The reason string mentions both:
- an injection label (/ignore previous|override/i), AND
- the AWS key label (/AWS Access Key/i).
If A.3 fails, either the injection-patterns regex set or the SECRET_PATTERNS table changed in a way that dropped one of these labels.
Scenario B — warn mode + poisoned transcript
B.1 Hook exits with code 0 (advisory, not block).
B.2 Stdout is JSON { "systemMessage": "..." } with no
decision field. The systemMessage summary is the same as
the block-mode reason text.
Scenario C — off mode + poisoned transcript
C.1 Hook exits with code 0.
C.2 Stdout is empty (no JSON). The off branch returns at the
top of the script before reading the transcript at all,
which is the documented "fully disabled" semantic.
Scenario D — block mode + benign transcript
This is the brick-wall control: it proves the hook does not reflexively block all compactions.
D.1 Hook exits with code 0.
D.2 Stdout has no decision: "block" JSON. (Either no JSON or
a non-block payload — the assertion only fails on a literal
block decision, which would indicate a false positive.)
Total finding shape (block mode)
pre-compact-scan (auto): 3 finding(s) in transcript. Compaction
may preserve poisoned content in condensed form. Top: override:
ignore previous instructions, indirect: instruction addressed
to AI/assistant, AWS Access Key ID.
The "3 finding(s)" count covers:
- CRITICAL —
override: ignore previous instructions - MEDIUM —
indirect: instruction addressed to AI/assistant(the synthetic tool-result text frames the injection as a "Note to assistant", which trips the indirect-address pattern) - SECRET —
AWS Access Key ID
If injection-patterns.mjs adds new MEDIUM rules that match the
fixture text, the count and Top: ... ordering may shift. The
contract only asserts the labels in the reason string, not the
finding count or order — that flexibility is intentional.
Out of scope (intentionally)
- The other secret labels in
SECRET_PATTERNS(GitHub / npm / PEM / bearer / generic). Demonstrating those would require either growing the fixture or building each at runtime; the AWS key alone is sufficient to prove the credential-finding path activates. - The 512 KB tail cap (
LLM_SECURITY_PRECOMPACT_MAX_BYTES) — not exercised because the synthetic transcript is small. - The leetspeak / homoglyph / multi-language MEDIUM patterns —
exercised by
examples/prompt-injection-showcase/. - The
compaction_triggerlegacy field name (the hook reads bothtriggerandcompaction_trigger) — onlytriggeris exercised here.