History

Kjell Tore Guttormsen b6d912200e feat(llm-security): add pre-compact-poisoning example for PreCompact hook [skip-docs] Runnable demonstration of hooks/scripts/pre-compact-scan.mjs (the only PreCompact hook in the plugin) detecting both a CRITICAL injection pattern and an AWS-shaped credential inside a synthetic JSONL transcript, exercised across all three values of LLM_SECURITY_PRECOMPACT_MODE plus a benign-transcript control case in block mode that proves the gate is not a brick wall. The transcript is generated at runtime in a per-invocation tempdir under os.tmpdir() and the directory is removed in a finally block, so the user's real ~/.claude/projects/.../transcripts/ are never touched. The AWS-shaped key uses the same 'AK' + 'IA' + ... fragmentation idiom as tests/e2e/attack-chain.test.mjs so this source contains no literal credentials and pre-edit-secrets does not block writes during development. Nine independent assertions (9/9 must pass): - block mode + poisoned: exit 2, decision=block JSON, reason text covers both injection and AWS labels (3 assertions) - warn mode + poisoned: exit 0, systemMessage JSON, no decision field (2 assertions) - off mode + poisoned: exit 0, no JSON on stdout (2 assertions) - block mode + benign: exit 0, no decision=block JSON (2 assertions) OWASP / framework mapping: LLM01, LLM02, ASI01, AT-1, AT-3. Docs updated: plugin README "Other runnable examples", plugin CLAUDE.md "Examples" tabellen, CHANGELOG [Unreleased] Added. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>		2026-05-05 15:23:10 +02:00
..
expected-findings.md	feat(llm-security): add pre-compact-poisoning example for PreCompact hook [skip-docs]	2026-05-05 15:23:10 +02:00
README.md	feat(llm-security): add pre-compact-poisoning example for PreCompact hook [skip-docs]	2026-05-05 15:23:10 +02:00
run-pre-compact-poisoning.mjs	feat(llm-security): add pre-compact-poisoning example for PreCompact hook [skip-docs]	2026-05-05 15:23:10 +02:00

README.md

Pre-Compact Poisoning Walkthrough

WARNING: This is a demonstration fixture, NOT a real attack. The transcript is generated at runtime in a per-invocation tempdir. The user's real ~/.claude/projects/.../transcripts/ are never touched, and this source file contains no literal credentials.

What this demonstrates

hooks/scripts/pre-compact-scan.mjs is the only PreCompact hook in the plugin. It runs before Claude Code compacts the conversation context — auto-compaction at the context-window limit, or the user pressing /compact. Its job is to flag poisoned content before that content survives into a condensed form where the surrounding injection context is no longer visible to the model.

The hook reads at most the last 512 KB of the transcript JSONL file and applies two pattern sets:

Prompt-injection patterns — CRITICAL_PATTERNS and MEDIUM_PATTERNS from scanners/lib/injection-patterns.mjs (the same set used by pre-prompt-inject-scan and post-mcp-verify).
Credential regexes — a small SECRET_PATTERNS table for AWS access keys, GitHub tokens, npm tokens, PEM private-key block headers, generic credential assignments, and bearer tokens.

Behaviour is controlled by LLM_SECURITY_PRECOMPACT_MODE:

Mode	Finding present	Exit	Stdout
`off`	(any)	0	(empty — scan skipped entirely)
`warn`	yes	0	`{ "systemMessage": "..." }`
`warn`	no	0	(empty)
`block`	yes	2	`{ "decision": "block", "reason": "..." }`
`block`	no	0	(empty)

Default is warn.

Fixture layout

examples/pre-compact-poisoning/
  README.md                       # this file
  run-pre-compact-poisoning.mjs   # builds transcripts in tempdir, drives the hook
  expected-findings.md            # testable contract

There is no on-disk fixture. The run script:

Creates a tempdir under os.tmpdir() via mkdtempSync.
Writes two synthetic JSONL transcripts to that tempdir:
- poisoned-transcript.jsonl — contains an "ignore previous instructions" phrase inside a synthetic tool_result block, plus an AWS access-key ID built at runtime via string concatenation (matches /AKIA[0-9A-Z]{16}/).
- benign-transcript.jsonl — a plain Q&A about listing files.
Spawns hooks/scripts/pre-compact-scan.mjs with { session_id, transcript_path, hook_event_name: "PreCompact", trigger: "auto" } on stdin.
Cleans up the tempdir in a finally block.

The AWS-shaped key is constructed via the same fragmentation pattern used in tests/e2e/attack-chain.test.mjs ('AK' + 'IA' + 'IOSFODNN7' + 'EXAMPLE') so this source contains no literal credentials and pre-edit-secrets.mjs does not block it from being written.

How to run

cd plugins/llm-security
node examples/pre-compact-poisoning/run-pre-compact-poisoning.mjs

# Verbose — show full hook stdout/stderr per case
node examples/pre-compact-poisoning/run-pre-compact-poisoning.mjs --verbose

Expected: 9 pass, 0 fail across four scenarios:

block + poisoned → exit 2, structured decision=block JSON, reason text covers both an injection label and the AWS-key label.
warn + poisoned → exit 0, systemMessage JSON (no decision field).
off + poisoned → exit 0, no JSON on stdout (scan skipped).
block + benign → exit 0, no decision=block JSON (proves the gate is not a brick wall on benign content).

Hook involved

hooks/scripts/pre-compact-scan.mjs — invoked via child_process.spawnSync('node', [HOOK], { input: stdin }) to match the harness contract exactly. The hook reads the transcript via readTailCapped(filePath, MAX_BYTES), flattens JSONL message content via extractTextFromTranscript, then runs the two pattern sets. No Claude Code agent runtime is required.

The orchestrated /security audit flow does not run this hook (it's a runtime defence, not a scan-time check). This walkthrough exercises the runtime contract directly.

Why pre-compact poisoning matters

Compaction collapses long conversations into a summary that the model treats as authoritative context for the rest of the session. If a malicious tool result earlier in the conversation managed to sneak past post-mcp-verify (e.g., via a pattern not yet in the regex set), compaction can preserve a condensed form of the poison where the model can no longer see the surrounding "this came from a sketchy source" context. Worse, condensed summaries are smaller and so more likely to fit inside the attacker's preferred attention window.

pre-compact-scan is a second chance to catch poison that slipped past the runtime gates — a defence-in-depth pattern that matches the joint-paper finding that no single-layer defence holds against adaptive attacks.

OWASP / framework mapping

Code	Framework	Why
LLM01	OWASP LLM Top 10 (2025)	Prompt injection persisting through compaction
LLM02	OWASP LLM Top 10 (2025)	Sensitive information disclosure — credentials in transcript
ASI01	OWASP Agentic Top 10	Memory poisoning via condensed form
AT-1	DeepMind Agent Traps	Hidden cognitive priors carried across context boundary
AT-3	DeepMind Agent Traps	Tool-output indirection that survives summarisation

Limitations

MAX_BYTES defaults to 512 000 bytes. Earlier-in-history poison that does not appear in the last 512 KB of the transcript is not scanned. The cap exists for the documented <500 ms latency target on large transcripts. Tune via LLM_SECURITY_PRECOMPACT_MAX_BYTES.
The credential regex set is small by design (compaction is performance-sensitive). The full secrets regex set lives in pre-edit-secrets.mjs, which fires on a different event.
The hook does not modify the transcript — it only blocks compaction or emits an advisory. Poison that has already shaped the conversation may still influence the model in the current window.