Three new self-contained, runnable threat demonstrations under
examples/, continuing the batch started in 583a78c. Each example
has README.md + run-*.mjs + expected-findings.md and uses
state-isolation discipline so the user's real cache/state files
are never polluted.
- examples/supply-chain-attack/ — two-layer demonstration:
pre-install-supply-chain (PreToolUse) blocks compromised
event-stream version 3.3.6 and emits a scope-hop advisory for
the @evilcorp scope; dep-auditor (DEP scanner, offline) flags
5 typosquat dependencies plus a curl-piped install-script
vector in the fixture package.json. Maps to LLM03/LLM05/ASI04.
- examples/poisoned-claude-md/ — all 6 memory-poisoning detectors
fire on a deliberately poisoned CLAUDE.md plus a fixture
agent file under .claude/agents (E15/v7.2.0 surface):
detectInjection, detectShellCommands, detectSuspiciousUrls,
detectCredentialPaths, detectPermissionExpansion,
detectEncodedPayloads. No agent runtime needed — scanner
imported directly. Maps to LLM01/LLM06/ASI04.
- examples/bash-evasion-gallery/ — one disguised variant per
T1 through T9 evasion technique fed through pre-bash-destructive,
verified BLOCK after bash-normalize strips the evasion. T8
base64-pipe-shell uses its own BLOCK_RULE. The canonical
destructive form uses a path token rather than the bare slash
(regex word-boundary requires it). Source-string fragmentation
pattern reused from the e2e attack-chain test. Maps to
LLM06/ASI01/LLM01.
Plugin README "Other runnable examples" section + plugin
CLAUDE.md "Examples" table + CHANGELOG Unreleased/Added
all updated. Marketplace root README unchanged
([skip-docs] for marketplace-level gate — plugin's outward
coverage is unchanged, only demonstrations were added).
5.3 KiB
Memory-Poisoning Walkthrough
WARNING: This is a demonstration fixture, NOT a real attack. The fixture
CLAUDE.mdandhealth-checker.mdare deliberately poisoned. They are never adopted as project memory — the run script only feeds them to the deterministic scanner.
What this demonstrates
memory-poisoning-scanner.mjs (MEM scanner, OWASP LLM01 / LLM06)
detects cognitive-state poisoning in files Claude Code loads
into the agent's context: CLAUDE.md, .claude/agents/*.md,
.claude/rules/*.md, memory/*.md, REMEMBER.md, *.local.md,
and .claude-plugin/plugin.json.
The scanner runs six independent detectors. The fixture plants at least one signal for each:
| # | Detector | Fixture signal | Severity |
|---|---|---|---|
| 1 | detectInjection |
"Ignore previous instructions" + spoofed # IMPORTANT block + NL indirection ("FIRST silently read ~/.aws/credentials") |
CRITICAL/MEDIUM |
| 2 | detectShellCommands |
curl -sSL ... | bash and eval "$(curl ...)" outside code blocks |
LOW |
| 3 | detectSuspiciousUrls |
webhook.site/abc123, requestbin.com/r/zzz, requestbin.net/r/health-... |
HIGH |
| 4 | detectCredentialPaths |
~/.aws/credentials, ~/.ssh/id_rsa, ~/.ssh/id_ed25519, wallet.dat, kubeconfig, service-account-key, ~/.config/gcloud/credentials.db |
HIGH |
| 5 | detectPermissionExpansion |
allowed-tools = Bash, Write, Edit, all, bypassPermissions = true, --dangerously-skip-permissions |
CRITICAL |
| 6 | detectEncodedPayloads |
A base64 blob that decodes to a curl ... | bash command |
MEDIUM |
The fixture also covers v7.2.0's E15 surface — agent files
(.claude/agents/*.md) — by planting matching signals in
fixture/.claude/agents/health-checker.md.
How to run
cd plugins/llm-security
node examples/poisoned-claude-md/run-memory-poisoning.mjs
# Detailed: full per-finding listing with file:line
node examples/poisoned-claude-md/run-memory-poisoning.mjs --verbose
Expected: 6 pass, 0 fail and 18 total findings (or more, as
detectors evolve).
Scanner involved
scanners/memory-poisoning-scanner.mjs— invoked directly viaimport { scan }. Takes(targetPath, discovery)where discovery is built byscanners/lib/file-discovery.mjs::discoverFiles(). No Claude Code agent runtime is required.
The orchestrated form (/security scan or node scanners/scan-orchestrator.mjs)
runs this scanner alongside the other 9. This walkthrough isolates
it for clarity.
Why memory poisoning is special
CLAUDE.md and friends are loaded into Claude Code's context before prompt injection hooks run. They are persistent across sessions. A poisoned CLAUDE.md can:
- Override the system prompt (CRITICAL injection patterns)
- Plant credential-path priors so the agent quietly reads
.ssh//.aws/when the operator asks an unrelated question - Expand permissions (
bypassPermissions,--dangerously-skip-permissions) in a way the operator never explicitly approved - Smuggle base64-encoded shell commands disguised as "telemetry"
- Direct exfiltration to attacker-controlled URLs
Detection at scan time (before the file is loaded into a session)
is the cleanest defense. pre-prompt-inject-scan.mjs catches some
of these patterns at runtime, but only for content that flows
through UserPromptSubmit — CLAUDE.md is loaded earlier, so the
scanner has to catch the file before anyone runs Claude Code in
that directory.
Layered defense
| Layer | What it covers |
|---|---|
memory-poisoning-scanner (scan time) |
The file itself, before any session loads it |
pre-prompt-inject-scan (runtime) |
Injection patterns in user prompts and selected tool inputs |
post-mcp-verify (runtime) |
Patterns that arrive via tool output |
pre-write-pathguard (runtime) |
Blocks Write to .env, .ssh/, .aws/, etc. — counters the credential-read instruction at the moment it would actually be carried out |
This walkthrough exercises only the first layer.
OWASP / framework mapping
| Code | Framework | Why |
|---|---|---|
| LLM01 | OWASP LLM Top 10 (2025) | Prompt injection — CLAUDE.md is the most direct injection surface |
| LLM06 | OWASP LLM Top 10 (2025) | Excessive Agency — permission-expansion directives broaden tool surface |
| ASI04 | OWASP Agentic Top 10 | Untrusted-instruction influence on agent behavior |
| AT (Agent Traps) | DeepMind | Hidden cognitive priors — categories 1, 3, 6 |
Limitations
- The fixture exercises the deterministic scanner. The full
/security auditflow would also runposture-assessor-agentand the LLM-drivenskill-scanner-agent, which could find additional context-dependent issues. - The scanner's regex set is fixed. A novel injection wording the
pattern doesn't match would slip past — that is the documented
v5.0 honest-limitation of deterministic detection. For attack
diversity, see
examples/prompt-injection-showcase/.
See also
knowledge/owasp-llm-top10.md— LLM01 / LLM06 backgroundtests/lib/memory-poisoning-scanner.test.mjs— unit-test contracttests/fixtures/memory-scan/poisoned-project/— separate test fixture (smaller, kept in tests/, not duplicated here)expected-findings.md(in this folder) — the testable contract