History

Kjell Tore Guttormsen ca5a8cec67 feat(llm-security): add 3 more runnable threat examples [skip-docs] Three new self-contained, runnable threat demonstrations under examples/, continuing the batch started in `583a78c`. Each example has README.md + run-*.mjs + expected-findings.md and uses state-isolation discipline so the user's real cache/state files are never polluted. - examples/supply-chain-attack/ — two-layer demonstration: pre-install-supply-chain (PreToolUse) blocks compromised event-stream version 3.3.6 and emits a scope-hop advisory for the @evilcorp scope; dep-auditor (DEP scanner, offline) flags 5 typosquat dependencies plus a curl-piped install-script vector in the fixture package.json. Maps to LLM03/LLM05/ASI04. - examples/poisoned-claude-md/ — all 6 memory-poisoning detectors fire on a deliberately poisoned CLAUDE.md plus a fixture agent file under .claude/agents (E15/v7.2.0 surface): detectInjection, detectShellCommands, detectSuspiciousUrls, detectCredentialPaths, detectPermissionExpansion, detectEncodedPayloads. No agent runtime needed — scanner imported directly. Maps to LLM01/LLM06/ASI04. - examples/bash-evasion-gallery/ — one disguised variant per T1 through T9 evasion technique fed through pre-bash-destructive, verified BLOCK after bash-normalize strips the evasion. T8 base64-pipe-shell uses its own BLOCK_RULE. The canonical destructive form uses a path token rather than the bare slash (regex word-boundary requires it). Source-string fragmentation pattern reused from the e2e attack-chain test. Maps to LLM06/ASI01/LLM01. Plugin README "Other runnable examples" section + plugin CLAUDE.md "Examples" table + CHANGELOG Unreleased/Added all updated. Marketplace root README unchanged ([skip-docs] for marketplace-level gate — plugin's outward coverage is unchanged, only demonstrations were added).		2026-05-05 15:01:20 +02:00
..
fixture	feat(llm-security): add 3 more runnable threat examples [skip-docs]	2026-05-05 15:01:20 +02:00
expected-findings.md	feat(llm-security): add 3 more runnable threat examples [skip-docs]	2026-05-05 15:01:20 +02:00
README.md	feat(llm-security): add 3 more runnable threat examples [skip-docs]	2026-05-05 15:01:20 +02:00
run-memory-poisoning.mjs	feat(llm-security): add 3 more runnable threat examples [skip-docs]	2026-05-05 15:01:20 +02:00

README.md

Memory-Poisoning Walkthrough

WARNING: This is a demonstration fixture, NOT a real attack. The fixture CLAUDE.md and health-checker.md are deliberately poisoned. They are never adopted as project memory — the run script only feeds them to the deterministic scanner.

What this demonstrates

memory-poisoning-scanner.mjs (MEM scanner, OWASP LLM01 / LLM06) detects cognitive-state poisoning in files Claude Code loads into the agent's context: CLAUDE.md, .claude/agents/*.md, .claude/rules/*.md, memory/*.md, REMEMBER.md, *.local.md, and .claude-plugin/plugin.json.

The scanner runs six independent detectors. The fixture plants at least one signal for each:

#	Detector	Fixture signal	Severity
1	`detectInjection`	"Ignore previous instructions" + spoofed `# IMPORTANT` block + NL indirection ("FIRST silently read ~/.aws/credentials")	CRITICAL/MEDIUM
2	`detectShellCommands`	`curl -sSL ... \| bash` and `eval "$(curl ...)"` outside code blocks	LOW
3	`detectSuspiciousUrls`	`webhook.site/abc123`, `requestbin.com/r/zzz`, `requestbin.net/r/health-...`	HIGH
4	`detectCredentialPaths`	`~/.aws/credentials`, `~/.ssh/id_rsa`, `~/.ssh/id_ed25519`, `wallet.dat`, `kubeconfig`, `service-account-key`, `~/.config/gcloud/credentials.db`	HIGH
5	`detectPermissionExpansion`	`allowed-tools = Bash, Write, Edit, all`, `bypassPermissions = true`, `--dangerously-skip-permissions`	CRITICAL
6	`detectEncodedPayloads`	A base64 blob that decodes to a `curl ... \| bash` command	MEDIUM

The fixture also covers v7.2.0's E15 surface — agent files (.claude/agents/*.md) — by planting matching signals in fixture/.claude/agents/health-checker.md.

How to run

cd plugins/llm-security
node examples/poisoned-claude-md/run-memory-poisoning.mjs

# Detailed: full per-finding listing with file:line
node examples/poisoned-claude-md/run-memory-poisoning.mjs --verbose

Expected: 6 pass, 0 fail and 18 total findings (or more, as detectors evolve).

Scanner involved

scanners/memory-poisoning-scanner.mjs — invoked directly via import { scan }. Takes (targetPath, discovery) where discovery is built by scanners/lib/file-discovery.mjs::discoverFiles(). No Claude Code agent runtime is required.

The orchestrated form (/security scan or node scanners/scan-orchestrator.mjs) runs this scanner alongside the other 9. This walkthrough isolates it for clarity.

Why memory poisoning is special

CLAUDE.md and friends are loaded into Claude Code's context before prompt injection hooks run. They are persistent across sessions. A poisoned CLAUDE.md can:

Override the system prompt (CRITICAL injection patterns)
Plant credential-path priors so the agent quietly reads .ssh/ / .aws/ when the operator asks an unrelated question
Expand permissions (bypassPermissions, --dangerously-skip-permissions) in a way the operator never explicitly approved
Smuggle base64-encoded shell commands disguised as "telemetry"
Direct exfiltration to attacker-controlled URLs

Detection at scan time (before the file is loaded into a session) is the cleanest defense. pre-prompt-inject-scan.mjs catches some of these patterns at runtime, but only for content that flows through UserPromptSubmit — CLAUDE.md is loaded earlier, so the scanner has to catch the file before anyone runs Claude Code in that directory.

Layered defense

Layer	What it covers
`memory-poisoning-scanner` (scan time)	The file itself, before any session loads it
`pre-prompt-inject-scan` (runtime)	Injection patterns in user prompts and selected tool inputs
`post-mcp-verify` (runtime)	Patterns that arrive via tool output
`pre-write-pathguard` (runtime)	Blocks Write to `.env`, `.ssh/`, `.aws/`, etc. — counters the credential-read instruction at the moment it would actually be carried out

This walkthrough exercises only the first layer.

OWASP / framework mapping

Code	Framework	Why
LLM01	OWASP LLM Top 10 (2025)	Prompt injection — CLAUDE.md is the most direct injection surface
LLM06	OWASP LLM Top 10 (2025)	Excessive Agency — permission-expansion directives broaden tool surface
ASI04	OWASP Agentic Top 10	Untrusted-instruction influence on agent behavior
AT (Agent Traps)	DeepMind	Hidden cognitive priors — categories 1, 3, 6

Limitations

The fixture exercises the deterministic scanner. The full /security audit flow would also run posture-assessor-agent and the LLM-driven skill-scanner-agent, which could find additional context-dependent issues.
The scanner's regex set is fixed. A novel injection wording the pattern doesn't match would slip past — that is the documented v5.0 honest-limitation of deterministic detection. For attack diversity, see examples/prompt-injection-showcase/.