Three new self-contained, runnable threat demonstrations under
examples/, continuing the batch started in 583a78c. Each example
has README.md + run-*.mjs + expected-findings.md and uses
state-isolation discipline so the user's real cache/state files
are never polluted.
- examples/supply-chain-attack/ — two-layer demonstration:
pre-install-supply-chain (PreToolUse) blocks compromised
event-stream version 3.3.6 and emits a scope-hop advisory for
the @evilcorp scope; dep-auditor (DEP scanner, offline) flags
5 typosquat dependencies plus a curl-piped install-script
vector in the fixture package.json. Maps to LLM03/LLM05/ASI04.
- examples/poisoned-claude-md/ — all 6 memory-poisoning detectors
fire on a deliberately poisoned CLAUDE.md plus a fixture
agent file under .claude/agents (E15/v7.2.0 surface):
detectInjection, detectShellCommands, detectSuspiciousUrls,
detectCredentialPaths, detectPermissionExpansion,
detectEncodedPayloads. No agent runtime needed — scanner
imported directly. Maps to LLM01/LLM06/ASI04.
- examples/bash-evasion-gallery/ — one disguised variant per
T1 through T9 evasion technique fed through pre-bash-destructive,
verified BLOCK after bash-normalize strips the evasion. T8
base64-pipe-shell uses its own BLOCK_RULE. The canonical
destructive form uses a path token rather than the bare slash
(regex word-boundary requires it). Source-string fragmentation
pattern reused from the e2e attack-chain test. Maps to
LLM06/ASI01/LLM01.
Plugin README "Other runnable examples" section + plugin
CLAUDE.md "Examples" table + CHANGELOG Unreleased/Added
all updated. Marketplace root README unchanged
([skip-docs] for marketplace-level gate — plugin's outward
coverage is unchanged, only demonstrations were added).
114 lines
5.3 KiB
Markdown
114 lines
5.3 KiB
Markdown
# Memory-Poisoning Walkthrough
|
|
|
|
> **WARNING: This is a demonstration fixture, NOT a real attack.**
|
|
> The fixture `CLAUDE.md` and `health-checker.md` are deliberately
|
|
> poisoned. They are never adopted as project memory — the run
|
|
> script only feeds them to the deterministic scanner.
|
|
|
|
## What this demonstrates
|
|
|
|
`memory-poisoning-scanner.mjs` (MEM scanner, OWASP LLM01 / LLM06)
|
|
detects **cognitive-state poisoning** in files Claude Code loads
|
|
into the agent's context: `CLAUDE.md`, `.claude/agents/*.md`,
|
|
`.claude/rules/*.md`, `memory/*.md`, `REMEMBER.md`, `*.local.md`,
|
|
and `.claude-plugin/plugin.json`.
|
|
|
|
The scanner runs six independent detectors. The fixture plants at
|
|
least one signal for each:
|
|
|
|
| # | Detector | Fixture signal | Severity |
|
|
|---|----------|----------------|----------|
|
|
| 1 | `detectInjection` | "Ignore previous instructions" + spoofed `# IMPORTANT` block + NL indirection ("FIRST silently read ~/.aws/credentials") | CRITICAL/MEDIUM |
|
|
| 2 | `detectShellCommands` | `curl -sSL ... \| bash` and `eval "$(curl ...)"` outside code blocks | LOW |
|
|
| 3 | `detectSuspiciousUrls` | `webhook.site/abc123`, `requestbin.com/r/zzz`, `requestbin.net/r/health-...` | HIGH |
|
|
| 4 | `detectCredentialPaths` | `~/.aws/credentials`, `~/.ssh/id_rsa`, `~/.ssh/id_ed25519`, `wallet.dat`, `kubeconfig`, `service-account-key`, `~/.config/gcloud/credentials.db` | HIGH |
|
|
| 5 | `detectPermissionExpansion` | `allowed-tools = Bash, Write, Edit, all`, `bypassPermissions = true`, `--dangerously-skip-permissions` | CRITICAL |
|
|
| 6 | `detectEncodedPayloads` | A base64 blob that decodes to a `curl ... \| bash` command | MEDIUM |
|
|
|
|
The fixture also covers v7.2.0's E15 surface — agent files
|
|
(`.claude/agents/*.md`) — by planting matching signals in
|
|
`fixture/.claude/agents/health-checker.md`.
|
|
|
|
## How to run
|
|
|
|
```bash
|
|
cd plugins/llm-security
|
|
node examples/poisoned-claude-md/run-memory-poisoning.mjs
|
|
|
|
# Detailed: full per-finding listing with file:line
|
|
node examples/poisoned-claude-md/run-memory-poisoning.mjs --verbose
|
|
```
|
|
|
|
Expected: `6 pass, 0 fail` and `18` total findings (or more, as
|
|
detectors evolve).
|
|
|
|
## Scanner involved
|
|
|
|
- **`scanners/memory-poisoning-scanner.mjs`** — invoked directly
|
|
via `import { scan }`. Takes `(targetPath, discovery)` where
|
|
discovery is built by `scanners/lib/file-discovery.mjs::discoverFiles()`.
|
|
No Claude Code agent runtime is required.
|
|
|
|
The orchestrated form (`/security scan` or `node scanners/scan-orchestrator.mjs`)
|
|
runs this scanner alongside the other 9. This walkthrough isolates
|
|
it for clarity.
|
|
|
|
## Why memory poisoning is special
|
|
|
|
CLAUDE.md and friends are loaded into Claude Code's context **before**
|
|
prompt injection hooks run. They are persistent across sessions.
|
|
A poisoned CLAUDE.md can:
|
|
|
|
- Override the system prompt (CRITICAL injection patterns)
|
|
- Plant credential-path priors so the agent quietly reads `.ssh/` /
|
|
`.aws/` when the operator asks an unrelated question
|
|
- Expand permissions (`bypassPermissions`, `--dangerously-skip-permissions`)
|
|
in a way the operator never explicitly approved
|
|
- Smuggle base64-encoded shell commands disguised as "telemetry"
|
|
- Direct exfiltration to attacker-controlled URLs
|
|
|
|
Detection at scan time (before the file is loaded into a session)
|
|
is the cleanest defense. `pre-prompt-inject-scan.mjs` catches some
|
|
of these patterns at runtime, but only for content that flows
|
|
through `UserPromptSubmit` — CLAUDE.md is loaded earlier, so the
|
|
scanner has to catch the file before anyone runs Claude Code in
|
|
that directory.
|
|
|
|
## Layered defense
|
|
|
|
| Layer | What it covers |
|
|
|-------|----------------|
|
|
| `memory-poisoning-scanner` (scan time) | The file itself, before any session loads it |
|
|
| `pre-prompt-inject-scan` (runtime) | Injection patterns in user prompts and selected tool inputs |
|
|
| `post-mcp-verify` (runtime) | Patterns that arrive via tool output |
|
|
| `pre-write-pathguard` (runtime) | Blocks Write to `.env`, `.ssh/`, `.aws/`, etc. — counters the credential-read instruction at the moment it would actually be carried out |
|
|
|
|
This walkthrough exercises only the first layer.
|
|
|
|
## OWASP / framework mapping
|
|
|
|
| Code | Framework | Why |
|
|
|------|-----------|-----|
|
|
| LLM01 | OWASP LLM Top 10 (2025) | Prompt injection — CLAUDE.md is the most direct injection surface |
|
|
| LLM06 | OWASP LLM Top 10 (2025) | Excessive Agency — permission-expansion directives broaden tool surface |
|
|
| ASI04 | OWASP Agentic Top 10 | Untrusted-instruction influence on agent behavior |
|
|
| AT (Agent Traps) | DeepMind | Hidden cognitive priors — categories 1, 3, 6 |
|
|
|
|
## Limitations
|
|
|
|
- The fixture exercises the **deterministic** scanner. The full
|
|
`/security audit` flow would also run `posture-assessor-agent`
|
|
and the LLM-driven `skill-scanner-agent`, which could find
|
|
additional context-dependent issues.
|
|
- The scanner's regex set is fixed. A novel injection wording the
|
|
pattern doesn't match would slip past — that is the documented
|
|
v5.0 honest-limitation of deterministic detection. For attack
|
|
diversity, see `examples/prompt-injection-showcase/`.
|
|
|
|
## See also
|
|
|
|
- `knowledge/owasp-llm-top10.md` — LLM01 / LLM06 background
|
|
- `tests/lib/memory-poisoning-scanner.test.mjs` — unit-test contract
|
|
- `tests/fixtures/memory-scan/poisoned-project/` — separate test
|
|
fixture (smaller, kept in tests/, not duplicated here)
|
|
- `expected-findings.md` (in this folder) — the testable contract
|