Kjell Tore Guttormsen ca5a8cec67 feat(llm-security): add 3 more runnable threat examples [skip-docs]

Three new self-contained, runnable threat demonstrations under
examples/, continuing the batch started in 583a78c. Each example
has README.md + run-*.mjs + expected-findings.md and uses
state-isolation discipline so the user's real cache/state files
are never polluted.

- examples/supply-chain-attack/ — two-layer demonstration:
  pre-install-supply-chain (PreToolUse) blocks compromised
  event-stream version 3.3.6 and emits a scope-hop advisory for
  the @evilcorp scope; dep-auditor (DEP scanner, offline) flags
  5 typosquat dependencies plus a curl-piped install-script
  vector in the fixture package.json. Maps to LLM03/LLM05/ASI04.

- examples/poisoned-claude-md/ — all 6 memory-poisoning detectors
  fire on a deliberately poisoned CLAUDE.md plus a fixture
  agent file under .claude/agents (E15/v7.2.0 surface):
  detectInjection, detectShellCommands, detectSuspiciousUrls,
  detectCredentialPaths, detectPermissionExpansion,
  detectEncodedPayloads. No agent runtime needed — scanner
  imported directly. Maps to LLM01/LLM06/ASI04.

- examples/bash-evasion-gallery/ — one disguised variant per
  T1 through T9 evasion technique fed through pre-bash-destructive,
  verified BLOCK after bash-normalize strips the evasion. T8
  base64-pipe-shell uses its own BLOCK_RULE. The canonical
  destructive form uses a path token rather than the bare slash
  (regex word-boundary requires it). Source-string fragmentation
  pattern reused from the e2e attack-chain test. Maps to
  LLM06/ASI01/LLM01.

Plugin README "Other runnable examples" section + plugin
CLAUDE.md "Examples" table + CHANGELOG Unreleased/Added
all updated. Marketplace root README unchanged
([skip-docs] for marketplace-level gate — plugin's outward
coverage is unchanged, only demonstrations were added).

2026-05-05 15:01:20 +02:00


Expected Findings — Bash Evasion Gallery

This is the testable contract. run-evasion-gallery.mjs exits 0 only when every T-tag produces a BLOCK.

Per-tag contract

| Tag | Hook stderr must contain |
| --- | --- |
| baseline | `BLOCKED: Destructive command detected — Filesystem root destruction (rm -rf /)` |
| T1 | `BLOCKED: Destructive command detected — Filesystem root destruction (rm -rf /)` |
| T2 | `BLOCKED: Destructive command detected — Filesystem root destruction (rm -rf /)` |
| T3 | `BLOCKED: Destructive command detected — Filesystem root destruction (rm -rf /)` |
| T4 | `BLOCKED: Destructive command detected — Filesystem root destruction (rm -rf /)` |
| T5 | `BLOCKED: Destructive command detected — Filesystem root destruction (rm -rf /)` |
| T6 | `BLOCKED: Destructive command detected — Filesystem root destruction (rm -rf /)` |
| T7 | `BLOCKED: Destructive command detected — Filesystem root destruction (rm -rf /)` |
| T8 | `BLOCKED: Destructive command detected — T8 — base64-pipe-shell idiom (echo BLOB \| base64 -d \| sh)` |
| T9 | `BLOCKED: Destructive command detected — Filesystem root destruction (rm -rf /)` |

All cases must exit with code 2.
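
The contract above can be sketched as a small check. This is an illustrative helper, not code from run-evasion-gallery.mjs: `EXPECTED` and `meetsContract` are hypothetical names, and the T8 string assumes the `\|` in the table is only Markdown escaping for a literal `|`. Unlike the run script, which is described below as checking the exit code only, this sketch also checks the stderr text.

```javascript
// Expected stderr fragments, copied from the per-tag table.
const ROOT_RULE =
  'BLOCKED: Destructive command detected — Filesystem root destruction (rm -rf /)';
// Assumption: the table's "\|" is Markdown escaping for a plain pipe.
const T8_RULE =
  'BLOCKED: Destructive command detected — T8 — base64-pipe-shell idiom (echo BLOB | base64 -d | sh)';

// Hypothetical lookup: every tag maps to the root rule except T8.
const EXPECTED = { baseline: ROOT_RULE, T8: T8_RULE };
for (const n of [1, 2, 3, 4, 5, 6, 7, 9]) {
  EXPECTED[`T${n}`] = ROOT_RULE;
}

// A case meets the contract only if the hook exited with code 2 AND its
// stderr contains the expected text for that tag. Unknown tags fail.
function meetsContract(tag, { status, stderr }) {
  const needle = EXPECTED[tag];
  return needle !== undefined && status === 2 && stderr.includes(needle);
}

console.log(meetsContract('T3', { status: 2, stderr: `${ROOT_RULE}\n` }));
console.log(meetsContract('T8', { status: 2, stderr: ROOT_RULE }));
```

The first call reports a pass; the second reports a failure because T8 is expected to trip its own rule rather than the root-destruction rule.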

Why $HOME instead of /

The "Filesystem root destruction" BLOCK_RULE regex ends with a word-boundary anchor (\b) immediately after the path token (/, ~, or $HOME):

\brm\s+(?:-[a-zA-Z]*f[a-zA-Z]*\s+|--force\s+)*-[a-zA-Z]*r[a-zA-Z]*\s+(?:\/|~|\$HOME)\b

rm -rf / ends with / followed by end-of-string. Both / and end-of-string are non-word, so \b does not match there. The variants rm -rf /tmp, rm -rf $HOME, and rm -rf /etc all match: in /tmp and /etc the letter after the slash supplies the boundary, and in $HOME the final E does.

This gallery uses $HOME because it is unambiguously destructive and the regex fires deterministically. The literal rm -rf / edge case is not part of this contract — it is covered by Claude Code 2.1.98+ harness-level checks.
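
The word-boundary behaviour can be checked directly in Node. The regex below is copied from this section; the sample list and the `results` name are only for illustration.

```javascript
// Regex copied verbatim from the "Filesystem root destruction" rule above.
const ROOT_RULE_RE =
  /\brm\s+(?:-[a-zA-Z]*f[a-zA-Z]*\s+|--force\s+)*-[a-zA-Z]*r[a-zA-Z]*\s+(?:\/|~|\$HOME)\b/;

const samples = ['rm -rf /', 'rm -rf /tmp', 'rm -rf /etc', 'rm -rf $HOME'];

// Map each sample command to whether the rule fires on it.
const results = Object.fromEntries(
  samples.map((cmd) => [cmd, ROOT_RULE_RE.test(cmd)]),
);

console.log(results);
// 'rm -rf /'     -> false (no word character on either side of the final position)
// 'rm -rf /tmp'  -> true
// 'rm -rf /etc'  -> true
// 'rm -rf $HOME' -> true
```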

Side effects

- No file is modified.
- No real bash is invoked; the only process spawned is node hooks/scripts/...
- Each hook spawn has tool_input.command set to the disguised variant, so bash never sees these strings.
- No mutation of $HOME, /, /tmp, or anywhere else.
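
A minimal sketch of why no shell is involved: the command travels to the hook as JSON on stdin and is never executed. The stub hook, the `runHook` helper, and the `tool_name` field are assumptions made for illustration; only `tool_input.command` and exit code 2 come from this document, and the real pre-bash-destructive hook is not reproduced here.

```javascript
import { spawnSync } from 'node:child_process';

// Hypothetical stand-in for the real hook: reads the JSON payload from stdin
// and signals a block (exit code 2, message on stderr) when the command
// starts with "rm". The real hook's rules are far more thorough.
const STUB_HOOK = `
let raw = '';
process.stdin.on('data', (chunk) => { raw += chunk; });
process.stdin.on('end', () => {
  const { tool_input } = JSON.parse(raw);
  if (/^rm\\b/.test(tool_input.command)) {
    process.stderr.write('BLOCKED: stub rule');
    process.exitCode = 2;
  }
});
`;

// The command is plain data inside the payload; spawnSync runs node directly
// (no shell option), so the string is never interpreted by bash.
function runHook(command) {
  const payload = JSON.stringify({ tool_name: 'Bash', tool_input: { command } });
  const result = spawnSync(process.execPath, ['-e', STUB_HOOK], {
    input: payload,
    encoding: 'utf8',
  });
  return { status: result.status, stderr: result.stderr };
}

const blocked = runHook('rm -rf $HOME');
const allowed = runHook('ls -la');
console.log(blocked.status, allowed.status);
```

Swapping the stub for a path to an actual hook script would keep the same shape, provided that hook reads its payload from stdin as assumed here.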

Notes for forks

- If bash-normalize.mjs adds new T-tags (T10+), add a new case to CASES and a corresponding row to the per-tag table above.
- If a BLOCK_RULE in pre-bash-destructive.mjs is renamed, update the stderr-pattern column above. The stderr text is asserted only here in expected-findings.md, as documentation; the run script checks exit code 2 alone, so it keeps passing after a rename.
- The base64 blob in T8 (cm0gLXJmICRIT01F) decodes to the literal command rm -rf $HOME. If you change the canonical destructive target away from $HOME, regenerate the blob with printf '%s' '<new-cmd>' | base64
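
The T8 blob can also be inspected or regenerated from Node without touching a shell. A small sketch using the built-in `Buffer`; `encodeCommand` is a hypothetical helper name.

```javascript
// Blob quoted in the note above.
const blob = 'cm0gLXJmICRIT01F';

// Decode to see the literal command the blob carries.
const decoded = Buffer.from(blob, 'base64').toString('utf8');
console.log(decoded); // rm -rf $HOME

// Hypothetical helper: produce a replacement blob for a new canonical command.
function encodeCommand(cmd) {
  return Buffer.from(cmd, 'utf8').toString('base64');
}

// Round-trip check: re-encoding the decoded command yields the same blob.
console.log(encodeCommand(decoded) === blob); // true
```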
