chore(release): bump to v7.3.0

Batch C release. Closes 12 implementation tasks (E3, E8-E14, 8.4, 8.6,
8.7, 8.10) across four execution waves: A (bash + decoder), B (supply
chain + workflow scanner), C (MCP cumulative drift), D (code quality).

Wave E (9 new attack-simulator scenarios for the new defenses) deferred
to v7.3.1 — defenses are unit-tested per wave; the deferred work adds
attack-simulator regression coverage on top, not the primary safety net.

Tests: 1665+ → 1777 (Wave A-D cumulative, +112).

Version sync targets touched:
- package.json
- .claude-plugin/plugin.json
- CLAUDE.md (header)
- README.md (badge + new release-history row)
- scanners/ide-extension-scanner.mjs (VERSION constant)
- ../../README.md (marketplace root plugin entry)
- CHANGELOG.md (new [7.3.0] section per Keep a Changelog, all 12 task
  IDs covered individually under Added/Changed/Documentation/Tests/Notes)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Kjell Tore Guttormsen 2026-05-01 05:28:45 +02:00
commit c4183b8b4d
7 changed files with 186 additions and 7 deletions


@ -26,7 +26,7 @@ Then open Claude Code and type `/plugin` to browse and install plugins from the
## Plugins
### [LLM Security](plugins/llm-security/) `v7.2.0`
### [LLM Security](plugins/llm-security/) `v7.3.0`
Security scanning, auditing, and threat modeling for agentic AI projects.


@ -1,5 +1,5 @@
{
"name": "llm-security",
"description": "Security scanning, auditing, and threat modeling for Claude Code projects. Detects secrets, validates MCP servers, assesses security posture, and generates threat models aligned with OWASP LLM Top 10.",
"version": "7.2.0"
"version": "7.3.0"
}


@ -4,6 +4,184 @@ All notable changes to the LLM Security Plugin are documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
## [7.3.0] - 2026-05-01
Batch C release. Closes 12 implementation tasks (E3, E8-E14, 8.4, 8.6,
8.7, 8.10) across four execution waves: Wave A (bash evasion + decoder),
Wave B (supply chain + workflow scanner), Wave C (MCP cumulative drift),
Wave D (code quality). Wave E (9 new attack-simulator scenarios for the
new defenses) deferred to v7.3.1 — the defenses themselves are unit-tested
per wave; the deferred work adds attack-simulator regression coverage on
top.
### Added
- **E8 — T7 process-substitution normalization** in
`scanners/lib/bash-normalize.mjs`. Collapses `<(cmd)` and `>(cmd)`
process-substitution wrappers so the inner command name is surfaced
to downstream destructive-command name matchers in
`pre-bash-destructive.mjs`. Defends against split-command evasion.
Nested wrappers handled up to depth 3. Single-quoted literals
masked before T7 runs to avoid corrupting string content.
- **E10 — T9 eval-via-variable normalization** in
`scanners/lib/bash-normalize.mjs`. Substitutes one-level variable
assignments before destructive-name matching. One-level forward-flow
only: chained-var attacks intentionally not followed (documented
limit). Bare-form, curly-form, and double-quoted forms supported;
single-quoted literals preserved.
- **E9 — T8 base64-pipe-shell BLOCK rule** in
`hooks/scripts/pre-bash-destructive.mjs`. Direct match on the
base64-decode-pipe-into-shell loader idiom — blocks the
encoded-payload runner pattern that bypasses static name-matching by
delivering the destructive command as encoded text.
- **E3 — rot13 layer for hidden-imperative comment-block detection**
in `scanners/lib/injection-patterns.mjs`. The decoder is bounded
in length to keep accidental rot13-look-alike short strings out of
scope. Base64/hex/URL/HTML decoding is already done by
`normalizeForScan`; the rot13 pass is the only genuinely new layer.
- **E12 — `.gitattributes` filter/diff/merge driver advisory** in
`scanners/lib/git-clone.mjs`. New `scanGitAttributes(repoDir)`
exported helper plus post-clone integration in the `clone` CLI
branch — surfaces filter, diff, and merge driver directives as
MEDIUM advisories so downstream consumers see the supply-chain
surface that survives even a sandboxed clone.
- **E13 — npm scope-hopping typosquat detection** in
`hooks/scripts/pre-install-supply-chain.mjs`. New shared
`NPM_OFFICIAL_SCOPES` export from `scanners/lib/supply-chain-data.mjs`.
When an install targets `@<scope>/<name>` where `<scope>` is unknown
but `<name>` matches a popular unscoped package, the hook emits a
MEDIUM advisory. Allowlist of legitimate scopes drives suppression.
Configurable via `policy.json` `supply_chain.allowed_scopes`.
- **E11 — workflow-injection scanner** (`scanners/workflow-scanner.mjs`).
Scans `.github/workflows/*.{yml,yaml}` and `.forgejo/workflows/*.{yml,yaml}`
for dangerous expression interpolations inside `run:` step blocks.
23-field canonical blacklist (GitHub Security Lab 17 + GlueStack-class
6) targeting attacker-controlled fields. Sink-restricted: only
`run:` steps are shell sinks; `if:`, `with:`, `env:`, `name:`,
`runs-on:` are evaluated by the runner's expression engine, not the
shell, and are suppressed. Severity matrix: privileged triggers →
HIGH; semi-privileged → MEDIUM; safe fields (numeric / hex /
fixed-string) → INFO. State machine extracted to
`scanners/lib/workflow-yaml-state.mjs` for unit-level testability.
Re-interpolation tracking — env-block bindings sourced from
blacklisted fields, then read back inside `run:`, are flagged at
MEDIUM as the Appsmith GHSL-2024-277 stealth pattern. Auth-bypass
detection — `(github|forgejo).actor` compared against bot
identities in `if:` conditions flagged at MEDIUM (Synacktiv 2023
Dependabot spoofing class). New `WFL` prefix in
`scanners/lib/severity.mjs` OWASP map. Registered in
`scanners/scan-orchestrator.mjs`.
- **E14 — MCP cumulative-drift baseline** in
`scanners/lib/mcp-description-cache.mjs`. Sticky `baseline` slot per
tool plus a 10-event rolling `history` array (FIFO). Cumulative
drift = `levenshtein(current, baseline.description) / max(|current|,
|baseline|)`; when ratio ≥ `mcp.cumulative_drift_threshold`
(default 0.25), `post-mcp-verify.mjs` emits a MEDIUM
`mcp-cumulative-drift` advisory independent of the existing
per-update >10% drift signal — both fire independently. Slow-burn
rug-pulls that keep each update under the per-update threshold but
cumulatively diverge from baseline are now caught. Baseline survives
the 7-day TTL purge so detection persists across the full window.
New `/security mcp-baseline-reset` slash command (plus
`scanners/mcp-baseline-reset.mjs` CLI: `--list`, `--target <tool>`,
or no-args clear-all) lets the user acknowledge a legitimate MCP
server upgrade. New `LLM_SECURITY_MCP_CACHE_FILE` env var overrides
the cache path for end-to-end testing without polluting the user's
real `~/.cache/llm-security/mcp-descriptions.json`. Migration logic
in `loadCache()` seeds `baseline` from existing entries on first
read post-upgrade.
- **8.7 — env-var deprecation warnings** in
`scanners/lib/policy-loader.mjs`. New `getPolicyValueWithEnvWarn(section,
key, envVarName, defaultValue)` helper. Env-var still wins per
existing Preferences, but when BOTH the env-var AND the
`policy.json` key are explicitly set, the helper emits a single
per-process stderr deprecation line pointing to v8.0.0 removal.
Module-scoped `Set` dedupes per env-var name across call-sites.
`DEFAULT_POLICY` gains `trifecta.escalation_window: 5` (closes the
gap where `LLM_SECURITY_ESCALATION_WINDOW` had no `policy.json`
equivalent). Wired through 4 hook call-sites:
`pre-prompt-inject-scan`, `post-session-guard` (×2), and
`audit-trail`. Env-only vars without `policy.json` equivalents are
unchanged.
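The cumulative-drift ratio from the E14 entry above can be sketched as follows. This is a minimal illustration of the stated formula, not the code in `mcp-description-cache.mjs`: the function names `cumulativeDriftRatio` and `exceedsCumulativeDrift` are hypothetical, and treating two empty descriptions as zero drift is an assumption.

```javascript
// Classic dynamic-programming Levenshtein distance (two-row variant).
function levenshtein(a, b) {
  if (a === b) return 0;
  if (a.length === 0) return b.length;
  if (b.length === 0) return a.length;
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical helper: edit distance divided by the longer description's length.
function cumulativeDriftRatio(current, baselineDescription) {
  const longest = Math.max(current.length, baselineDescription.length);
  if (longest === 0) return 0; // assumption: two empty descriptions mean no drift
  return levenshtein(current, baselineDescription) / longest;
}

// 0.25 is the default the entry gives for mcp.cumulative_drift_threshold.
const DEFAULT_THRESHOLD = 0.25;

// Hypothetical helper: true when the ratio reaches the threshold (>=, per the entry).
function exceedsCumulativeDrift(current, baselineDescription, threshold = DEFAULT_THRESHOLD) {
  return cumulativeDriftRatio(current, baselineDescription) >= threshold;
}
```

Because the comparison is always against the sticky baseline rather than the previous description, a series of small edits that each stay under the per-update threshold still accumulates toward this ratio.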
### Changed
- **8.10 — CLAUDE.md hooks count corrected** from `## Hooks (8)` to
`## Hooks (9)`. Adds `pre-compact-scan.mjs` row to the hooks table
(PreCompact — transcript scan before context compaction). The hook
itself shipped in v6.2.0 but the count and table row drifted. New
`Hooks count consistency` `describe` block in
`tests/lib/doc-consistency.test.mjs` parses `hooks/hooks.json`,
reads the CLAUDE.md `## Hooks (\d+)` header and the table rows,
and asserts all three counts agree — locks in the fix and prevents
future drift.
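The three-way count check described in 8.10 could look roughly like the sketch below. It is not the test in `doc-consistency.test.mjs`: both function names are hypothetical, the shape assumed for `hooks/hooks.json` (event names mapping to arrays of `{ hooks: [{ command }] }` entries) is an assumption, and so is the rule that every hook table row names a `.mjs` script.

```javascript
// Read the N in "## Hooks (N)" and count the table rows in that section.
function countDocumentedHooks(markdown) {
  const lines = markdown.split('\n');
  const headerIdx = lines.findIndex((l) => /^## Hooks \(\d+\)/.test(l));
  if (headerIdx === -1) throw new Error('no "## Hooks (N)" header found');
  const headerCount = Number(lines[headerIdx].match(/\((\d+)\)/)[1]);
  let rowCount = 0;
  for (const line of lines.slice(headerIdx + 1)) {
    if (/^#{1,6} /.test(line)) break; // the next heading ends the section
    // Assumption: each hook row is a table line that names a .mjs script.
    if (/^\|.*\.mjs/.test(line)) rowCount++;
  }
  return { headerCount, rowCount };
}

// Count distinct hook scripts referenced from a parsed hooks.json object.
// Assumption: { hooks: { EventName: [ { hooks: [ { command } ] } ] } }.
function countRegisteredHooks(hooksJson) {
  const scripts = new Set();
  for (const entries of Object.values(hooksJson.hooks ?? {})) {
    for (const entry of entries) {
      for (const hook of entry.hooks ?? []) {
        const m = /([\w-]+\.mjs)/.exec(hook.command ?? '');
        if (m) scripts.add(m[1]);
      }
    }
  }
  return scripts.size;
}
```

A consistency test would then assert that `headerCount`, `rowCount`, and the registered count are all equal, so a hook added to one place but not the others fails the suite.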
### Documentation
- **8.4 — `riskScoreV1` annotated `@deprecated`** in
`scanners/lib/severity.mjs`. JSDoc explicitly tags v7.0.0 as the
introduction of the v2 model and v8.0.0 as the removal target for
v1, so library consumers see the deprecation in IDE tooling and
not just in release notes. The function remains exported and
functional for users who relied on it.
- **8.6 — sandbox-architecture rationale** in
`docs/security-hardening-guide.md` §7. Documents why
`lib/git-clone.mjs` and `lib/vsix-sandbox.mjs` remain separate
rather than being collapsed into a single shared sandbox helper.
Brief `Preferences` explicitly rejected the consolidation as
premature abstraction over safety-critical code; the rationale is
recorded so future maintainers see the deliberate decision.
### Tests
- 1665+ → 1777 (Wave A-D cumulative; ~+112 tests). Includes new
files (`tests/scanners/bash-normalize-t7-t9.test.mjs`,
`tests/lib/git-clone-gitattributes.test.mjs`,
`tests/scanners/workflow-scanner.test.mjs`,
`tests/lib/workflow-yaml-state.test.mjs`,
`tests/scanners/mcp-baseline-reset.test.mjs`) plus extensions to
`tests/lib/injection-patterns.test.mjs`,
`tests/hooks/pre-bash-destructive.test.mjs`,
`tests/hooks/pre-install-supply-chain.test.mjs`,
`tests/scanners/scan-orchestrator.test.mjs`,
`tests/lib/mcp-description-cache.test.mjs`,
`tests/hooks/post-mcp-verify.test.mjs`,
`tests/lib/severity.test.mjs`,
`tests/lib/policy-loader.test.mjs`,
`tests/lib/doc-consistency.test.mjs`. One pre-existing
size-cap timing flake at `tests/hooks/pre-compact-scan.test.mjs`
passes in isolation, fails sporadically under full-suite load —
unchanged across Wave A-D, not a Batch C blocker.
### Notes
- **Wave E deferred (red-team coverage).** The plan called for 9 new
attack-simulator scenarios covering every Wave A-D defense. The
work was deferred from v7.3.0 because two of the scenarios test
scanners (workflow-scanner, git-clone `scanGitAttributes`) that
don't fit the existing hook-spawn model used by attack-simulator
and would have required a new `scanner_test` execution mode.
Tracked for v7.3.1. Defenses are unit-tested per wave; this is
regression coverage on top of unit coverage, not the primary
safety net.
- **Hooks runtime behavior unchanged for existing setups.** Every
Wave A-D addition is either purely additive (new advisories at
MEDIUM) or layered before existing detection (T7/T9 normalize
before existing destructive-name matching; rot13 inside the
existing decoder loop; cumulative-drift independent of per-update
drift). Users who set neither the new `policy.json` keys nor the
new env-vars see identical behavior.
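The "normalize before matching" layering in the note above can be illustrated with a simplified take on the T7 step from E8. This is a sketch under stated assumptions, not the code in `bash-normalize.mjs`: `collapseProcessSubstitution` is a hypothetical name, single-quoted literals are masked with NUL-delimited placeholders as one possible masking scheme, and wrappers whose body contains other parentheses (for example `$(...)`) are deliberately left alone.

```javascript
const MAX_DEPTH = 3; // nesting depth named in the E8 entry

// Hypothetical helper: unwrap <(cmd) and >(cmd) so the inner command is visible.
function collapseProcessSubstitution(command) {
  // Mask single-quoted literals so their content is never rewritten.
  const literals = [];
  let text = command.replace(/'[^']*'/g, (m) => {
    literals.push(m);
    return `\u0000${literals.length - 1}\u0000`;
  });
  // Each pass unwraps the innermost wrappers, so three passes cover depth 3.
  const wrapper = /[<>]\(([^()]*)\)/g;
  for (let depth = 0; depth < MAX_DEPTH; depth++) {
    const next = text.replace(wrapper, ' $1 ');
    if (next === text) break;
    text = next;
  }
  // Restore the masked literals unchanged.
  return text.replace(/\u0000(\d+)\u0000/g, (_, i) => literals[Number(i)]);
}

// Usage: a name matcher that would miss the wrapped form sees the inner command.
const normalized = collapseProcessSubstitution('diff <(cat <(rm -rf a)) b');
console.log(/\brm\b/.test(normalized));
```

Running the normalization first means the existing name matchers do not need to know about process substitution at all, which is why the note can describe the change as layered in front of existing detection.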
## [7.2.0] - 2026-04-29
Batch B release. Closes the remaining critical-review B-tier scanner


@ -1,6 +1,6 @@
# LLM Security Plugin (v7.2.0)
# LLM Security Plugin (v7.3.0)
Security scanning, auditing, and threat modeling for Claude Code projects. 5 frameworks: OWASP LLM Top 10, Agentic AI Top 10 (ASI), Skills Top 10 (AST), MCP Top 10, AI Agent Traps (DeepMind). 1665+ unit and integration tests; mutation-testing coverage not published.
Security scanning, auditing, and threat modeling for Claude Code projects. 5 frameworks: OWASP LLM Top 10, Agentic AI Top 10 (ASI), Skills Top 10 (AST), MCP Top 10, AI Agent Traps (DeepMind). 1777+ unit and integration tests; mutation-testing coverage not published.
**v7.0.0 — Severity-dominated risk scoring (v2 model, BREAKING).** Three changes target the false-positive cascade on real codebases (hyperframes.com gave `BLOCK / Extreme / 100`, ~70% noise):


@ -6,7 +6,7 @@
*AI-generated: all code produced by Claude Code through dialog-driven development. [Full disclosure →](../../README.md#ai-generated-code-disclosure)*
![Version](https://img.shields.io/badge/version-7.2.0-blue)
![Version](https://img.shields.io/badge/version-7.3.0-blue)
![Platform](https://img.shields.io/badge/platform-Claude_Code_Plugin-purple)
![Agents](https://img.shields.io/badge/agents-6-orange)
![Scanners](https://img.shields.io/badge/scanners-22-cyan)
@ -848,6 +848,7 @@ This plugin provides full-stack security hardening (static analysis + supply cha
| Version | Date | Highlights |
|---------|------|------------|
| **7.3.0** | 2026-05-01 | **Batch C release.** Closes 12 implementation tasks (E3, E8-E14, 8.4, 8.6, 8.7, 8.10) across four execution waves. **Added (Wave A — bash + decoder):** T7 process-substitution and T9 eval-via-variable normalizations in `scanners/lib/bash-normalize.mjs`; T8 base64-pipe-shell BLOCK rule in `pre-bash-destructive.mjs`; rot13 layer for hidden-imperative comment-block detection in `injection-patterns.mjs`. **Added (Wave B — supply chain + workflow scanner):** `scanGitAttributes()` post-clone advisory for filter/diff/merge driver directives in `scanners/lib/git-clone.mjs` (E12); npm scope-hop typosquat detection with allowlist in `pre-install-supply-chain.mjs` and shared `NPM_OFFICIAL_SCOPES` in `scanners/lib/supply-chain-data.mjs` (E13); new `scanners/workflow-scanner.mjs` for GitHub Actions and Forgejo Actions injection (`${{ <dangerous-field> }}` inside `run:` blocks, with re-interpolation tracking and Synacktiv-class `actor == bot[bot]` auth-bypass detection); state machine extracted to `scanners/lib/workflow-yaml-state.mjs`; `WFL` prefix added to `severity.mjs` OWASP map; orchestrator registration. **Added (Wave C — MCP cumulative drift, E14):** sticky baseline slot per tool plus 10-event rolling history in `scanners/lib/mcp-description-cache.mjs`; cumulative-drift advisory (MEDIUM, `mcp-cumulative-drift`) when Levenshtein ratio between current and baseline ≥ `mcp.cumulative_drift_threshold` (default 0.25); baseline survives 7-day TTL purge so slow-burn rug-pulls are caught; `clearBaseline()` exposed; new `/security mcp-baseline-reset` slash command + `scanners/mcp-baseline-reset.mjs` CLI; `LLM_SECURITY_MCP_CACHE_FILE` env var for end-to-end testing. **Changed (Wave D — code quality):** `riskScoreV1` annotated `@deprecated` with v8.0.0 removal target (8.4); `docs/security-hardening-guide.md` §7 documents the sandbox-architecture rationale (8.6 — descoped to documentation only, no code consolidation); new `getPolicyValueWithEnvWarn()` helper in `policy-loader.mjs` emits a one-time-per-process stderr deprecation line when both an env-var AND its `policy.json` equivalent are explicitly set (8.7) — wired through `pre-prompt-inject-scan` (`LLM_SECURITY_INJECTION_MODE`), `post-session-guard` (`LLM_SECURITY_TRIFECTA_MODE`, `LLM_SECURITY_ESCALATION_WINDOW`), and `audit-trail` (`LLM_SECURITY_AUDIT_LOG`); `DEFAULT_POLICY` gains `trifecta.escalation_window: 5`; CLAUDE.md hooks count corrected to 9 with `pre-compact-scan` row added, plus a new `Hooks count consistency` test in `doc-consistency.test.mjs` (8.10). **Notes:** Wave E (9 new attack-simulator scenarios for E3/E8/E9/E10/E11/E12/E13/E14) deferred to v7.3.1 — defenses are unit-tested per wave; the deferred work adds attack-simulator regression coverage on top. **Tests:** 1665+ → 1777 (Wave A-D cumulative). |
| **7.2.0** | 2026-04-29 | **Batch B release.** Closes the remaining critical-review B-tier scanner defects (B3, B5, B6, B7) and lands the v7.2.0 evasion-arsenal hardening patches (E1, E4, E5, E7, E15, E16, E17, E18). **Added:** B6 destructuring/spread taint propagation in `taint-tracer.mjs`; B7 token-overlap typosquat fallback in `string-utils.mjs`/`dep-auditor`/`supply-chain-recheck`; E15 `.claude/agents/*.md` glob in `memory-poisoning-scanner`; E1 PUA-A/PUA-B coverage in `containsUnicodeTags`; E16 `foldHomoglyphs` (Cyrillic/Greek → Latin via NFKC) before every pattern match in `scanForInjection` (with ASCII fast-path); E17 `LLM_SECURITY_ESCALATION_WINDOW` env-var + 20-call MEDIUM secondary advisory in `post-session-guard`; E4 markdown link-title scan, E5 SVG `<desc>/<title>/<metadata>/<foreignObject>` extractor, E7 generalized HTML comment scan in `post-mcp-verify`. **Changed:** B5 entropy two-stage pipeline — new `classifyFileContext` in `entropy-scanner.mjs` gates rules 11-13 (GLSL/CSS-in-JS/inline-markup line-proximity) on `context !== 'code-dominant'`, ending the v7.0.0 polyglot false-negative class while preserving existing behaviour for short single-line fixtures. E18 entropy rule 18 — `MARKDOWN_IMAGE_CDN_HOSTS` allowlist + secret-in-query pre-check; non-CDN hosts and CDN URLs carrying secret-shaped query tokens fall through to entropy classification. v1 → v2 risk-formula constants (BLOCK ≥65, WARNING ≥15) unified across `commands/scan.md`, `commands/audit.md`, `agents/mcp-scanner-agent.md`, `agents/posture-assessor-agent.md` with a `tests/lib/doc-consistency.test.mjs` drift-guard. **Documentation:** B3 `info` severity is scoring-inert — documented in `severity.mjs` JSDoc and CLAUDE.md. **Red team:** 8 new attack scenarios (UNI-007/008/009, MCP-005/006/007/008, TRI-004); attack-simulator 64 → 72, 100 % pass. **Tests:** 1522 → 1665+ (Wave 1-6 cumulative). |
| **7.1.0** | 2026-04-29 | **Critical-review patch.** Closes the highest-impact items from the v7.0.0 adversarial review (`docs/critical-review-2026-04-20.md`, grade B-). Bug-fixes + documentation honesty-sweep, no new features. **Fixed:** (1) `pre-write-pathguard.mjs` regex hole — `.env.production.local.backup`, `.env.prod.local.bak`, etc. could be written. New regex `/[\\/]\.env(\.[A-Za-z0-9._-]+)*$/` covers arbitrary multi-segment suffixes; `.envrc` still allowed. (2) `post-session-guard.mjs`: `LLM_SECURITY_TRIFECTA_MODE=block` only blocked when trifecta was MCP-concentrated or hit a sensitive path; distributed trifectas across MCP servers were advisory-only. AND-gate removed. (3) `scanners/lib/severity.mjs` JSDoc + CHANGELOG arithmetic — `riskScore({critical: 4})` is 93, not 90 (computation always was). **Changed (honesty-sweep, critical-review §9):** "Trustworthy scoring" → "Severity-dominated risk scoring (v2 model)"; "Context-aware entropy scanner" → "Rule-based entropy scanner with file-extension skip, 8 line-level suppression rules, and configurable policy"; "1487 tests" → "1511 unit and integration tests; mutation-testing coverage not published"; "Fully Schrems II compatible" → "Schrems II compatible in default offline mode. Optional OSV.dev enrichment (`supply-chain-recheck --online`) transmits package identifiers to a Google-operated API and is a separate compliance consideration"; "Rule of Two enforcement" → "Rule of Two detection (configurable; default `warn`; blocks on high-confidence trifectas in opt-in `block` mode; distributed trifectas detected but not blocked by default)"; "Hardened ZIP extractor" → suffix " — no fuzz-testing results published to date"; "defense-in-depth" preserved but quantified in `docs/security-hardening-guide.md` §4: "three independent detection layers with documented bypass classes". **CaMeL claim toned down:** `post-session-guard.mjs:646` and `CLAUDE.md:184` now describe the implementation honestly — opportunistic byte-matching of truncated output fingerprints (first 200 bytes, SHA-256/16-hex tag); not semantic data-flow tracking; trivially bypassed by mutation, summarisation, or re-encoding. Inspired by CaMeL (DeepMind 2025) but not a CaMeL capability-tracking implementation. **Tests:** +24 (+8 pathguard multi-segment + 1 distributed-trifecta + 15 verdict/riskBand co-monotonicity sweep + 1 `riskScore({critical: 4}) === 93` anchor). 1511 tests (was 1487). All green. **Why:** the critical-review CISO perspective (§F) flagged overclaim language as a blocker for regulated environments — toning it down does not weaken the actual defenses; it lets users trust the documentation. |
| **7.0.0** | 2026-04-19 | **Trustworthy scoring (BREAKING).** Three changes target the false-positive cascade on real codebases (scan of hyperframes.com gave `BLOCK / Extreme / 100` with ~70% noise). **1. Risk-score v2** (`scanners/lib/severity.mjs`) — severity-dominated, log-scaled within tier. Replaces sum-and-cap that collapsed every non-trivial scan to 100/Extreme. Tiers: critical → 70-95, high only → 40-65, medium only → 15-35, low only → 1-11. Verdict cutoffs realigned (BLOCK ≥65, WARNING ≥15) for band co-monotonicity. **2. Context-aware entropy scanner** — file-extension skip (`.glsl/.frag/.vert/.shader/.wgsl/.css/.scss/.sass/.less/.svg/.min.*/.map`) + 8 new line-suppression rules (GLSL keywords, CSS-in-JS templates, inline SVG, ffmpeg `filter_complex`, User-Agent strings, SQL DDL on dedicated lines, `throw new Error(\`...\`)`, markdown image URLs). Configurable via `.llm-security/policy.json` `entropy` section (thresholds, `suppress_extensions`, `suppress_line_patterns`, `suppress_paths`). Envelope `calibration` block reports skip counters + effective thresholds + policy source. **3. DEP typosquat allowlist expansion** — 22 npm + 5 PyPI entries for short-name tools that tripped Levenshtein on every modern codebase (`knip`, `oxlint`, `tsx`, `nx`, `rimraf`, `uv`, `ruff`, etc.). Synthesizer "Scan Calibration" section + "never override verdict" rule added. Legacy `riskScoreV1()` kept for reference. **CI pipelines with `--fail-on` thresholds may need recalibration.** 1487 tests (was 1461). |


@ -1,6 +1,6 @@
{
"name": "llm-security",
"version": "7.2.0",
"version": "7.3.0",
"description": "Security scanning, auditing, and threat modeling for Claude Code projects",
"type": "module",
"bin": {


@ -49,7 +49,7 @@ import { scan as scanTaint } from './taint-tracer.mjs';
import { scan as scanMemoryPoisoning } from './memory-poisoning-scanner.mjs';
import { scan as scanSupplyChain } from './supply-chain-recheck.mjs';
const VERSION = '7.2.0';
const VERSION = '7.3.0';
const SCANNER = 'IDE';
// ---------------------------------------------------------------------------