Closes A2 of v7.1.0 critical-review patch (docs/critical-review-2026-04-20.md):
- B4 (severity JSDoc): 4 critical = 93, not 90. Fixed in scanners/lib/severity.mjs:23
and CHANGELOG.md v7.0.0 tier description. The actual computation has always been
93 (70 + log2(5)*10 = 93.22 → round); only the docs were wrong.
- §5.4 co-monotonicity: new sweep test in tests/lib/severity.test.mjs over 15
representative count vectors. Asserts that (verdict, riskBand) agree under the
v7.0.0 contract for every case — catches future drift between riskScore tiers,
verdict cutoffs, and riskBand cutoffs. Includes a B4 anchor test (riskScore
{critical: 4} === 93) so doc/code drift fails loudly.
- B8 (CaMeL claims toned down): post-session-guard.mjs:646 comment block and
CLAUDE.md:184 Defense Philosophy bullet now describe the implementation
honestly — opportunistic byte-matching of truncated output fingerprints
(first 200 bytes, SHA-256/16-hex), not semantic data-flow tracking.
Trivially bypassed by mutation, summarisation, or re-encoding. Inspired by
CaMeL (DeepMind 2025), but not a CaMeL capability-tracking implementation.
Tests: 1495 → 1511 (+16: 15 sweep cases + 1 B4 anchor). All green.
Changelog
All notable changes to the LLM Security Plugin are documented in this file.
The format is based on Keep a Changelog.
[7.0.0] - 2026-04-19
BREAKING CHANGES
- Risk-score formula rewritten (scanners/lib/severity.mjs). The v1 sum-and-cap formula (critical*25 + high*10 + medium*4 + low*1, capped at 100) collapsed every non-trivial scan to 100/Extreme regardless of actual risk distribution. v2 is severity-dominated and log-scaled within tier:
  - Critical present → 70–95 (1=80, 2=86, 4=93, 10=95)
  - High only → 40–65 (1=48, 5=60, 17=65)
  - Medium only → 15–35 (1=20, 5=28, 50=33)
  - Low only → 1–11 (1=4, 10=11)
  - None → 0

  Verdict cutoffs realigned to new bands: BLOCK if critical ≥1 or score ≥65, WARNING if high ≥1 or score ≥15. Legacy v1 formula kept as riskScoreV1() for reference only. CI pipelines with --fail-on thresholds may need recalibration; see docs/security-hardening-guide.md §6.
- Verdict/band cutoffs aligned for co-monotonicity. Old cutoffs (BLOCK ≥61, WARNING ≥21) could produce "BLOCK / Medium band" or "ALLOW / High band" contradictions. New cutoffs (65, 15) are locked to the v2 riskBand() boundaries.
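The critical tier and the verdict cutoffs listed above can be sketched in a few lines. This is a minimal illustration, not the code in scanners/lib/severity.mjs: the function names are hypothetical, the cap of 95 is inferred from the 70–95 band, and the other tiers are left out because their formulas are not stated here.

```javascript
// Hypothetical sketch of the v2 critical tier: 70 + log2(count + 1) * 10,
// rounded, then capped at the top of the 70-95 band (the cap is an assumption).
function criticalTierScore(criticalCount) {
  if (criticalCount < 1) return 0; // lower tiers are out of scope for this sketch
  const raw = 70 + Math.log2(criticalCount + 1) * 10;
  return Math.min(95, Math.round(raw));
}

// Verdict cutoffs as described in the entry: BLOCK first, then WARNING, else ALLOW.
function verdictFor({ critical = 0, high = 0, score = 0 }) {
  if (critical >= 1 || score >= 65) return "BLOCK";
  if (high >= 1 || score >= 15) return "WARNING";
  return "ALLOW";
}

console.log([1, 2, 4, 10].map(criticalTierScore)); // [ 80, 86, 93, 95 ]
```

Under these assumptions, 1, 2, 4 and 10 critical findings map to 80, 86, 93 and 95, which agrees with the anchors in the tier list.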
Added
- Context-aware entropy scanner (scanners/entropy-scanner.mjs). Skip-lists and line-level rules drastically reduce false positives in shader/CSS/HTML/SQL-heavy codebases:
  - File-extension skip: .glsl, .frag, .vert, .shader, .wgsl, .css, .scss, .sass, .less, .svg + compound .min.js, .min.css, .map
  - Line-level rules 11–18 in isFalsePositive(): GLSL keywords (uniform, vec3, texture2D...), CSS-in-JS templates (styled.), inline <svg> markup, ffmpeg filter_complex syntax, browser User-Agent strings, SQL DDL on dedicated lines (^\s*(SELECT|INSERT|UPDATE|DELETE|CREATE|...)), throw new Error(`…`) templates, markdown image syntax with external URLs (common in JSON content indexes)
  - Scanner envelope gains calibration block: files_skipped_by_extension, files_skipped_by_path, effective thresholds, and policy_source ('default' | 'policy.json')
- Policy-driven entropy configuration: the .llm-security/policy.json entropy section accepts:
  - thresholds.{critical,high,medium}.{entropy,minLen}: override defaults per project
  - suppress_extensions: string[] (additional file extensions to skip)
  - suppress_line_patterns: string[] (user-defined regexes for line suppression)
  - suppress_paths: string[] (substring match against relPath to skip entire paths, e.g., "vendored/")
- DEP typosquat allowlist expansion (knowledge/typosquat-allowlist.json). 22 npm + 5 PyPI entries for short-name modern tools that tripped Levenshtein detection on nearly every real codebase:
  - npm: knip, oxlint, tsx, nx, rimraf, glob, tar, zod, ky, ow, esm, ip, qs, url, prettier, vitest, vite, rollup, swc, turbo, bun, deno
  - PyPI: uv, ruff, rich, typer, anyio
- Synthesizer "Scan Calibration" section (agents/deep-scan-synthesizer-agent.md). Heuristic: omit if <5% files skipped, flag prominently if >80% skipped by path (signals over-aggressive user policy). Agent instructed to NEVER override scanner verdict with narrative opinion.
- 26 new unit tests (tests/scanners/entropy-context.test.mjs): A. File-extension skip (4), B. Line-level rules 11–18 (10), C. Policy overrides (3); plus expanded tests/lib/severity.test.mjs with v2 scoring/band/verdict tables (70 tests total, was 52). Total: 1487 tests (was 1461).
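As a rough illustration of how the policy-driven suppression keys above might be applied, the sketch below evaluates suppress_extensions, suppress_paths and suppress_line_patterns against a file path and a line. The policy values and helper names are hypothetical; only the key names and the substring-match behaviour for paths come from the entry.

```javascript
// Hypothetical policy object mirroring the entropy section keys described above.
const entropyPolicy = {
  suppress_extensions: [".proto"],
  suppress_paths: ["vendored/"],
  suppress_line_patterns: ["^\\s*//\\s*fixture:"],
};

// True when the whole file should be skipped (extension suffix or path substring).
function isFileSuppressed(relPath, policy) {
  const byExtension = (policy.suppress_extensions ?? []).some((ext) => relPath.endsWith(ext));
  const byPath = (policy.suppress_paths ?? []).some((part) => relPath.includes(part));
  return byExtension || byPath;
}

// True when a single line matches any user-defined regex source string.
function isLineSuppressed(line, policy) {
  return (policy.suppress_line_patterns ?? []).some((source) => new RegExp(source).test(line));
}

console.log(isFileSuppressed("vendored/lib/a.js", entropyPolicy)); // true
console.log(isLineSuppressed("  // fixture: sample", entropyPolicy)); // true
```

Compiling the regexes on every call keeps the sketch short; a real scanner would more plausibly compile them once per scan.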
Changed
- tests/lib/output.test.mjs:243: "1 critical = score 80" under v2 (was 25 under v1).
- scanners/lib/file-discovery.mjs: TEXT_EXTENSIONS now includes .sass and GPU shader source extensions (.glsl, .frag, .vert, .shader, .wgsl) so these files are discovered and explicitly counted as skipped by the entropy scanner instead of invisibly filtered out.
- Plugin version: 6.6.0 → 7.0.0 across package.json, .claude-plugin/plugin.json, scanners/ide-extension-scanner.mjs (VERSION), README badge, CLAUDE.md header, marketplace root README.
Why
- Real-world scan on hyperframes.com produced BLOCK / Extreme / 100 with ~70% noise (shader strings, CSS gradients, bundled JS, Levenshtein false positives). A scanner that cries "extreme" on every project destroys its own credibility: users learn to ignore findings, so genuine threats slip past.
- Trustworthiness comes from calibration, not from detecting everything. v7.0.0 accepts that some detection heuristics are noisy in context (entropy on shaders, typosquat on 2–3 letter tool names) and gives users both built-in suppression and policy-driven override controls.
- Verdict/score/band co-monotonicity fixed. A user can now correctly reason: "HIGH band → WARNING verdict" without reading the source. The v1 cutoffs allowed a mid-High score (42) to produce ALLOW and a low-Medium score (22) to produce WARNING.
[6.6.0] - 2026-04-18
Added
- JetBrains/IntelliJ plugin scanning. /security ide-scan extends beyond VS Code forks to cover the JetBrains IDE family: IntelliJ IDEA, PyCharm, GoLand, WebStorm, RubyMine, PhpStorm, CLion, DataGrip, RustRover, Rider, Aqua, Writerside, Android Studio. Fleet and Toolbox are intentionally excluded (different plugin model, out of scope)
- OS-aware JetBrains plugin discovery in lib/ide-extension-discovery.mjs: macOS ~/Library/Application Support/JetBrains/<IDE><version>/plugins/, Windows %APPDATA%\JetBrains\..., Linux ~/.config/JetBrains/.... Regex excludes Fleet/Toolbox
- Zero-dep META-INF/plugin.xml + META-INF/MANIFEST.MF parsers in lib/ide-extension-parser-jb.mjs with nested-jar extraction for the common <plugin-root>/lib/*.jar → META-INF/plugin.xml layout
- 7 JetBrains-specific checks in runJetBrainsChecks: checkThemeWithCodeJB, checkBroadActivationJB (application-components), checkPremainClassJB (HIGH, javaagent retransform), checkNativeBinariesJB, checkDependsChainJB (long mandatory <depends> = supply-chain pressure), checkTyposquatJB (Levenshtein vs top JetBrains plugins), checkShadedJarsJB (advisory, many bundled jars)
- JetBrains Marketplace URL fetch. Supports https://plugins.jetbrains.com/plugin/<numericId>-<slug> (metadata resolves numericId → xmlId, then downloads) and https://plugins.jetbrains.com/plugin/download?pluginId=<xmlId>[&version=<v>] (direct download). Host allowlist: plugins.jetbrains.com only
- fetchJetBrainsPlugin in lib/vsix-fetch.mjs with the same safety envelope as VSIX fetch (50 MB cap, 30 s timeout, SHA-256, manual redirect host whitelist)
- lib/jetbrains-fetch-worker.mjs: sub-process worker mirroring the VSIX worker's JSON-line IPC. Shares the sandbox primitives through parameterized buildSandboxedWorker(dirs, workerPath)
- .kt, .groovy, .scala added to scanners/taint-tracer.mjs CODE_EXTENSIONS so Kotlin/Groovy/Scala plugin sources are covered by taint analysis
- Knowledge additions: knowledge/jetbrains-marketplace-api-notes.md, expanded knowledge/ide-extension-threat-patterns.md with JetBrains sections, seeded knowledge/top-jetbrains-plugins.json (no longer a stub) with loadJetBrainsBlocklist helper
- 8 new test files / suites covering JetBrains data, parsers, discovery, checks, URL fetch (unit + integration), end-to-end scan against a real JetBrains-layout fixture tree, plus a deterministic fixture-jar builder (tests/helpers/build-jetbrains-fixtures.mjs) that produces byte-identical reproducible jars. Total: 1461 tests (was 1352)
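The two supported JetBrains Marketplace URL shapes and the single-host allowlist could be recognised roughly as follows. This is a sketch only: classifyJetBrainsUrl and its return shape are hypothetical, and the HTTPS requirement is assumed from the URLs shown in the entry rather than confirmed for this code path.

```javascript
// Hypothetical classifier for the two URL forms listed above.
// Returns null for anything outside the plugins.jetbrains.com allowlist.
function classifyJetBrainsUrl(input) {
  let url;
  try {
    url = new URL(input);
  } catch {
    return null; // not parseable as a URL
  }
  if (url.protocol !== "https:" || url.hostname !== "plugins.jetbrains.com") return null;

  // Direct download form: /plugin/download?pluginId=<xmlId>[&version=<v>]
  if (url.pathname === "/plugin/download") {
    const xmlId = url.searchParams.get("pluginId");
    if (!xmlId) return null;
    return { kind: "download", xmlId, version: url.searchParams.get("version") };
  }

  // Marketplace page form: /plugin/<numericId>-<slug>
  const match = /^\/plugin\/(\d+)-([^/]+)\/?$/.exec(url.pathname);
  if (match) return { kind: "marketplace", numericId: Number(match[1]), slug: match[2] };
  return null;
}

console.log(classifyJetBrainsUrl("https://plugins.jetbrains.com/plugin/12345-example-plugin"));
```

The marketplace form still needs the metadata lookup (numericId → xmlId) before a download can start; that network step is deliberately not part of the sketch.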
Changed
- buildSandboxedWorker(dirs) → buildSandboxedWorker(dirs, workerPath): parameterized so the same sandbox wrapper is reused for VSIX and JetBrains workers instead of copying the primitives a third time
- /security ide-scan command description updated to reflect the JetBrains branch; "JetBrains is a v1.1 stub" wording removed
- CLAUDE.md and plugin README updated: scanner bullet rewritten to document the JetBrains branch, the seven JB-specific checks, and the new knowledge files
- Plugin version: 6.5.0 → 6.6.0 across package.json, .claude-plugin/plugin.json, scanners/ide-extension-scanner.mjs (VERSION), README badge, CLAUDE.md header, marketplace root README
- tests/scanners/git.test.mjs: loosened findings.length caps (were too tight for organic repo growth; baseline already exceeded them)
Why
- Parity with the VS Code branch: organizations running IntelliJ-family IDEs get the same pre-install and installed-plugin coverage, since Koi-style supply-chain attacks now target both platforms
- Reuse of lib/vsix-sandbox.mjs honors the user-memory rule "don't copy a third sandbox": one set of primitives, two workers, same kernel-enforced FS confinement
- JetBrains-specific checks target the platform's real attack surface: Premain-Class javaagents (class retransform at JVM startup), application-components (global lifecycle hooks), nested-jar shading (dependency opacity), and typosquat on com.intellij.* / org.jetbrains.* namespaces
[6.5.0] - 2026-04-17
Added
- OS sandbox for /security ide-scan <url>. VSIX fetch + extract now runs in a sub-process wrapped by sandbox-exec (macOS) or bwrap (Linux), reusing the same primitives proven by the git clone sandbox introduced in v5.1. Defense-in-depth: even if zip-extract.mjs has an undiscovered bypass, the kernel refuses any write outside the per-scan temp directory
- scanners/lib/vsix-fetch-worker.mjs: Sub-process worker. Argv: --url <url> --tmpdir <writable-dir>. Emits a single JSON line on stdout ({ok, sha256, size, finalUrl, source, extRoot} or {ok:false, error, code?}). Exit 0 on success, 1 on failure. Silent on stderr
- scanners/lib/vsix-sandbox.mjs: Wrapper. Exports buildSandboxProfile, buildBwrapArgs, buildSandboxedWorker(tmpDir, args), runVsixWorker(url, tmpDir, opts). 35 s timeout, 1 MB stdout cap, deterministic JSON-line protocol
- scan(url, { useSandbox }) option. Default true for CLI invocations; tests pass false to keep globalThis.fetch mocking working (mocks do not cross process boundaries). When sandbox unavailable on the platform (e.g., Windows), a warning is added to meta.warnings and the scan still completes via the in-process fallback
- meta.source.sandbox: New envelope field: 'sandbox-exec' | 'bwrap' | 'none' | 'in-process'. Tells the report which protection layer was actually active
- 8 new tests in tests/scanners/vsix-sandbox.test.mjs covering profile generation per platform, worker arg construction, and live worker exit behavior on invalid URLs (no network required)
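The single-JSON-line contract between the worker and its wrapper can be illustrated with a small parser for the worker's stdout. parseWorkerOutput is a hypothetical name and the error strings are invented for the sketch; the ok field and the one-line rule are taken from the entry above.

```javascript
// Hypothetical parser for the worker's stdout: exactly one JSON line with a boolean "ok".
// Anything else is normalised into an {ok: false, error} result instead of throwing.
function parseWorkerOutput(stdout) {
  const lines = stdout.split("\n").filter((line) => line.trim() !== "");
  if (lines.length !== 1) {
    return { ok: false, error: `expected exactly one JSON line, got ${lines.length}` };
  }
  let message;
  try {
    message = JSON.parse(lines[0]);
  } catch {
    return { ok: false, error: "worker output is not valid JSON" };
  }
  if (message === null || typeof message !== "object" || typeof message.ok !== "boolean") {
    return { ok: false, error: "worker output lacks a boolean ok field" };
  }
  return message;
}

console.log(parseWorkerOutput('{"ok":true,"sha256":"abc","size":3}\n'));
```

Treating malformed output as a failure result keeps the caller's control flow the same whether the worker crashed, was killed by the sandbox, or reported an error itself.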
Changed
- fetchAndExtractVsixUrl in ide-extension-scanner.mjs is now sandbox-aware (useSandbox option, default true). Existing in-process logic preserved as fallback path
- Version bump: 6.4.0 → 6.5.0 across all files
Why
- Aligns the IDE-scan URL pipeline with the same defense-in-depth posture as the GitHub clone pipeline — kernel-enforced FS confinement instead of in-process validation alone
- VSIX is untrusted bytes from a third-party registry; even with hardened parsing, an OS sandbox is the right blast-radius constraint for filesystem writes
[6.4.0] - 2026-04-17
Added
- /security ide-scan <url>: pre-install verification. The IDE extension scanner now accepts URLs as targets and fetches the VSIX before scanning. Supported sources:
  - VS Code Marketplace: https://marketplace.visualstudio.com/items?itemName=publisher.name
  - OpenVSX: https://open-vsx.org/extension/publisher/name[/version]
  - Direct VSIX download: https://example.com/path/foo.vsix (HTTPS only)
- scanners/lib/vsix-fetch.mjs: HTTPS-only fetcher with 50 MB compressed cap, 30 s total timeout, SHA-256 streamed during download, manual redirect handling with per-source host whitelist (Marketplace gallerycdn, OpenVSX blob storage). No npm dependencies; uses Node 18+ fetch
- scanners/lib/zip-extract.mjs: Zero-dependency ZIP parser + safe extractor. Rejects: zip-slip via .. paths, POSIX absolute paths, Windows drive letters, NUL bytes, encrypted entries, ZIP64, multi-disk archives, unsupported compression methods, symlink entries (Unix 0xA000 mode bits in external_attr). Caps: 10 000 entries, 500 MB uncompressed total, 100× expansion ratio (sum-uncomp / sum-comp), depth 20. STORE + DEFLATE only
- Envelope meta.source: When invoked with a URL, the scan envelope's meta.source field carries { type: "url", kind, url, finalUrl, sha256, size, publisher, name, version, requestedUrl } so reports can attribute findings to the upstream artifact
- knowledge/marketplace-api-notes.md: Reference notes for the (undocumented but stable) Marketplace direct-download endpoint and the (officially documented) OpenVSX endpoints used by vsix-fetch.mjs
- 48 new tests across tests/scanners/zip-extract.test.mjs (validateEntryName / isSymlink / extractToDir happy + adversarial), tests/scanners/vsix-fetch.test.mjs (detectUrlType / isAllowedHost / readBodyCapped), tests/scanners/ide-extension-url.test.mjs (URL flow integration with global.fetch mock: Marketplace, OpenVSX, direct VSIX, malformed VSIX, zip-slip VSIX, network failure, unsupported URL, GitHub URL). 1344 tests total (was 1296). Test helper: tests/lib/build-zip.mjs builds adversarial ZIPs that real zip tools refuse to emit
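Several of the name-level rejections and the expansion-ratio cap described above can be sketched as pure functions. The names checkEntryName and exceedsExpansionRatio are hypothetical stand-ins (not the plugin's validateEntryName), and interpreting "depth 20" as a limit on path segments is an assumption.

```javascript
// Returns null when an entry name is acceptable, otherwise a short rejection reason.
function checkEntryName(name) {
  if (name.length === 0) return "empty name";
  if (name.includes("\0")) return "NUL byte";
  if (name.startsWith("/") || name.startsWith("\\")) return "absolute path";
  if (/^[A-Za-z]:/.test(name)) return "Windows drive letter";
  const segments = name.split(/[\\/]/);
  if (segments.includes("..")) return "zip-slip (..) segment";
  if (segments.length > 20) return "path too deep"; // assumed reading of "depth 20"
  return null;
}

// Expansion ratio as described: sum of uncompressed sizes over sum of compressed sizes.
function exceedsExpansionRatio(entries, maxRatio = 100) {
  const compressed = entries.reduce((sum, e) => sum + e.compressedSize, 0);
  const uncompressed = entries.reduce((sum, e) => sum + e.uncompressedSize, 0);
  // With zero compressed bytes, any uncompressed output counts as unbounded expansion.
  return compressed > 0 ? uncompressed / compressed > maxRatio : uncompressed > 0;
}

console.log(checkEntryName("../../etc/passwd")); // "zip-slip (..) segment"
```

Checks that need the archive's binary headers (encryption flags, ZIP64 markers, symlink mode bits) are outside what a name-only sketch can show.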
Changed
- scanners/ide-extension-scanner.mjs early-detects URL targets and routes through fetch + extract → temp dir → existing single-target scan path. Temp directory cleaned in try/finally regardless of success/error/abort
- CLI help text in bin/llm-security.mjs and commands/ide-scan.md updated with URL examples and security model
- Version bump: 6.3.0 → 6.4.0 across all files
Not supported (intentional)
- GitHub repo URLs: would require npm install + vsce package build step. Use the Marketplace, OpenVSX, or a direct .vsix URL instead
- VSIX .signature.p7s verification: deferred to v6.5.0 (requires X.509 / PKCS#7 parsing)
- ZIP64 archives: real-world VSIX never approaches the 4 GB threshold
[6.3.0] - 2026-04-17
Added
- IDE extension prescan: New /security ide-scan command and scanners/ide-extension-scanner.mjs (prefix IDE) discover and audit installed VS Code extensions across 6 roots (~/.vscode/extensions, ~/.vscode-insiders/extensions, ~/.cursor/extensions, ~/.windsurf/extensions, ~/.vscode-oss/extensions, ~/.vscode-server/extensions, plus Linux code-server). OS-aware discovery via scanners/lib/ide-extension-discovery.mjs. Manifest parsing via scanners/lib/ide-extension-parser.mjs. Data loading via scanners/lib/ide-extension-data.mjs. JetBrains discovery is a v1.1 stub.
- 7 IDE-specific detection categories: Blocklist match (CRITICAL), theme-with-code (HIGH, Material Theme pattern), sideload .vsix (HIGH unsigned / MEDIUM signed), broad activation * / onStartupFinished (MEDIUM/LOW, suppressed for top-100 exact matches), Levenshtein typosquat ≤2 vs top-100 (HIGH distance-1 / MEDIUM distance-2 against top-50), extension-pack expansion ≥3 (MEDIUM), dangerous vscode:uninstall hooks referencing child_process/curl/wget/rm/powershell (HIGH/LOW)
- Per-extension scanner orchestration: Each discovered extension runs through UNI, ENT, NET, TNT, MEM, SCR scanners with bounded concurrency (default 4). MEM gets a filtered file list (README.md / CHANGELOG.md / package.json) to catch prompt-injection in marketplace-rendered text
- New knowledge files: knowledge/ide-extension-threat-patterns.md (10 categories with 2024-2026 case studies from Koi Security: GlassWorm, WhiteCobra, TigerJack, Material Theme, VS Code Cryptojacking, MaliciousCorgi), knowledge/top-vscode-extensions.json (top ~100 Marketplace IDs + blocklist), knowledge/top-jetbrains-plugins.json (stub)
- CLI integration: bin/llm-security.mjs gains ide-scan subcommand with passthrough flags
- 22 new tests in tests/scanners/ide-extension-scanner.test.mjs (fixtures under tests/fixtures/ide-extensions/). 1296 tests total (was 1274)
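The Levenshtein typosquat category above maps edit distance to severity. A minimal sketch, assuming a tiny hypothetical excerpt of the top-extensions list and ignoring the top-100 versus top-50 distinction; the helper names are not the plugin's.

```javascript
// Classic dynamic-programming Levenshtein distance, keeping only two rows.
function levenshtein(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical excerpt; real IDs would come from knowledge/top-vscode-extensions.json.
const topExtensions = ["ms-python.python", "esbenp.prettier-vscode"];

// Distance 1 → HIGH, distance 2 → MEDIUM, exact match or anything farther → no finding.
function typosquatSeverity(id, top = topExtensions) {
  if (top.includes(id)) return null; // the genuine extension
  const best = Math.min(...top.map((known) => levenshtein(id, known)));
  if (best === 1) return "HIGH";
  if (best === 2) return "MEDIUM";
  return null;
}

console.log(typosquatSeverity("ms-pythom.python")); // "HIGH"
```

Short names sit within distance 2 of many unrelated names, which is why an exact-match exemption (and, in v7.0.0, an allowlist) matters for this heuristic.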
Changed
- Version bump: 6.2.0 → 6.3.0 across all files
[6.2.0] - 2026-04-17
Added
- Bash-normalize T5 + T6: scanners/lib/bash-normalize.mjs now collapses ${IFS} word-splitting (T5) and ANSI-C hex quoting $'\xHH' (T6) before the denylist gate runs. Defense-in-depth layer complementing the Claude Code 2.1.98+ harness fixes. 4 new unit tests in tests/scanners/bash-normalize.test.mjs
- PreCompact hook: hooks/scripts/pre-compact-scan.mjs scans the transcript tail (default 500 KB) for injection patterns before Claude Code compacts context. Prevents poisoned summaries from surviving into the next turn. Modes: block/warn/off via LLM_SECURITY_PRECOMPACT_MODE. 6 new tests in tests/hooks/pre-compact-scan.test.mjs. Brings total hooks to 9
- Security hardening guide: docs/security-hardening-guide.md documents environment variables (CLAUDE_CODE_EFFORT_LEVEL, ENABLE_PROMPT_CACHING_1H, CLAUDE_CODE_SCRIPT_CAPS, all LLM_SECURITY_* modes), sandboxing (sandbox-exec / bwrap / fallback), T1-T6 normalization table, Opus 4.7 system card §5.2.1 + §6.3.1.1 alignment, baseline production recommendations
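The T5 and T6 transformations can be illustrated with two regex passes. This is a sketch of the idea rather than the module's implementation: normalizeT5T6 is a hypothetical name, only \xHH escapes inside $'...' are decoded, and each byte is treated as a single character.

```javascript
// Hypothetical normaliser covering only the two techniques named above.
function normalizeT5T6(command) {
  // T6: decode \xHH escapes inside $'...' ANSI-C quoted strings.
  // Quoted strings containing other escapes are left untouched by this sketch.
  const decoded = command.replace(/\$'((?:\\x[0-9A-Fa-f]{2}|[^'\\])*)'/g, (_whole, body) =>
    body.replace(/\\x([0-9A-Fa-f]{2})/g, (_m, hex) => String.fromCharCode(parseInt(hex, 16)))
  );
  // T5: ${IFS} and $IFS expand to whitespace in a default shell, so treat them as a space.
  return decoded.replace(/\$\{IFS\}|\$IFS\b/g, " ");
}

console.log(normalizeT5T6("rm${IFS}-rf${IFS}/")); // "rm -rf /"
console.log(normalizeT5T6("$'\\x72\\x6d' -rf /")); // "rm -rf /"
```

Running the normaliser before the denylist gate means a single pattern for the plain command also covers its obfuscated spellings.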
Changed
- Agent refactor for Opus 4.7 literal instruction following: agents/skill-scanner-agent.md and agents/mcp-scanner-agent.md reframe stacked CANNOT/MUST NOT imperatives in favor of tool-level enforcement via tools: frontmatter. New Step 0 "Generaliseringsgrense" (generalization boundary) blocks (cite evidence path:line, mark speculation as speculation) and "Parallell Read-strategi" (parallel Read strategy) notes (prefer parallel Read calls for independent file reads)
- Defense Philosophy linked to Opus 4.7 system card: CLAUDE.md §Defense Philosophy now cites Opus 4.7 system card §5.2.1 (multi-layer defenses) and §6.3.1.1 (instruction hierarchy → tool-level enforcement)
- Version bump: 6.1.0 → 6.2.0 across all files
[6.1.0] - 2026-04-10
Added
- --fail-on <severity> flag: CI-friendly exit codes: exit 1 when any finding at or above the specified severity exists (critical/high/medium/low). Configurable via policy.json ci.failOn
- --compact output mode: One-liner per finding format ([SEVERITY] scanner: title (file:line)), reduces CI log noise. Configurable via policy.json ci.compact
- CI/CD pipeline templates: Ready-to-use templates in ci/: GitHub Actions (github-action.yml), Azure DevOps (azure-pipelines.yml), GitLab CI (gitlab-ci.yml) with SARIF upload, Node 18 setup
- CI/CD integration guide: docs/ci-cd-guide.md with 5-minute setup per platform, Schrems II/NSM compliance documentation, exit code reference
- npm publish preparation: files whitelist in package.json (only bin/ + scanners/), .npmignore safety net, homepage field
- Policy ci section: New ci: { failOn, compact } section in .llm-security/policy.json for distributable CI configuration
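The exit-code rule behind --fail-on and the one-line --compact format might look like the following. The finding object shape (severity, scanner, title, file, line) and both function names are assumptions made for the sketch.

```javascript
const SEVERITY_ORDER = ["low", "medium", "high", "critical"];

// Exit code 1 when any finding is at or above the threshold, otherwise 0.
function exitCodeFor(findings, failOn) {
  const threshold = SEVERITY_ORDER.indexOf(failOn);
  if (threshold === -1) return 0; // no (or unknown) threshold: never fail the build
  return findings.some((f) => SEVERITY_ORDER.indexOf(f.severity) >= threshold) ? 1 : 0;
}

// One line per finding: [SEVERITY] scanner: title (file:line)
function compactLine(f) {
  return `[${f.severity.toUpperCase()}] ${f.scanner}: ${f.title} (${f.file}:${f.line})`;
}

// Hypothetical finding used only for illustration.
const sample = { severity: "high", scanner: "ENT", title: "High-entropy string", file: "src/a.js", line: 12 };
console.log(compactLine(sample)); // [HIGH] ENT: High-entropy string (src/a.js:12)
console.log(exitCodeFor([sample], "medium")); // 1
```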
Changed
- Version bump: 6.0.0 → 6.1.0 across all files
[6.0.0] - 2026-04-10
Added
- Compliance mapping: knowledge/compliance-mapping.md maps plugin capabilities to EU AI Act (Art. 9, 15, 17), NIST AI RMF (Map, Measure, Manage, Govern), ISO 42001 (Annex A), and MITRE ATLAS techniques (AML.T IDs)
- Norwegian regulatory context: knowledge/norwegian-context.md covers Datatilsynet (DPIA for AI), NSM (basic security principles), and Digitaliseringsdirektoratet guidance
- SARIF 2.1.0 output: scanners/lib/sarif-formatter.mjs converts scan output to OASIS SARIF standard format. Use --format sarif with scan/deep-scan commands
- Structured audit trail: scanners/lib/audit-trail.mjs writes JSONL audit events with ISO 8601 timestamps, OWASP category tags, and SIEM-ready schema. Configurable via LLM_SECURITY_AUDIT_* env vars
- AI-BOM generator: scanners/ai-bom-generator.mjs + scanners/lib/bom-builder.mjs produce CycloneDX 1.6 Bills of Materials for AI components (models, MCP servers, plugins, knowledge, hooks)
- Policy-as-code: scanners/lib/policy-loader.mjs reads .llm-security/policy.json for distributable hook configuration. Integrated into all 8 hooks. Env vars always take precedence
- Standalone CLI: bin/llm-security.mjs provides npx llm-security entry point. Subcommands: scan, deep-scan, posture, audit-bom, benchmark
- Posture compliance categories: 3 new posture categories (14: EU AI Act, 15: NIST AI RMF, 16: ISO 42001). Advisory only; they do not affect the Grade A threshold
- Attack simulator benchmark mode: --benchmark flag outputs structured pass/fail metrics for CI integration
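The precedence rule stated for policy-as-code (environment variables over policy.json, policy.json over built-in defaults) reduces to a small resolver. The helper name, the policy key layout and the treatment of an empty variable as unset are assumptions; LLM_SECURITY_TRIFECTA_MODE is used only as an example variable name.

```javascript
// Hypothetical resolver: env value wins, then the policy value, then the default.
function resolveSetting(envValue, policyValue, defaultValue) {
  if (envValue !== undefined && envValue !== "") return envValue; // empty string treated as unset
  if (policyValue !== undefined) return policyValue;
  return defaultValue;
}

// Illustrative inputs; the policy key layout is invented for this sketch.
const env = { LLM_SECURITY_TRIFECTA_MODE: "block" };
const policy = { trifecta: { mode: "off" } };

console.log(resolveSetting(env.LLM_SECURITY_TRIFECTA_MODE, policy.trifecta.mode, "warn")); // "block"
console.log(resolveSetting(undefined, policy.trifecta.mode, "warn")); // "off"
console.log(resolveSetting(undefined, undefined, "warn")); // "warn"
```

Keeping the default as the last fallback is what makes a missing policy file backward-compatible with the previously hardcoded values.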
Changed
- Version bump: 5.1.0 → 6.0.0 across all files
- Knowledge base expanded from 13 to 15 files
- Scanner count: 15 → 16 (AI-BOM generator added)
- Posture scanner: 13 → 16 categories
- All hooks now read policy from .llm-security/policy.json (backward-compatible: defaults match hardcoded values)
[5.1.0] - 2026-04-07
Added
- Sandboxed remote cloning: git clone for remote scans is now hardened with two defense layers:
  - Git config flags: core.hooksPath=/dev/null, core.symlinks=false, core.fsmonitor=false, all LFS filter drivers disabled, protocol.file.allow=never, transfer.fsckObjects=true. Environment: GIT_CONFIG_NOSYSTEM=1, GIT_CONFIG_GLOBAL=/dev/null, GIT_ATTR_NOSYSTEM=1, GIT_TERMINAL_PROMPT=0
  - OS-level filesystem sandbox: macOS sandbox-exec and Linux bubblewrap (bwrap) restrict file writes to only the specific temp directory. Even if .gitattributes filter drivers bypass git config, they cannot write outside the clone dir. bwrap probe-tests availability before use (graceful fallback on Ubuntu 24.04+ where AppArmor blocks it). Graceful fallback on Windows (git config flags only, WARN logged)
- Post-clone size check: Repos exceeding 100MB after clone are rejected and cleaned up
- UUID-unique evidence filenames: fs-utils.mjs tmppath now generates unique filenames with crypto.randomUUID() suffix, preventing race conditions between concurrent scans
- Evidence file cleanup: scan.md and plugin-audit.md now clean up evidence files (content-extract, plugin-extract) after scanning
- Cleanup guarantee: Both scan.md and plugin-audit.md carry an explicit cleanup guarantee: the temp dir and evidence file are removed even if the scan fails or errors
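The git config layer described above amounts to a fixed set of -c overrides plus a restricted environment. The sketch below only assembles the argument list and environment object; it does not spawn git. buildCloneCommand is a hypothetical helper, and the LFS filter overrides are omitted because their exact keys are not listed in the entry.

```javascript
// Hypothetical builder: returns the argv and env for a hardened clone, without running it.
function buildCloneCommand(repoUrl, targetDir) {
  const config = {
    "core.hooksPath": "/dev/null",
    "core.symlinks": "false",
    "core.fsmonitor": "false",
    "protocol.file.allow": "never",
    "transfer.fsckObjects": "true",
    // LFS filter driver overrides intentionally left out: keys not specified in the changelog.
  };
  const configArgs = Object.entries(config).flatMap(([key, value]) => ["-c", `${key}=${value}`]);
  return {
    command: "git",
    // "--" stops option parsing so a URL starting with "-" cannot be read as a flag.
    args: [...configArgs, "clone", "--", repoUrl, targetDir],
    env: {
      GIT_CONFIG_NOSYSTEM: "1",
      GIT_CONFIG_GLOBAL: "/dev/null",
      GIT_ATTR_NOSYSTEM: "1",
      GIT_TERMINAL_PROMPT: "0",
    },
  };
}

console.log(buildCloneCommand("https://example.com/repo.git", "/tmp/scan-clone").args.join(" "));
```

The "--" separator is an addition of this sketch, not something the entry claims; the OS sandbox layer would wrap the resulting command rather than replace any of these flags.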
Changed
- scanners/lib/git-clone.mjs: complete rewrite of clone command with sandbox wrapping
- scanners/lib/fs-utils.mjs: tmppath uses crypto.randomUUID() for unique names
[5.0.0] - 2026-04-06
Added
- Prompt Injection Hardening (v5.0) — 8-session defense-in-depth overhaul driven by 7 research papers (2025-2026). Defense philosophy: broader detection + increased attack cost + longer monitoring windows + architectural constraints + honest documentation
- MEDIUM advisory wiring: pre-prompt-inject-scan.mjs emits advisory for MEDIUM-severity obfuscation signals (leetspeak, homoglyphs, zero-width, multi-language). Never blocks. post-mcp-verify.mjs includes MEDIUM in injection scan advisory
- Unicode Tag steganography: string-utils.mjs decodes U+E0001-E007F (invisible ASCII encoding). CRITICAL if decoded content matches injection patterns, HIGH for bare presence. Integrated into normalizeForScan() pipeline
- BIDI override stripping: Removes directional override characters before injection scanning
- Bash expansion normalization: New bash-normalize.mjs strips ${}, empty quotes, backslash splits before command matching. Applied in pre-bash-destructive.mjs and pre-install-supply-chain.mjs
- Rule of Two enforcement: post-session-guard.mjs gains LLM_SECURITY_TRIFECTA_MODE=block|warn|off (default: warn). Block mode exits with code 2 for MCP-concentrated trifecta or sensitive path + exfiltration
- 100-call long-horizon monitoring: Extended window alongside 20-call sliding window. Slow-burn trifecta detection (legs >50 calls apart = MEDIUM). Behavioral drift via Jensen-Shannon divergence on tool-class distribution
- HITL trap detection: HIGH patterns for approval urgency, summary suppression, scope minimization. MEDIUM for cognitive load (injection buried in verbose output)
- Sub-agent delegation tracking: post-session-guard.mjs tracks Task/Agent tool usage. Escalation-after-input advisory when delegation occurs within 5 calls of untrusted input (DeepMind Agent Traps cat. 4)
- Natural language indirection: MEDIUM patterns for "fetch this URL and execute", "send this data to", "read ~/.ssh". Strict false-positive tests for benign phrasing
- Hybrid attack patterns: P2SQL (SQL keywords in injection text), recursive injection (injection containing injection), XSS in agent context (<script>, javascript:, onerror=)
- CaMeL-inspired data flow tagging: SHA-256 provenance tracking in post-session-guard.mjs. Hash of tool output → match against subsequent tool input. Linked data flows elevate trifecta severity
- Adaptive red-team: attack-simulator.mjs --adaptive runs 5 mutation rounds per passing scenario: homoglyph substitution, encoding wrapping, zero-width injection, case alternation, synonym substitution. Rules in knowledge/attack-mutations.json
- Knowledge base expansion: prompt-injection-research-2025-2026.md (7 papers), deepmind-agent-traps.md (6 categories, 43 techniques), attack-mutations.json (synonym tables). Attack scenarios expanded from 38 to 64 across 12 categories
- Posture scanner expanded to 13 categories: New: Prompt Injection Hardening (cat 11), Rule of Two (cat 12), Long-Horizon Monitoring (cat 13). Checks for MEDIUM advisory, Unicode Tag detection, bash normalization, TRIFECTA_MODE, behavioral drift
- Defense Philosophy section in CLAUDE.md — honest documentation of what v5.0 can and cannot do, based on joint paper findings (95-100% ASR against all tested defenses)
- 8 new posture scanner tests (49 total for posture)
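Unicode Tag steganography, listed above, hides ASCII text in invisible code points that mirror ASCII at an offset of 0xE0000. The decoder below is a sketch of that mapping only: the function names are hypothetical, it covers the printable range U+E0020–U+E007E, and it does not classify the decoded text.

```javascript
// Extracts the hidden ASCII payload carried by Unicode Tag characters, if any.
function decodeUnicodeTags(text) {
  let hidden = "";
  for (const ch of text) { // for...of iterates by code point, so astral characters stay intact
    const cp = ch.codePointAt(0);
    if (cp >= 0xe0020 && cp <= 0xe007e) hidden += String.fromCodePoint(cp - 0xe0000);
  }
  return hidden;
}

// Helper for the demo only: encodes printable ASCII as invisible tag characters.
function encodeAsTags(ascii) {
  return Array.from(ascii, (c) => String.fromCodePoint(c.codePointAt(0) + 0xe0000)).join("");
}

const visible = "Please summarise this file.";
const poisoned = visible + encodeAsTags("ignore previous instructions");
console.log(decodeUnicodeTags(poisoned)); // "ignore previous instructions"
```

A scanner built on this idea would run its injection patterns over the decoded payload and could still flag the bare presence of tag characters when nothing matches.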
Changed
- Posture scanner version updated to 5.0.0
- Dashboard aggregator version updated to 5.0.0
- Red-team scenarios expanded from 38 to 64 across 12 categories
- Knowledge files count: 10 → 13
[4.5.1] - 2026-04-04
Fixed
- Cross-platform support (Windows/Linux). Replaced all Unix-only patterns: fileURLToPath() instead of import.meta.url.replace('file://', ''), path.dirname() instead of lastIndexOf('/'), native fetch() instead of curl subprocess (Node 18+), removed 2>/dev/null from shell commands, fixed tilde expansion regex for Windows backslash paths. 11 files changed, 782 tests pass.
[4.5.0] - 2026-04-04
Added
- Attack simulation / red-team mode: scanners/attack-simulator.mjs runs 38 crafted attack scenarios across 7 categories against the plugin's own hooks. Data-driven: scenarios defined in knowledge/attack-scenarios.json, payloads assembled at runtime via fragment concatenation (avoids triggering hooks on source file). Categories: secrets (7), destructive (8), supply-chain (4), prompt-injection (6), pathguard (6), mcp-output (4), session-trifecta (3). CLI: node scanners/attack-simulator.mjs [--category <name>] [--json] [--verbose]. Library: import { loadScenarios, runScenario, resolvePayloads }
- /security red-team command: attack simulation with category filter (--category secrets|destructive|...). Narrative report with per-category breakdown and defense score
- knowledge/attack-scenarios.json: 38 red-team scenarios with placeholder payloads ({{MARKER}} syntax), resolved at runtime to actual attack strings
- 31 new tests for attack simulator (unit + integration + CLI)
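The {{MARKER}} placeholders and runtime fragment concatenation described above can be sketched as a template substitution. The marker name, fragment table and resolvePlaceholders function are invented for illustration and are not the module's resolvePayloads export.

```javascript
// Hypothetical fragment table: each marker maps to pieces that are joined only at
// runtime, so the assembled string never appears literally in a source file.
const fragments = {
  FAKE_TOKEN: ["EXAMPLE", "_", "TOKEN", "_", "123"],
};

// Replaces every {{MARKER}} with its joined fragments; unknown markers are left intact.
function resolvePlaceholders(template, table) {
  return template.replace(/\{\{([A-Z0-9_]+)\}\}/g, (whole, marker) =>
    Object.prototype.hasOwnProperty.call(table, marker) ? table[marker].join("") : whole
  );
}

console.log(resolvePlaceholders("export KEY={{FAKE_TOKEN}}", fragments)); // export KEY=EXAMPLE_TOKEN_123
```

Leaving unknown markers untouched makes a missing table entry visible in the resolved scenario instead of silently producing an empty payload.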
[4.4.0] - 2026-04-03
Added
- Cross-project security dashboard: scanners/dashboard-aggregator.mjs discovers all Claude Code projects under ~/ (depth 3) and ~/.claude/plugins/, runs posture-scanner on each, aggregates results. Machine grade = weakest link across all projects. Cache in ~/.cache/llm-security/dashboard-latest.json (24h staleness). CLI: node scanners/dashboard-aggregator.mjs [--no-cache] [--max-depth N]. Library: import { aggregate, discoverProjects }
- /security dashboard command: machine-wide security overview with per-project grade table, sorted by grade (worst first). Shows cache status, total findings, and recommendations based on machine grade
- 16 new tests for dashboard aggregator (discovery, aggregation, caching, grade logic)
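The "weakest link" rule and the worst-first ordering of the grade table could be expressed as below. The project object shape and the function names are assumptions; the A to F scale is the one used by the posture scanner.

```javascript
const GRADE_ORDER = ["A", "B", "C", "D", "F"]; // best to worst

// Machine grade is the worst grade found across all projects (null when there are none).
function machineGrade(projects) {
  if (projects.length === 0) return null;
  return projects
    .map((p) => p.grade)
    .reduce((worst, g) => (GRADE_ORDER.indexOf(g) > GRADE_ORDER.indexOf(worst) ? g : worst));
}

// Returns a new array sorted worst grade first, leaving the input untouched.
function sortWorstFirst(projects) {
  return [...projects].sort((a, b) => GRADE_ORDER.indexOf(b.grade) - GRADE_ORDER.indexOf(a.grade));
}

// Hypothetical project list for illustration.
const projects = [
  { name: "api", grade: "A" },
  { name: "legacy", grade: "D" },
  { name: "docs", grade: "B" },
];
console.log(machineGrade(projects)); // "D"
console.log(sortWorstFirst(projects).map((p) => p.name)); // [ 'legacy', 'docs', 'api' ]
```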
[4.3.0] - 2026-04-03
Added
- MCP description drift detection: scanners/lib/mcp-description-cache.mjs caches MCP tool descriptions in ~/.cache/llm-security/mcp-descriptions.json with 7-day TTL. Compares via Levenshtein distance; >10% change triggers advisory (OWASP MCP05 rug-pull). extractMcpServer() exported for server attribution
- MCP-concentrated trifecta: post-session-guard.mjs now detects when all 3 lethal trifecta legs (input + access + exfil) originate from the same MCP server, elevating severity. Single compromised server pattern
- Cumulative data volume tracking: post-session-guard.mjs tracks total output bytes per session, warns at 100KB (LOW), 500KB (MEDIUM), 1MB (HIGH) thresholds (OWASP ASI02)
- Per-MCP-tool volume tracking: post-mcp-verify.mjs tracks cumulative output per MCP tool, warns when a single tool exceeds 100KB (OWASP ASI02, MCP03)
- MCP drift integration in post-mcp-verify: checks MCP tool descriptions on every invocation against cached baseline, advisory on significant drift
- 35 new tests: 16 for mcp-description-cache, 5 for post-mcp-verify drift/volume, 14 for post-session-guard MCP features
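The cumulative volume thresholds above can be sketched as a running byte counter plus a severity lookup. Whether the real thresholds use 1000-byte or 1024-byte kilobytes is not stated, so binary units are assumed here, and the state shape and function names are hypothetical.

```javascript
const KB = 1024; // assumption: binary kilobytes

// Maps a session's cumulative output size to the advisory severity, or null below 100KB.
function volumeSeverity(totalBytes) {
  if (totalBytes >= 1024 * KB) return "HIGH";
  if (totalBytes >= 500 * KB) return "MEDIUM";
  if (totalBytes >= 100 * KB) return "LOW";
  return null;
}

// Adds one tool output to the running total, counting UTF-8 bytes rather than characters.
function addOutput(state, outputText) {
  state.totalBytes += Buffer.byteLength(outputText, "utf8");
  return volumeSeverity(state.totalBytes);
}

const session = { totalBytes: 0 };
console.log(addOutput(session, "x".repeat(100 * KB))); // "LOW"
console.log(addOutput(session, "x".repeat(400 * KB))); // "MEDIUM"
```

Counting bytes instead of string length matters for non-ASCII output, where one character can occupy several bytes.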
[4.2.0] - 2026-04-03
Added
- Supply chain re-check scanner: scanners/supply-chain-recheck.mjs (prefix SCR) periodically re-audits installed dependencies by parsing lockfiles (package-lock.json, yarn.lock, requirements.txt, Pipfile.lock). Checks against curated blocklists, OSV.dev batch API (/v1/querybatch) for known CVEs, and Levenshtein-based typosquat detection against top-packages knowledge base. Offline fallback: blocklist + typosquat checks run without network, INFO finding notes skipped CVE check. OWASP: LLM03, ASI04, AST06, MCP04
- Shared supply chain data module: scanners/lib/supply-chain-data.mjs extracts blocklists (NPM/PIP/Cargo/Gem), helper functions, and OSV.dev API calls shared between the hook (pre-install-supply-chain.mjs) and the new scanner
- /security supply-check command: standalone dependency re-audit with focused output. CLI wrapper: node scanners/supply-chain-recheck-cli.mjs <path>
- SCR prefix added to all 4 OWASP maps (LLM, ASI, AST, MCP) in severity.mjs
- Supply chain scanner integrated into scan-orchestrator (10th scanner, runs before toxic-flow)
- Test fixtures: tests/fixtures/supply-chain/ with compromised and clean lockfiles for npm, pip, yarn, Pipfile
- 30 new tests for supply-chain-recheck scanner and shared module
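For the OSV.dev batch lookup mentioned above, the request body is a list of per-package queries. The sketch below only builds that body from already-parsed lockfile entries; buildOsvBatch and the dependency object shape are hypothetical, and no network call is made.

```javascript
// Builds a body for OSV's /v1/querybatch endpoint: one query per (name, ecosystem, version).
// The input shape is an assumption; ecosystem strings follow OSV naming ("npm", "PyPI").
function buildOsvBatch(dependencies) {
  return {
    queries: dependencies.map(({ name, version, ecosystem }) => ({
      package: { name, ecosystem },
      version,
    })),
  };
}

// Hypothetical parsed lockfile entries.
const deps = [
  { name: "left-pad", version: "1.3.0", ecosystem: "npm" },
  { name: "requests", version: "2.31.0", ecosystem: "PyPI" },
];
console.log(JSON.stringify(buildOsvBatch(deps), null, 2));
```

Batching keeps the number of requests independent of lockfile size, and building the body separately from sending it makes the offline fallback easy to test.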
Changed
- pre-install-supply-chain.mjs hook refactored to import blocklists and helpers from shared supply-chain-data.mjs module (reduced duplication by ~160 lines)
[4.1.0] - 2026-04-03
Added
- Reference configuration generator: scanners/reference-config-generator.mjs generates Grade A security configuration based on posture scanner gaps. Detects project type (plugin/monorepo/standalone). Templates in templates/reference-config/. CLI: node scanners/reference-config-generator.mjs [path] [--apply]. Library: import { generate } from './reference-config-generator.mjs'
- /security harden command: runs posture scanner, identifies gaps, generates settings.json (deny-first), CLAUDE.md security section, and .gitignore additions. Supports --dry-run (default) and --apply (writes with backup). Post-apply verification re-runs posture scanner to confirm improvement
- Reference config templates: settings-deny-first.json, claude-md-security-section.md, gitignore-security.txt
- 23 new tests for reference-config-generator (grade-a, grade-f, apply mode, project type detection)
[4.0.0] - 2026-04-03
Added
- Deterministic posture scanner: posture-scanner.mjs replaces the Opus-based posture-assessor-agent for /security posture. 10 categories assessed in <50ms (was ~6 min with agent). Scanner prefix PST. Standalone CLI: node scanners/posture-scanner.mjs [path] → JSON stdout. Categories: Deny-First, Secrets, Path Guarding, MCP Trust, Destructive Blocking, Sandbox, Human Review, Plugin Sources, Session Isolation, Cognitive State Security. Reuses scanForInjection() and gradeFromPassRate() from shared libraries. Grade A/B/C/D/F with risk score, risk band, and verdict
- PST prefix added to all 4 OWASP maps (LLM, ASI, AST, MCP) in severity.mjs
- Test fixtures: tests/fixtures/posture-scan/grade-a-project/ (Grade A) and grade-f-project/ (Grade F)
- 41 new tests for posture scanner (interface, grade-a, grade-f)
Changed
- /security posture now uses deterministic scanner via Bash instead of spawning posture-assessor-agent. Instant results, zero token cost
- /security audit runs posture scanner first for instant category data, then agents for narrative and skill/MCP analysis
- Posture-assessor-agent retained for full audit narrative only
[3.1.1] - 2026-04-03
Audit remediation: 6 findings fixed, global settings hardened.
[3.0.0] - 2026-04-03
Public release. 8 development sessions from v2.5 to v3.0.
Added
- Toxic flow analysis (v2.7.0) — 8th orchestrated scanner (toxic-flow-analyzer.mjs, prefix TFA) detecting lethal trifecta patterns: untrusted input + sensitive data access + exfiltration sink. Post-processing correlator consuming output from all prior scanners. Direct, cross-component, and project-level detection with mitigation downgrades. OWASP: ASI01, ASI02, ASI05
- Runtime session guard (v2.7.1) — PostToolUse hook monitoring tool call sequences for lethal trifecta forming during a session. Sliding window (20 calls), per-session JSONL state in /tmp/, advisory warning (never blocks). Auto-cleanup after 24h
- MCP runtime inspection (v2.8.0) — Standalone scanner (mcp-live-inspect.mjs, prefix MCI) connecting to running MCP stdio servers via JSON-RPC 2.0. Fetches live tool/prompt/resource lists, scans descriptions for injection patterns, detects tool shadowing across servers. 10s timeout per server. New /security mcp-inspect command. /security mcp-audit --live flag for combined static + live analysis
- Auto update notifications (v2.8.1) — UserPromptSubmit hook checking for newer plugin versions against the public Forgejo repo (max 1x/24h, cached in ~/.cache/llm-security/). Disable: LLM_SECURITY_UPDATE_CHECK=off
- Report diffing & baseline (v2.9.0) — diff-engine.mjs library for finding fingerprinting, fuzzy line matching (±3), and diff categorization (new/resolved/unchanged/moved). Scan orchestrator gains --baseline and --save-baseline flags. Baselines stored per target hash in reports/baselines/. New /security diff command
- Continuous scanning (v2.9.1) — /security watch [path] [--interval 6h] using built-in /loop for recurring diff scanning. watch-cron.mjs standalone script for system cron/launchd with multi-target config and exit codes
- Skill signature registry (v2.9.2) — skill-registry.mjs library for SHA-256 fingerprinting of normalized skill content, scan result caching (7-day staleness), and pattern search. New /security registry command. /security scan checks registry before full scan for instant results on known fingerprints
- OWASP Skills Top 10 (v2.6.0) — New knowledge file owasp-skills-top10.md (AST01-AST10) with skill-specific threat definitions and mitigations
- MEDIUM injection patterns (v2.6.0) — ~15 new patterns: base64 payloads, leetspeak, homoglyphs, multi-language mixing, markdown/HTML comment injection
- 4-framework OWASP mapping (v2.6.0) — Full coverage of LLM Top 10, Agentic AI Top 10 (ASI), Skills Top 10 (AST), MCP Top 10 in severity.mjs
- Architecture diagram (mermaid) in README
- CHANGELOG.md
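
The runtime session guard entry above describes a 20-call sliding window that raises an advisory when all three legs of the lethal trifecta appear. A minimal sketch of that idea follows. The tool-to-capability mapping and the names `CAPABILITIES` and `createSessionGuard` are hypothetical; the actual hook's classification rules and JSONL state handling are not shown in this changelog.

```javascript
// Hypothetical mapping from tool name to trifecta leg (illustration only).
const CAPABILITIES = {
  WebFetch: 'untrusted_input',
  Read: 'sensitive_access',
  Bash: 'exfil_sink',
};

// Window size taken from the changelog entry (20 calls).
const WINDOW_SIZE = 20;

function createSessionGuard(windowSize = WINDOW_SIZE) {
  const recent = []; // capability labels of the most recent tool calls

  return function observe(toolName) {
    recent.push(CAPABILITIES[toolName] ?? 'other');
    if (recent.length > windowSize) recent.shift(); // drop the oldest call

    const seen = new Set(recent);
    const trifecta =
      seen.has('untrusted_input') &&
      seen.has('sensitive_access') &&
      seen.has('exfil_sink');

    // Advisory only: the caller is warned, nothing is blocked.
    return {
      trifecta,
      advisory: trifecta
        ? 'Lethal trifecta pattern observed in recent tool calls'
        : null,
    };
  };
}
```

Once the three triggering calls slide out of the window, `observe()` reports `trifecta: false` again, which is the main behavioral consequence of using a bounded window rather than whole-session state.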
Changed
- Scan orchestrator now runs 8 scanners (was 7) with TFA running last
- Agent prompts updated with ASI/AST/MCP OWASP references
- scanForInjection() returns { found, severity, patterns } instead of boolean
- Self-scan suppressions updated from ~150 to ~190 (TFA self-referential findings added)
- Plugin description updated to reference all 4 OWASP frameworks
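
The scanForInjection() return-type change listed above affects any caller that used the result as a boolean, because a JavaScript object is always truthy. The sketch below uses a stand-in function, `scanForInjectionStub`, with a single made-up pattern and an assumed `'critical'` severity value; it is not the plugin's implementation and only mirrors the `{ found, severity, patterns }` shape.

```javascript
// Stand-in for illustration only. The pattern and severity value are assumptions;
// only the { found, severity, patterns } shape comes from the changelog.
function scanForInjectionStub(text) {
  const patterns = [];
  if (/ignore (all )?previous instructions/i.test(text)) {
    patterns.push('ignore-previous-instructions');
  }
  const found = patterns.length > 0;
  return { found, severity: found ? 'critical' : null, patterns };
}

const clean = scanForInjectionStub('List the files in src/');

// Boolean-style check: misfires, because the result object itself is truthy.
const legacyCheck = Boolean(clean);

// Object-style check: inspect the `found` field instead.
const correctCheck = clean.found;

console.log({ legacyCheck, correctCheck });
```

Callers written against the old boolean contract would therefore treat every input as flagged until they switch to reading `found`.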
Fixed
- package.json version sync with plugin.json
[2.5.0] - 2026-04-02
Added
- Pre-extraction indirection layer for remote scan defense. Remote scans pre-extract structured evidence via content-extractor.mjs and strip injection patterns BEFORE LLM agents see the content
[2.4.0] - 2026-04-01
Added
- GitHub repo URL support for scan and plugin-audit. Clone to temp dir via git-clone.mjs, scan locally, clean up. --branch <name> flag for non-default branches
[2.3.0] - 2026-04-01
Added
- PostToolUse expanded to ALL tools (was Bash-only). Scans Read, WebFetch, MCP, and all other tool output for indirect prompt injection
- LLM_SECURITY_INJECTION_MODE env var: block (default), warn, off
- Complementary Tools section in README (parry-guard, Lasso, Snyk)
- CLAUDE.md poisoning documented as known limitation
Changed
- Short output skip (<100 chars) for PostToolUse performance
[2.2.0] - 2026-04-01
Added
- UserPromptSubmit hook blocking prompt injection in user input
- Obfuscation decoding: unicode-escape, hex-escape, URL-encoding, base64 normalization
- Shared injection-patterns.mjs module (21 critical + 8 high patterns)
- PostToolUse indirect injection scanning in tool output (LLM01)
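
The obfuscation decoding entry above names four encodings that are normalized before pattern matching. One possible way to do this is sketched below, assuming Node.js. The function name `normalizeObfuscation`, the 16-character base64 threshold, and the printable-ASCII filter are all assumptions made for illustration, not the plugin's actual logic.

```javascript
// Hypothetical normalizer: decodes common obfuscation layers so that
// plain-text injection patterns can be matched afterwards.
function normalizeObfuscation(input) {
  let text = input;

  // Unicode escapes such as \u0069
  text = text.replace(/\\u([0-9a-fA-F]{4})/g, (_, hex) =>
    String.fromCharCode(parseInt(hex, 16)));

  // Hex escapes such as \x69
  text = text.replace(/\\x([0-9a-fA-F]{2})/g, (_, hex) =>
    String.fromCharCode(parseInt(hex, 16)));

  // URL percent-encoding; malformed sequences leave the text unchanged.
  try {
    text = decodeURIComponent(text);
  } catch {
    // decodeURIComponent throws URIError on malformed input; keep text as-is.
  }

  // Base64-looking tokens (assumed minimum length 16): append the decoded
  // form when it is printable ASCII, so both views can be scanned.
  const decoded = [];
  for (const token of text.match(/[A-Za-z0-9+/]{16,}={0,2}/g) ?? []) {
    const candidate = Buffer.from(token, 'base64').toString('utf8');
    if (/^[\x20-\x7e\s]+$/.test(candidate)) decoded.push(candidate);
  }

  return decoded.length > 0 ? `${text}\n${decoded.join('\n')}` : text;
}
```

Appending decoded base64 rather than replacing the token keeps the original text available to the scanner, at the cost of scanning slightly more input.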
Changed
- LLM01 coverage 83% -> 95%, LLM05 80% -> 83%
[2.1.0] - 2026-04-01
Added
- 383 tests (was 177): full hook coverage (66 tests), auto-cleaner coverage (140 tests)
- HTTPS install URL under fromaitochitta org
Fixed
- Auto-cleaner import guard
- Solo project setup (CONTRIBUTING.md removed)
Changed
- Model defaults set to sonnet
[2.0.0] - 2026-03-31
Added
- Open-source release: MIT LICENSE, SECURITY.md
- Test suite (node:test, 177 tests)
- pre-write-pathguard.mjs hook (8 path categories)
- .gitignore, .editorconfig
[1.4.0] - 2026-02-21
Added
- Unified risk scoring formula (25/10/4/1 weights)
- Score-based verdicts and risk bands (Low -> Extreme)
- OWASP categorization and A-F grading
- Single unified-report.md template replacing 9 separate templates
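
The 25/10/4/1 weighting introduced here is the sum-and-cap formula that the 7.0.0 entry later describes as `critical*25 + high*10 + medium*4 + low*1`, capped at 100. A minimal sketch of that arithmetic follows; the function name `weightedRiskScore` and the shape of the `counts` object are assumptions for illustration.

```javascript
// Severity weights as stated in the changelog (25/10/4/1).
const WEIGHTS = { critical: 25, high: 10, medium: 4, low: 1 };

// Hypothetical helper: weighted sum of finding counts, capped at 100.
function weightedRiskScore(counts) {
  const raw = Object.entries(WEIGHTS).reduce(
    (sum, [severity, weight]) => sum + weight * (counts[severity] ?? 0),
    0,
  );
  return Math.min(raw, 100);
}

// 1 critical + 2 high + 1 medium + 3 low = 25 + 20 + 4 + 3
console.log(weightedRiskScore({ critical: 1, high: 2, medium: 1, low: 3 })); // 52
// 5 critical = 125 before the cap
console.log(weightedRiskScore({ critical: 5 })); // 100
```

The second call shows why the cap saturates quickly: a handful of critical findings already reaches 100, which is the behavior the 7.0.0 rewrite set out to change.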
[1.3.0] - 2026-02-21
Added
- /security clean command with 3-tier remediation (auto/semi-auto/manual)
- auto-cleaner.mjs engine (16 fix operations, atomic writes, post-fix validation)
- cleaner-agent for semi-auto proposals
- --dry-run flag
[1.2.0] - 2026-02-19
Added
- 7 deterministic Node.js scanners (unicode, entropy, permissions, dependencies, taint, git forensics, network)
- /security deep-scan command and --deep flag
- Synthesizer agent for scanner JSON interpretation
- Shared scanner library (scanners/lib/)
- Demo fixture with 85-finding security assessment
Changed
- OWASP coverage: LLM01 70->85%, LLM02 90->95%, LLM03 80->90%, LLM06 85->95%
[1.1.0] - 2026-02-19
Added
- /security plugin-audit command
- /security mcp-audit command
- /security pre-deploy command
- 3 new report templates
Changed
- OWASP coverage: LLM03 75% -> 80%
[1.0.0] - 2026-02-19
Added
- Initial release
- 4 agents: skill-scanner, mcp-scanner, posture-assessor, threat-modeler
- 4 hooks: secret detection, destructive commands, supply chain, output verification
- 6 knowledge files (2,771 lines)
- 8 commands: security, scan, audit, posture, threat-model, plugin-audit, mcp-audit, pre-deploy
- 7 report templates
- OWASP LLM Top 10 + Agentic AI Top 10 coverage