Kjell Tore Guttormsen dea17a1c11 chore(release): bump to v6.0.0 — CAISS-readiness release with compliance, governance, CLI

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-04-10 14:03:10 +02:00

Changelog

All notable changes to the LLM Security Plugin are documented in this file.

The format is based on Keep a Changelog.

[6.0.0] - 2026-04-10

Added

Compliance mapping — knowledge/compliance-mapping.md maps plugin capabilities to EU AI Act (Art. 9, 15, 17), NIST AI RMF (Map, Measure, Manage, Govern), ISO 42001 (Annex A), and MITRE ATLAS techniques (AML.T IDs)
Norwegian regulatory context — knowledge/norwegian-context.md covers Datatilsynet (DPIA for AI), NSM (basic security principles), and Digitaliseringsdirektoratet guidance
SARIF 2.1.0 output — scanners/lib/sarif-formatter.mjs converts scan output to OASIS SARIF standard format. Use --format sarif with scan/deep-scan commands
Structured audit trail — scanners/lib/audit-trail.mjs writes JSONL audit events with ISO 8601 timestamps, OWASP category tags, and SIEM-ready schema. Configurable via LLM_SECURITY_AUDIT_* env vars
AI-BOM generator — scanners/ai-bom-generator.mjs + scanners/lib/bom-builder.mjs produce CycloneDX 1.6 Bills of Materials for AI components (models, MCP servers, plugins, knowledge, hooks)
Policy-as-code — scanners/lib/policy-loader.mjs reads .llm-security/policy.json for distributable hook configuration. Integrated into all 8 hooks. Env vars always take precedence
Standalone CLI — bin/llm-security.mjs provides npx llm-security entry point. Subcommands: scan, deep-scan, posture, audit-bom, benchmark
Posture compliance categories — 3 new posture categories (14: EU AI Act, 15: NIST AI RMF, 16: ISO 42001). Advisory only — do not affect Grade A threshold
Attack simulator benchmark mode — --benchmark flag outputs structured pass/fail metrics for CI integration
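
As a rough illustration of the SARIF 2.1.0 output listed above, the sketch below maps a scanner finding to a minimal SARIF log. The finding shape (id, severity, message, file, line), the function name toSarif, the example rule ID, and the severity-to-level mapping are assumptions for illustration; they are not taken from sarif-formatter.mjs. Only the SARIF field names follow the standard.

```javascript
// Hypothetical finding shape: { id, severity, message, file, line }.
// The severity-to-level mapping is an assumption, not the plugin's actual table.
const LEVELS = { critical: "error", high: "error", medium: "warning", low: "note" };

function toSarif(findings, toolName = "llm-security") {
  return {
    $schema: "https://json.schemastore.org/sarif-2.1.0.json",
    version: "2.1.0",
    runs: [
      {
        tool: { driver: { name: toolName } },
        results: findings.map((f) => ({
          ruleId: f.id,
          level: LEVELS[f.severity] ?? "none",
          message: { text: f.message },
          locations: [
            {
              physicalLocation: {
                artifactLocation: { uri: f.file },
                region: { startLine: f.line },
              },
            },
          ],
        })),
      },
    ],
  };
}

// "PST-001" is a made-up rule ID used only for this example.
const sarif = toSarif([
  { id: "PST-001", severity: "high", message: "Example finding", file: "settings.json", line: 3 },
]);
console.log(JSON.stringify(sarif, null, 2));
```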

Changed

Version bump: 5.1.0 → 6.0.0 across all files
Knowledge base expanded from 13 to 15 files
Scanner count: 15 → 16 (AI-BOM generator added)
Posture scanner: 13 → 16 categories
All hooks now read policy from .llm-security/policy.json (backward-compatible — defaults match hardcoded values)
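
The precedence described for policy-as-code (environment variable first, then .llm-security/policy.json, then the built-in default) can be sketched as below. The policy key names (trifectaMode, injectionMode) and the helper resolveSetting are hypothetical, since the policy schema is not shown in this changelog; the two defaults are the ones documented for LLM_SECURITY_TRIFECTA_MODE and LLM_SECURITY_INJECTION_MODE.

```javascript
// Hypothetical key names; the default values are the ones stated in this changelog.
const DEFAULTS = { trifectaMode: "warn", injectionMode: "block" };

// Resolve one setting: environment variable, then policy file value, then default.
function resolveSetting(key, envName, policy, env) {
  const fromEnv = env[envName];
  if (fromEnv !== undefined && fromEnv !== "") return fromEnv;
  if (policy && policy[key] !== undefined) return policy[key];
  return DEFAULTS[key];
}

const policy = { trifectaMode: "block" }; // as if parsed from .llm-security/policy.json
const env = { LLM_SECURITY_INJECTION_MODE: "warn" }; // stand-in for process.env

console.log(resolveSetting("trifectaMode", "LLM_SECURITY_TRIFECTA_MODE", policy, env)); // block
console.log(resolveSetting("injectionMode", "LLM_SECURITY_INJECTION_MODE", policy, env)); // warn
```

With an empty policy and no environment overrides, both calls fall through to DEFAULTS, which is what keeps the behavior backward-compatible.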

[5.1.0] - 2026-04-07

Added

Sandboxed remote cloning — git clone for remote scans is now hardened with two defense layers:
1. Git config flags: core.hooksPath=/dev/null, core.symlinks=false, core.fsmonitor=false, all LFS filter drivers disabled, protocol.file.allow=never, transfer.fsckObjects=true. Environment: GIT_CONFIG_NOSYSTEM=1, GIT_CONFIG_GLOBAL=/dev/null, GIT_ATTR_NOSYSTEM=1, GIT_TERMINAL_PROMPT=0
2. OS-level filesystem sandbox: macOS sandbox-exec and Linux bubblewrap (bwrap) restrict file writes to only the specific temp directory. Even if .gitattributes filter drivers bypass git config, they cannot write outside the clone dir. bwrap probe-tests availability before use (graceful fallback on Ubuntu 24.04+ where AppArmor blocks it). Graceful fallback on Windows (git config flags only, WARN logged)
Post-clone size check — Repos exceeding 100MB after clone are rejected and cleaned up
UUID-unique evidence filenames — fs-utils.mjs tmppath now generates unique filenames with crypto.randomUUID() suffix, preventing race conditions between concurrent scans
Evidence file cleanup — scan.md and plugin-audit.md now clean up evidence files (content-extract, plugin-extract) after scanning
Cleanup guarantee — Both scan.md and plugin-audit.md have explicit cleanup guarantee: temp dir + evidence file are removed even if scan fails or errors
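
A minimal sketch of the first defense layer, assuming the flags are passed to git as -c key=value arguments. The function name buildCloneArgs, the --depth 1 option, and the example URL and directory are assumptions; the LFS filter settings are left out because this changelog does not spell them out.

```javascript
// Git config flags and environment variables as listed in the 5.1.0 entry.
const HARDENING_FLAGS = [
  "core.hooksPath=/dev/null",
  "core.symlinks=false",
  "core.fsmonitor=false",
  "protocol.file.allow=never",
  "transfer.fsckObjects=true",
];

const HARDENING_ENV = {
  GIT_CONFIG_NOSYSTEM: "1",
  GIT_CONFIG_GLOBAL: "/dev/null",
  GIT_ATTR_NOSYSTEM: "1",
  GIT_TERMINAL_PROMPT: "0",
};

// Build the argv for `git`; each flag goes before the subcommand as `-c key=value`.
// `--depth 1` is an assumption, and `--` stops option parsing before the URL.
function buildCloneArgs(url, targetDir) {
  const configArgs = HARDENING_FLAGS.flatMap((flag) => ["-c", flag]);
  return [...configArgs, "clone", "--depth", "1", "--", url, targetDir];
}

const cloneArgs = buildCloneArgs("https://example.com/repo.git", "/tmp/scan-clone");
console.log(["git", ...cloneArgs].join(" "));
console.log(Object.keys(HARDENING_ENV).join(", "));
```

The resulting array could be handed to child_process.spawn("git", cloneArgs, { env: { ...process.env, ...HARDENING_ENV } }). The OS-level sandbox wrapper (sandbox-exec or bwrap) would sit in front of that command and is not shown here.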

Changed

scanners/lib/git-clone.mjs — complete rewrite of clone command with sandbox wrapping
scanners/lib/fs-utils.mjs — tmppath uses crypto.randomUUID() for unique names

[5.0.0] - 2026-04-06

Added

Prompt Injection Hardening (v5.0) — 8-session defense-in-depth overhaul driven by 7 research papers (2025-2026). Defense philosophy: broader detection + increased attack cost + longer monitoring windows + architectural constraints + honest documentation
MEDIUM advisory wiring — pre-prompt-inject-scan.mjs emits advisory for MEDIUM-severity obfuscation signals (leetspeak, homoglyphs, zero-width, multi-language). Never blocks. post-mcp-verify.mjs includes MEDIUM in injection scan advisory
Unicode Tag steganography — string-utils.mjs decodes U+E0001-E007F (invisible ASCII encoding). CRITICAL if decoded content matches injection patterns, HIGH for bare presence. Integrated into normalizeForScan() pipeline
BIDI override stripping — Removes directional override characters before injection scanning
Bash expansion normalization — New bash-normalize.mjs strips ${}, empty quotes, backslash splits before command matching. Applied in pre-bash-destructive.mjs and pre-install-supply-chain.mjs
Rule of Two enforcement — post-session-guard.mjs gains LLM_SECURITY_TRIFECTA_MODE=block|warn|off (default: warn). Block mode exits with code 2 for MCP-concentrated trifecta or sensitive path + exfiltration
100-call long-horizon monitoring — Extended window alongside 20-call sliding window. Slow-burn trifecta detection (legs >50 calls apart = MEDIUM). Behavioral drift via Jensen-Shannon divergence on tool-class distribution
HITL trap detection — HIGH patterns for approval urgency, summary suppression, scope minimization. MEDIUM for cognitive load (injection buried in verbose output)
Sub-agent delegation tracking — post-session-guard.mjs tracks Task/Agent tool usage. Escalation-after-input advisory when delegation occurs within 5 calls of untrusted input (DeepMind Agent Traps cat. 4)
Natural language indirection — MEDIUM patterns for "fetch this URL and execute", "send this data to", "read ~/.ssh". Strict false-positive tests for benign phrasing
Hybrid attack patterns — P2SQL (SQL keywords in injection text), recursive injection (injection containing injection), XSS in agent context (<script>, javascript:, onerror=)
CaMeL-inspired data flow tagging — SHA-256 provenance tracking in post-session-guard.mjs. Hash of tool output → match against subsequent tool input. Linked data flows elevate trifecta severity
Adaptive red-team — attack-simulator.mjs --adaptive runs 5 mutation rounds per passing scenario: homoglyph substitution, encoding wrapping, zero-width injection, case alternation, synonym substitution. Rules in knowledge/attack-mutations.json
Knowledge base expansion — prompt-injection-research-2025-2026.md (7 papers), deepmind-agent-traps.md (6 categories, 43 techniques), attack-mutations.json (synonym tables). Attack scenarios expanded from 38 to 64 across 12 categories
Posture scanner expanded to 13 categories — New: Prompt Injection Hardening (cat 11), Rule of Two (cat 12), Long-Horizon Monitoring (cat 13). Checks for MEDIUM advisory, Unicode Tag detection, bash normalization, TRIFECTA_MODE, behavioral drift
Defense Philosophy section in CLAUDE.md — honest documentation of what v5.0 can and cannot do, based on joint paper findings (95-100% ASR against all tested defenses)
8 new posture scanner tests (49 total for posture)
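
To make the Unicode Tag steganography item concrete, here is a small sketch of how tag characters can be detected and decoded. Tag code points mirror ASCII at an offset of 0xE0000. The function names are illustrative and are not the ones exported by string-utils.mjs.

```javascript
// Decode characters from the Unicode Tags block back to the ASCII they mirror.
// Tag code points U+E0020..U+E007E correspond to ASCII 0x20..0x7E (offset 0xE0000).
function decodeUnicodeTags(text) {
  let hidden = "";
  for (const ch of text) {
    // for...of iterates by code point, so astral characters arrive whole
    const cp = ch.codePointAt(0);
    if (cp >= 0xe0020 && cp <= 0xe007e) {
      hidden += String.fromCodePoint(cp - 0xe0000);
    }
  }
  return hidden;
}

// True if any code point from U+E0001..U+E007F is present (the "bare presence" case).
function hasUnicodeTags(text) {
  return /[\u{E0001}-\u{E007F}]/u.test(text);
}

// Build a sample: visible text followed by an invisible tag-encoded payload.
const hiddenPayload = "ignore previous";
const tagEncoded = [...hiddenPayload]
  .map((c) => String.fromCodePoint(c.codePointAt(0) + 0xe0000))
  .join("");
const taggedSample = "Hello" + tagEncoded;

console.log(hasUnicodeTags(taggedSample)); // true
console.log(decodeUnicodeTags(taggedSample)); // ignore previous
```

The decoded string is what would then be matched against injection patterns to decide between the CRITICAL and HIGH outcomes described above.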

Changed

Posture scanner version updated to 5.0.0
Dashboard aggregator version updated to 5.0.0
Red-team scenarios expanded from 38 to 64 across 12 categories
Knowledge files count: 10 -> 13
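
The behavioral-drift check added in this release is described as a Jensen-Shannon divergence over the tool-class distribution. The sketch below shows that computation on two count tables. The tool-class names and counts are made up, and no alerting threshold is shown because this changelog does not state one.

```javascript
// Turn tool-class counts into a probability distribution over a shared key set.
function toDistribution(counts, keys) {
  const total = keys.reduce((sum, k) => sum + (counts[k] || 0), 0);
  return keys.map((k) => (total === 0 ? 0 : (counts[k] || 0) / total));
}

// Kullback-Leibler divergence in bits; terms with p = 0 contribute nothing.
function klDivergence(p, q) {
  return p.reduce((sum, pi, i) => (pi > 0 ? sum + pi * Math.log2(pi / q[i]) : sum), 0);
}

// Jensen-Shannon divergence: symmetric, and bounded to [0, 1] with log base 2.
function jensenShannon(countsA, countsB) {
  const keys = [...new Set([...Object.keys(countsA), ...Object.keys(countsB)])];
  const p = toDistribution(countsA, keys);
  const q = toDistribution(countsB, keys);
  const m = p.map((pi, i) => (pi + q[i]) / 2); // mixture distribution
  return 0.5 * klDivergence(p, m) + 0.5 * klDivergence(q, m);
}

// Hypothetical tool-class counts for an early and a recent window of calls.
const earlyWindow = { read: 15, write: 4, network: 1 };
const recentWindow = { read: 3, write: 2, network: 15 };

console.log(jensenShannon(earlyWindow, earlyWindow)); // 0
console.log(jensenShannon(earlyWindow, recentWindow).toFixed(3));
```

A value near 0 means the session is using tools in the same proportions as before, while a value near 1 means the two windows barely overlap.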

[4.5.1] - 2026-04-04

Fixed

Cross-platform support (Windows/Linux). Replaced all Unix-only patterns: fileURLToPath() instead of import.meta.url.replace('file://', ''), path.dirname() instead of lastIndexOf('/'), native fetch() instead of curl subprocess (Node 18+), removed 2>/dev/null from shell commands, fixed tilde expansion regex for Windows backslash paths. 11 files changed, 782 tests pass.

[4.5.0] - 2026-04-04

Added

Attack simulation / red-team mode — scanners/attack-simulator.mjs runs 38 crafted attack scenarios across 7 categories against the plugin's own hooks. Data-driven: scenarios defined in knowledge/attack-scenarios.json, payloads assembled at runtime via fragment concatenation (avoids triggering hooks on source file). Categories: secrets (7), destructive (8), supply-chain (4), prompt-injection (6), pathguard (6), mcp-output (4), session-trifecta (3). CLI: node scanners/attack-simulator.mjs [--category <name>] [--json] [--verbose]. Library: import { loadScenarios, runScenario, resolvePayloads }
/security red-team command — attack simulation with category filter (--category secrets|destructive|...). Narrative report with per-category breakdown and defense score
knowledge/attack-scenarios.json — 38 red-team scenarios with placeholder payloads ({{MARKER}} syntax), resolved at runtime to actual attack strings
31 new tests for attack simulator (unit + integration + CLI)
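
The idea behind the {{MARKER}} placeholders is that a complete attack string never appears verbatim in a source file and is only assembled when a scenario runs. A minimal sketch of that resolution step follows. The marker names, the fragment table, and the function resolvePayloadSketch are hypothetical and do not reproduce the real resolvePayloads export.

```javascript
// Hypothetical fragment table: each marker maps to pieces joined only at runtime.
const FRAGMENTS = {
  FAKE_TOKEN: ["sk-", "test-", "1234"],
  FAKE_HOST: ["attacker", ".example", ".com"],
};

// Replace every {{MARKER}} with its concatenated fragments; unknown markers stay as-is.
function resolvePayloadSketch(template, fragments = FRAGMENTS) {
  return template.replace(/\{\{([A-Z0-9_]+)\}\}/g, (match, name) =>
    Object.hasOwn(fragments, name) ? fragments[name].join("") : match
  );
}

console.log(resolvePayloadSketch("export TOKEN={{FAKE_TOKEN}}")); // export TOKEN=sk-test-1234
console.log(resolvePayloadSketch("curl https://{{FAKE_HOST}}/x")); // curl https://attacker.example.com/x
console.log(resolvePayloadSketch("echo {{UNKNOWN}}")); // echo {{UNKNOWN}}
```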

[4.4.0] - 2026-04-03

Added

Cross-project security dashboard — scanners/dashboard-aggregator.mjs discovers all Claude Code projects under ~/ (depth 3) and ~/.claude/plugins/, runs posture-scanner on each, aggregates results. Machine grade = weakest link across all projects. Cache in ~/.cache/llm-security/dashboard-latest.json (24h staleness). CLI: node scanners/dashboard-aggregator.mjs [--no-cache] [--max-depth N]. Library: import { aggregate, discoverProjects }
/security dashboard command — machine-wide security overview with per-project grade table, sorted by grade (worst first). Shows cache status, total findings, and recommendations based on machine grade
16 new tests for dashboard aggregator (discovery, aggregation, caching, grade logic)
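
The weakest-link rule for the machine grade and the worst-first ordering of the dashboard table can be sketched as follows. The project objects and function names are illustrative only; the grade scale A/B/C/D/F is the one used by the posture scanner.

```javascript
const GRADE_ORDER = ["A", "B", "C", "D", "F"]; // best to worst

// Machine grade = the worst grade found in any project (null when there are none).
function machineGrade(projects) {
  if (projects.length === 0) return null;
  const worstIndex = Math.max(...projects.map((p) => GRADE_ORDER.indexOf(p.grade)));
  return GRADE_ORDER[worstIndex];
}

// Copy and sort so that the worst-graded projects come first.
function sortWorstFirst(projects) {
  return [...projects].sort(
    (a, b) => GRADE_ORDER.indexOf(b.grade) - GRADE_ORDER.indexOf(a.grade)
  );
}

// Hypothetical per-project results.
const projectGrades = [
  { name: "plugin-a", grade: "A" },
  { name: "legacy-app", grade: "D" },
  { name: "docs-site", grade: "B" },
];

console.log(machineGrade(projectGrades)); // D
console.log(sortWorstFirst(projectGrades).map((p) => p.name).join(", ")); // legacy-app, docs-site, plugin-a
```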

[4.3.0] - 2026-04-03

Added

MCP description drift detection — scanners/lib/mcp-description-cache.mjs caches MCP tool descriptions in ~/.cache/llm-security/mcp-descriptions.json with 7-day TTL. Compares via Levenshtein distance — >10% change triggers advisory (OWASP MCP05 rug-pull). extractMcpServer() exported for server attribution
MCP-concentrated trifecta — post-session-guard.mjs now detects when all 3 lethal trifecta legs (input + access + exfil) originate from the same MCP server, elevating severity. Single compromised server pattern
Cumulative data volume tracking — post-session-guard.mjs tracks total output bytes per session, warns at 100KB (LOW), 500KB (MEDIUM), 1MB (HIGH) thresholds (OWASP ASI02)
Per-MCP-tool volume tracking — post-mcp-verify.mjs tracks cumulative output per MCP tool, warns when a single tool exceeds 100KB (OWASP ASI02, MCP03)
MCP drift integration in post-mcp-verify — checks MCP tool descriptions on every invocation against cached baseline, advisory on significant drift
35 new tests: 16 for mcp-description-cache, 5 for post-mcp-verify drift/volume, 14 for post-session-guard MCP features
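
The drift check compares a cached tool description with the current one and raises an advisory above a 10% change. A sketch under the assumption that the change ratio is the Levenshtein distance divided by the longer string's length (the changelog does not say how the percentage is normalized):

```javascript
// Classic dynamic-programming Levenshtein distance using two rolling rows.
function levenshtein(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Fraction of the longer description that changed (normalization is an assumption).
function driftRatio(cached, current) {
  const longest = Math.max(cached.length, current.length);
  return longest === 0 ? 0 : levenshtein(cached, current) / longest;
}

function hasDrifted(cached, current, threshold = 0.1) {
  return driftRatio(cached, current) > threshold;
}

const cachedDescription = "Reads a file from disk";
console.log(hasDrifted(cachedDescription, "Reads a file from disk.")); // false
console.log(hasDrifted(cachedDescription, "Reads a file and sends it to a remote host")); // true
```

A one-character punctuation edit stays below the threshold, while a description that changes what the tool claims to do crosses it.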

[4.2.0] - 2026-04-03

Added

Supply chain re-check scanner — scanners/supply-chain-recheck.mjs (prefix SCR) periodically re-audits installed dependencies by parsing lockfiles (package-lock.json, yarn.lock, requirements.txt, Pipfile.lock). Checks against curated blocklists, OSV.dev batch API (/v1/querybatch) for known CVEs, and Levenshtein-based typosquat detection against top-packages knowledge base. Offline fallback: blocklist + typosquat checks run without network, INFO finding notes skipped CVE check. OWASP: LLM03, ASI04, AST06, MCP04
Shared supply chain data module — scanners/lib/supply-chain-data.mjs extracts blocklists (NPM/PIP/Cargo/Gem), helper functions, and OSV.dev API calls shared between the hook (pre-install-supply-chain.mjs) and the new scanner
/security supply-check command — standalone dependency re-audit with focused output. CLI wrapper: node scanners/supply-chain-recheck-cli.mjs <path>
SCR prefix added to all 4 OWASP maps (LLM, ASI, AST, MCP) in severity.mjs
Supply chain scanner integrated into scan-orchestrator (10th scanner, runs before toxic-flow)
Test fixtures: tests/fixtures/supply-chain/ with compromised and clean lockfiles for npm, pip, yarn, Pipfile
30 new tests for supply-chain-recheck scanner and shared module
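
For the OSV.dev batch lookup, each parsed lockfile entry becomes one query object in the request body. The sketch below builds that body only and makes no network call. The dependency shape and the function name buildOsvBatch are assumptions, and the package versions are arbitrary examples.

```javascript
// Map hypothetical lockfile entries to the OSV.dev /v1/querybatch request shape.
function buildOsvBatch(deps) {
  return {
    queries: deps.map((d) => ({
      package: { name: d.name, ecosystem: d.ecosystem },
      version: d.version,
    })),
  };
}

const osvBody = buildOsvBatch([
  { name: "lodash", version: "4.17.20", ecosystem: "npm" },
  { name: "requests", version: "2.25.0", ecosystem: "PyPI" },
]);

console.log(JSON.stringify(osvBody, null, 2));
```

The body would be sent as a POST to https://api.osv.dev/v1/querybatch, for example with the native fetch(). When that request fails, the offline fallback described above applies and only the blocklist and typosquat checks run.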

Changed

pre-install-supply-chain.mjs hook refactored to import blocklists and helpers from shared supply-chain-data.mjs module (reduced duplication by ~160 lines)

[4.1.0] - 2026-04-03

Added

Reference configuration generator — scanners/reference-config-generator.mjs generates Grade A security configuration based on posture scanner gaps. Detects project type (plugin/monorepo/standalone). Templates in templates/reference-config/. CLI: node scanners/reference-config-generator.mjs [path] [--apply]. Library: import { generate } from './reference-config-generator.mjs'
/security harden command — runs posture scanner, identifies gaps, generates settings.json (deny-first), CLAUDE.md security section, and .gitignore additions. Supports --dry-run (default) and --apply (writes with backup). Post-apply verification re-runs posture scanner to confirm improvement
Reference config templates: settings-deny-first.json, claude-md-security-section.md, gitignore-security.txt
23 new tests for reference-config-generator (grade-a, grade-f, apply mode, project type detection)

[4.0.0] - 2026-04-03

Added

Deterministic posture scanner — posture-scanner.mjs replaces the Opus-based posture-assessor-agent for /security posture. 10 categories assessed in <50ms (was ~6 min with agent). Scanner prefix PST. Standalone CLI: node scanners/posture-scanner.mjs [path] → JSON stdout. Categories: Deny-First, Secrets, Path Guarding, MCP Trust, Destructive Blocking, Sandbox, Human Review, Plugin Sources, Session Isolation, Cognitive State Security. Reuses scanForInjection() and gradeFromPassRate() from shared libraries. Grade A/B/C/D/F with risk score, risk band, and verdict
PST prefix added to all 4 OWASP maps (LLM, ASI, AST, MCP) in severity.mjs
Test fixtures: tests/fixtures/posture-scan/grade-a-project/ (Grade A) and grade-f-project/ (Grade F)
41 new tests for posture scanner (interface, grade-a, grade-f)

Changed

/security posture now uses deterministic scanner via Bash instead of spawning posture-assessor-agent. Instant results, zero token cost
/security audit runs posture scanner first for instant category data, then agents for narrative and skill/MCP analysis
Posture-assessor-agent retained for full audit narrative only

[3.1.1] - 2026-04-03

Audit remediation: 6 findings fixed, global settings hardened.

[3.0.0] - 2026-04-03

Public release. 8 development sessions from v2.5 to v3.0.

Added

Toxic flow analysis (v2.7.0) — 8th orchestrated scanner (toxic-flow-analyzer.mjs, prefix TFA) detecting lethal trifecta patterns: untrusted input + sensitive data access + exfiltration sink. Post-processing correlator consuming output from all prior scanners. Direct, cross-component, and project-level detection with mitigation downgrades. OWASP: ASI01, ASI02, ASI05
Runtime session guard (v2.7.1) — PostToolUse hook monitoring tool call sequences for lethal trifecta forming during a session. Sliding window (20 calls), per-session JSONL state in /tmp/, advisory warning (never blocks). Auto-cleanup after 24h
MCP runtime inspection (v2.8.0) — Standalone scanner (mcp-live-inspect.mjs, prefix MCI) connecting to running MCP stdio servers via JSON-RPC 2.0. Fetches live tool/prompt/resource lists, scans descriptions for injection patterns, detects tool shadowing across servers. 10s timeout per server. New /security mcp-inspect command. /security mcp-audit --live flag for combined static + live analysis
Auto update notifications (v2.8.1) — UserPromptSubmit hook checking for newer plugin versions against the public Forgejo repo (max 1x/24h, cached in ~/.cache/llm-security/). Disable: LLM_SECURITY_UPDATE_CHECK=off
Report diffing & baseline (v2.9.0) — diff-engine.mjs library for finding fingerprinting, fuzzy line matching (±3), and diff categorization (new/resolved/unchanged/moved). Scan orchestrator gains --baseline and --save-baseline flags. Baselines stored per target hash in reports/baselines/. New /security diff command
Continuous scanning (v2.9.1) — /security watch [path] [--interval 6h] using built-in /loop for recurring diff scanning. watch-cron.mjs standalone script for system cron/launchd with multi-target config and exit codes
Skill signature registry (v2.9.2) — skill-registry.mjs library for SHA-256 fingerprinting of normalized skill content, scan result caching (7-day staleness), and pattern search. New /security registry command. /security scan checks registry before full scan for instant results on known fingerprints
OWASP Skills Top 10 (v2.6.0) — New knowledge file owasp-skills-top10.md (AST01-AST10) with skill-specific threat definitions and mitigations
MEDIUM injection patterns (v2.6.0) — ~15 new patterns: base64 payloads, leetspeak, homoglyphs, multi-language mixing, markdown/HTML comment injection
4-framework OWASP mapping (v2.6.0) — Full coverage of LLM Top 10, Agentic AI Top 10 (ASI), Skills Top 10 (AST), MCP Top 10 in severity.mjs
Architecture diagram (mermaid) in README
CHANGELOG.md
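
The runtime session guard's sliding window can be pictured as follows: keep the last 20 tool calls and check whether all three trifecta legs appear among them. How a tool call is classified into a leg is not described in this changelog, so the leg field on each call is a hypothetical pre-classified label.

```javascript
const WINDOW_SIZE = 20;

// Each call is { tool, leg }, where leg is "input", "access", "exfil", or null.
// The classification into legs is assumed to have happened already.
function trifectaInWindow(calls, windowSize = WINDOW_SIZE) {
  const recent = calls.slice(-windowSize);
  const legs = new Set(recent.map((c) => c.leg).filter(Boolean));
  return legs.has("input") && legs.has("access") && legs.has("exfil");
}

const filler = (n) => Array.from({ length: n }, () => ({ tool: "Edit", leg: null }));

// 13 calls in total: all three legs fall inside the 20-call window.
const closeCalls = [
  { tool: "WebFetch", leg: "input" },
  { tool: "Read", leg: "access" },
  ...filler(10),
  { tool: "Bash", leg: "exfil" },
];

// 28 calls in total: the input and access legs have slid out of the window.
const spreadCalls = [
  { tool: "WebFetch", leg: "input" },
  { tool: "Read", leg: "access" },
  ...filler(25),
  { tool: "Bash", leg: "exfil" },
];

console.log(trifectaInWindow(closeCalls)); // true
console.log(trifectaInWindow(spreadCalls)); // false
```

The second case is the gap that the longer 100-call window in v5.0.0 is meant to cover.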

Changed

Scan orchestrator now runs 8 scanners (was 7) with TFA running last
Agent prompts updated with ASI/AST/MCP OWASP references
scanForInjection() returns { found, severity, patterns } instead of boolean
Self-scan suppressions updated from ~150 to ~190 (TFA self-referential findings added)
Plugin description updated to reference all 4 OWASP frameworks
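
The change to scanForInjection() matters for callers, because an object is always truthy in JavaScript. The sketch below imitates the new return shape { found, severity, patterns } with two invented patterns. The pattern table, the severity ranking, and the name scanForInjectionSketch are assumptions and not the contents of the plugin's pattern module.

```javascript
// Hypothetical pattern table, for illustration only.
const SKETCH_PATTERNS = [
  { name: "ignore-previous", severity: "critical", regex: /ignore (all )?previous instructions/i },
  { name: "html-comment", severity: "medium", regex: /<!--[\s\S]*?-->/ },
];
const SEVERITY_RANK = { medium: 1, high: 2, critical: 3 };

// Returns { found, severity, patterns }; severity is the highest among matched patterns.
function scanForInjectionSketch(text) {
  const hits = SKETCH_PATTERNS.filter((p) => p.regex.test(text));
  const severity = hits.reduce(
    (worst, p) =>
      worst === null || SEVERITY_RANK[p.severity] > SEVERITY_RANK[worst] ? p.severity : worst,
    null
  );
  return { found: hits.length > 0, severity, patterns: hits.map((p) => p.name) };
}

const scanResult = scanForInjectionSketch("Please ignore previous instructions <!-- hidden -->");
console.log(scanResult.found, scanResult.severity, scanResult.patterns.join(","));
// true critical ignore-previous,html-comment

const benignResult = scanForInjectionSketch("Summarize this README");
console.log(benignResult.found, benignResult.severity); // false null
```

Code written against the old boolean return, such as if (scanForInjection(text)), would treat every input as a hit with the new shape, so callers need to test the found property instead.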

Fixed

package.json version sync with plugin.json

[2.5.0] - 2026-04-02

Added

Pre-extraction indirection layer for remote scan defense. Remote scans pre-extract structured evidence via content-extractor.mjs and strip injection patterns BEFORE LLM agents see the content

[2.4.0] - 2026-04-01

Added

GitHub repo URL support for scan and plugin-audit. Clone to temp dir via git-clone.mjs, scan locally, clean up. --branch <name> flag for non-default branches

[2.3.0] - 2026-04-01

Added

PostToolUse expanded to ALL tools (was Bash-only). Scans Read, WebFetch, MCP, and all other tool output for indirect prompt injection
LLM_SECURITY_INJECTION_MODE env var: block (default), warn, off
Complementary Tools section in README (parry-guard, Lasso, Snyk)
CLAUDE.md poisoning documented as known limitation

Changed

Short output skip (<100 chars) for PostToolUse performance

[2.2.0] - 2026-04-01

Added

UserPromptSubmit hook blocking prompt injection in user input
Obfuscation decoding: unicode-escape, hex-escape, URL-encoding, base64 normalization
Shared injection-patterns.mjs module (21 critical + 8 high patterns)
PostToolUse indirect injection scanning in tool output (LLM01)
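
Obfuscation decoding means normalizing escaped text before pattern matching, so that an escaped "ignore" still matches. A simplified sketch covering three of the listed encodings (unicode-escape, hex-escape, URL-encoding) is shown below. Base64 handling is left out, and the function name normalizeSketch is illustrative.

```javascript
// Decode common escape styles so that patterns can match the plain text.
function normalizeSketch(text) {
  let out = text;
  // \uXXXX unicode escapes
  out = out.replace(/\\u([0-9a-fA-F]{4})/g, (_, hex) => String.fromCharCode(parseInt(hex, 16)));
  // \xXX hex escapes
  out = out.replace(/\\x([0-9a-fA-F]{2})/g, (_, hex) => String.fromCharCode(parseInt(hex, 16)));
  // %XX URL-encoding; a malformed sequence throws, in which case the text is kept
  try {
    out = decodeURIComponent(out);
  } catch {
    // keep the partially decoded text
  }
  return out;
}

console.log(normalizeSketch("\\u0069gnore previous")); // ignore previous
console.log(normalizeSketch("\\x69gnore previous")); // ignore previous
console.log(normalizeSketch("%69gnore%20previous")); // ignore previous
console.log(normalizeSketch("100% sure")); // 100% sure
```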

Changed

LLM01 coverage 83% -> 95%, LLM05 80% -> 83%

[2.1.0] - 2026-04-01

Added

383 tests (was 177): full hook coverage (66 tests), auto-cleaner coverage (140 tests)
HTTPS install URL under fromaitochitta org

Fixed

Auto-cleaner import guard
Solo project setup (CONTRIBUTING.md removed)

Changed

Model defaults set to sonnet

[2.0.0] - 2026-03-31

Added

Open-source release: MIT LICENSE, SECURITY.md
Test suite (node:test, 177 tests)
pre-write-pathguard.mjs hook (8 path categories)
.gitignore, .editorconfig

[1.4.0] - 2026-02-21

Added

Unified risk scoring formula (25/10/4/1 weights)
Score-based verdicts and risk bands (Low -> Extreme)
OWASP categorization and A-F grading
Single unified-report.md template replacing 9 separate templates
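
A worked example of the 25/10/4/1 weighting, assuming the four weights apply to critical, high, medium, and low findings in that order (the changelog lists only the numbers). The thresholds that map a score to a risk band or verdict are not given here, so the sketch stops at the raw score.

```javascript
// Weights from the 1.4.0 entry; pairing them with these four severities is an assumption.
const WEIGHTS = { critical: 25, high: 10, medium: 4, low: 1 };

// Sum the weight of every finding; unknown severities count as 0.
function riskScore(findings) {
  return findings.reduce((sum, f) => sum + (WEIGHTS[f.severity] ?? 0), 0);
}

const exampleFindings = [
  { severity: "critical" },
  { severity: "high" },
  { severity: "high" },
  { severity: "medium" },
  { severity: "low" },
];

// 25 + 10 + 10 + 4 + 1 = 50
console.log(riskScore(exampleFindings)); // 50
```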

[1.3.0] - 2026-02-21

Added

/security clean command with 3-tier remediation (auto/semi-auto/manual)
auto-cleaner.mjs engine (16 fix operations, atomic writes, post-fix validation)
cleaner-agent for semi-auto proposals
--dry-run flag

[1.2.0] - 2026-02-19

Added

7 deterministic Node.js scanners (unicode, entropy, permissions, dependencies, taint, git forensics, network)
/security deep-scan command and --deep flag
Synthesizer agent for scanner JSON interpretation
Shared scanner library (scanners/lib/)
Demo fixture with 85-finding security assessment

Changed

OWASP coverage: LLM01 70->85%, LLM02 90->95%, LLM03 80->90%, LLM06 85->95%

[1.1.0] - 2026-02-19

Added

/security plugin-audit command
/security mcp-audit command
/security pre-deploy command
3 new report templates

Changed

OWASP coverage: LLM03 75% -> 80%

[1.0.0] - 2026-02-19

Added

Initial release
4 agents: skill-scanner, mcp-scanner, posture-assessor, threat-modeler
4 hooks: secret detection, destructive commands, supply chain, output verification
6 knowledge files (2,771 lines)
8 commands: security, scan, audit, posture, threat-model, plugin-audit, mcp-audit, pre-deploy
7 report templates
OWASP LLM Top 10 + Agentic AI Top 10 coverage
