Changelog

All notable changes to the LLM Security Plugin are documented in this file.

The format is based on Keep a Changelog.

[5.0.0] - 2026-04-06

Added

  • Prompt Injection Hardening (v5.0) — 8-session defense-in-depth overhaul driven by 7 research papers (2025-2026). Defense philosophy: broader detection + increased attack cost + longer monitoring windows + architectural constraints + honest documentation
  • MEDIUM advisory wiring: pre-prompt-inject-scan.mjs emits advisory for MEDIUM-severity obfuscation signals (leetspeak, homoglyphs, zero-width, multi-language). Never blocks. post-mcp-verify.mjs includes MEDIUM in injection scan advisory
  • Unicode Tag steganography: string-utils.mjs decodes U+E0001-E007F (invisible ASCII encoding). CRITICAL if decoded content matches injection patterns, HIGH for bare presence. Integrated into normalizeForScan() pipeline
  • BIDI override stripping — Removes directional override characters before injection scanning
  • Bash expansion normalization — New bash-normalize.mjs strips ${}, empty quotes, backslash splits before command matching. Applied in pre-bash-destructive.mjs and pre-install-supply-chain.mjs
  • Rule of Two enforcement: post-session-guard.mjs gains LLM_SECURITY_TRIFECTA_MODE=block|warn|off (default: warn). Block mode exits with code 2 for MCP-concentrated trifecta or sensitive path + exfiltration
  • 100-call long-horizon monitoring — Extended window alongside 20-call sliding window. Slow-burn trifecta detection (legs >50 calls apart = MEDIUM). Behavioral drift via Jensen-Shannon divergence on tool-class distribution
  • HITL trap detection — HIGH patterns for approval urgency, summary suppression, scope minimization. MEDIUM for cognitive load (injection buried in verbose output)
  • Sub-agent delegation tracking: post-session-guard.mjs tracks Task/Agent tool usage. Escalation-after-input advisory when delegation occurs within 5 calls of untrusted input (DeepMind Agent Traps category 4)
  • Natural language indirection — MEDIUM patterns for "fetch this URL and execute", "send this data to", "read ~/.ssh". Strict false-positive tests for benign phrasing
  • Hybrid attack patterns — P2SQL (SQL keywords in injection text), recursive injection (injection containing injection), XSS in agent context (<script>, javascript:, onerror=)
  • CaMeL-inspired data flow tagging — SHA-256 provenance tracking in post-session-guard.mjs. Hash of tool output → match against subsequent tool input. Linked data flows elevate trifecta severity
  • Adaptive red-team: attack-simulator.mjs --adaptive runs 5 mutation rounds per passing scenario: homoglyph substitution, encoding wrapping, zero-width injection, case alternation, synonym substitution. Rules in knowledge/attack-mutations.json
  • Knowledge base expansion: prompt-injection-research-2025-2026.md (7 papers), deepmind-agent-traps.md (6 categories, 43 techniques), attack-mutations.json (synonym tables). Attack scenarios expanded from 38 to 64 across 12 categories
  • Posture scanner expanded to 13 categories — New: Prompt Injection Hardening (cat 11), Rule of Two (cat 12), Long-Horizon Monitoring (cat 13). Checks for MEDIUM advisory, Unicode Tag detection, bash normalization, TRIFECTA_MODE, behavioral drift
  • Defense Philosophy section in CLAUDE.md — honest documentation of what v5.0 can and cannot do, based on joint paper findings (95-100% ASR against all tested defenses)
  • 8 new posture scanner tests (49 total for posture)
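
The Unicode Tag decoding listed above can be illustrated with a minimal sketch. It assumes each tag code point maps to ASCII by subtracting 0xE0000; `decodeUnicodeTags` and `hideInTags` are hypothetical names, not the functions in string-utils.mjs.

```javascript
// Hypothetical sketch: not the plugin's actual string-utils.mjs code.
const TAG_BASE = 0xe0000;   // Unicode Tags block offset
const TAG_START = 0xe0001;  // range quoted in the changelog entry
const TAG_END = 0xe007f;

// Split text into its visible part and the ASCII payload hidden in tag characters.
function decodeUnicodeTags(text) {
  let visible = '';
  let hidden = '';
  let tagCount = 0;
  for (const ch of text) {              // for...of iterates by code point
    const cp = ch.codePointAt(0);
    if (cp >= TAG_START && cp <= TAG_END) {
      tagCount += 1;
      const ascii = cp - TAG_BASE;
      // Only printable ASCII contributes to the decoded payload.
      if (ascii >= 0x20 && ascii <= 0x7e) hidden += String.fromCharCode(ascii);
    } else {
      visible += ch;
    }
  }
  return { visible, hidden, tagCount };
}

// Test helper: encode an ASCII string as invisible tag characters.
function hideInTags(ascii) {
  return [...ascii].map((c) => String.fromCodePoint(TAG_BASE + c.charCodeAt(0))).join('');
}
```

A caller could then scan `hidden` for injection patterns and treat any non-zero `tagCount` as a bare-presence signal.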

Changed

  • Posture scanner version updated to 5.0.0
  • Dashboard aggregator version updated to 5.0.0
  • Red-team scenarios expanded from 38 to 64 across 12 categories
  • Knowledge files count: 10 -> 13
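
For the behavioral drift check mentioned under 5.0.0, a rough sketch of Jensen-Shannon divergence over tool-class distributions follows. The tool-class labels and the `toolClassDrift` helper are assumptions for illustration; the changelog does not state how the plugin buckets tools or which threshold it applies.

```javascript
// Hypothetical sketch: class labels and helper names are illustrative only.

// Kullback-Leibler divergence in bits; terms with p[i] === 0 contribute nothing.
function klDivergence(p, q) {
  let sum = 0;
  for (let i = 0; i < p.length; i++) {
    if (p[i] > 0) sum += p[i] * Math.log2(p[i] / q[i]);
  }
  return sum;
}

// Jensen-Shannon divergence: symmetric, and bounded to [0, 1] with log base 2.
function jensenShannon(p, q) {
  const m = p.map((pi, i) => (pi + q[i]) / 2);
  return 0.5 * klDivergence(p, m) + 0.5 * klDivergence(q, m);
}

// Compare the tool-class mix of a baseline window against a recent window.
function toolClassDrift(baselineCalls, recentCalls) {
  if (baselineCalls.length === 0 || recentCalls.length === 0) return 0;
  const classes = [...new Set([...baselineCalls, ...recentCalls])];
  const dist = (calls) =>
    classes.map((c) => calls.filter((x) => x === c).length / calls.length);
  return jensenShannon(dist(baselineCalls), dist(recentCalls));
}
```

Identical windows give 0 and windows with no tool class in common give 1, so a drift advisory could fire when the value crosses some tuned threshold.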

[4.5.1] - 2026-04-04

Fixed

  • Cross-platform support (Windows/Linux). Replaced all Unix-only patterns: fileURLToPath() instead of import.meta.url.replace('file://', ''), path.dirname() instead of lastIndexOf('/'), native fetch() instead of curl subprocess (Node 18+), removed 2>/dev/null from shell commands, fixed tilde expansion regex for Windows backslash paths. 11 files changed, 782 tests pass.

[4.5.0] - 2026-04-04

Added

  • Attack simulation / red-team mode: scanners/attack-simulator.mjs runs 38 crafted attack scenarios across 7 categories against the plugin's own hooks. Data-driven: scenarios defined in knowledge/attack-scenarios.json, payloads assembled at runtime via fragment concatenation (avoids triggering hooks on source file). Categories: secrets (7), destructive (8), supply-chain (4), prompt-injection (6), pathguard (6), mcp-output (4), session-trifecta (3). CLI: node scanners/attack-simulator.mjs [--category <name>] [--json] [--verbose]. Library: import { loadScenarios, runScenario, resolvePayloads }
  • /security red-team command — attack simulation with category filter (--category secrets|destructive|...). Narrative report with per-category breakdown and defense score
  • knowledge/attack-scenarios.json — 38 red-team scenarios with placeholder payloads ({{MARKER}} syntax), resolved at runtime to actual attack strings
  • 31 new tests for attack simulator (unit + integration + CLI)
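
The {{MARKER}} placeholder idea can be sketched as below. The marker names, the fragment table, and `resolvePayloadSketch` are hypothetical; the real resolvePayloads export and the contents of attack-scenarios.json are not shown in this changelog.

```javascript
// Hypothetical sketch: marker names and fragments are illustrative only.
// Each value is joined from fragments so the full string never appears
// literally in a source file that the hooks might scan.
const MARKERS = {
  FAKE_AWS_KEY: ['AKIA', 'IOSFODNN7', 'EXAMPLE'].join(''), // AWS documentation example key
  DESTRUCTIVE_CMD: ['rm', ' -rf', ' /tmp/demo'].join(''),
};

// Replace {{NAME}} placeholders with runtime-assembled payloads.
// Unknown markers are left untouched so that typos stay visible.
function resolvePayloadSketch(template, markers = MARKERS) {
  return template.replace(/\{\{([A-Z0-9_]+)\}\}/g, (match, name) =>
    Object.hasOwn(markers, name) ? markers[name] : match);
}
```

Keeping scenarios as templates means the scenario file itself stays inert until a test run resolves it.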

[4.4.0] - 2026-04-03

Added

  • Cross-project security dashboard: scanners/dashboard-aggregator.mjs discovers all Claude Code projects under ~/ (depth 3) and ~/.claude/plugins/, runs posture-scanner on each, aggregates results. Machine grade = weakest link across all projects. Cache in ~/.cache/llm-security/dashboard-latest.json (24h staleness). CLI: node scanners/dashboard-aggregator.mjs [--no-cache] [--max-depth N]. Library: import { aggregate, discoverProjects }
  • /security dashboard command — machine-wide security overview with per-project grade table, sorted by grade (worst first). Shows cache status, total findings, and recommendations based on machine grade
  • 16 new tests for dashboard aggregator (discovery, aggregation, caching, grade logic)

[4.3.0] - 2026-04-03

Added

  • MCP description drift detection: scanners/lib/mcp-description-cache.mjs caches MCP tool descriptions in ~/.cache/llm-security/mcp-descriptions.json with 7-day TTL. Compares via Levenshtein distance — >10% change triggers advisory (OWASP MCP05 rug-pull). extractMcpServer() exported for server attribution
  • MCP-concentrated trifecta: post-session-guard.mjs now detects when all 3 lethal trifecta legs (input + access + exfil) originate from the same MCP server, elevating severity. Single compromised server pattern
  • Cumulative data volume tracking: post-session-guard.mjs tracks total output bytes per session, warns at 100KB (LOW), 500KB (MEDIUM), 1MB (HIGH) thresholds (OWASP ASI02)
  • Per-MCP-tool volume tracking: post-mcp-verify.mjs tracks cumulative output per MCP tool, warns when a single tool exceeds 100KB (OWASP ASI02, MCP03)
  • MCP drift integration in post-mcp-verify — checks MCP tool descriptions on every invocation against cached baseline, advisory on significant drift
  • 35 new tests: 16 for mcp-description-cache, 5 for post-mcp-verify drift/volume, 14 for post-session-guard MCP features
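
A minimal sketch of Levenshtein-based description drift follows. It assumes the ">10% change" rule means edit distance divided by the longer description's length; the changelog does not spell out the normalization, and `descriptionDrift` is a hypothetical name.

```javascript
// Hypothetical sketch: the normalization used by the plugin is an assumption.

// Classic Levenshtein edit distance using two rolling rows.
function levenshtein(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Flag drift when the edit distance exceeds `threshold` of the longer description.
function descriptionDrift(cached, current, threshold = 0.1) {
  const longest = Math.max(cached.length, current.length);
  const ratio = longest === 0 ? 0 : levenshtein(cached, current) / longest;
  return { ratio, drifted: ratio > threshold };
}
```

Under this assumption a one-character punctuation fix stays below the threshold, while a rewritten description that adds new behavior crosses it.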

[4.2.0] - 2026-04-03

Added

  • Supply chain re-check scanner: scanners/supply-chain-recheck.mjs (prefix SCR) periodically re-audits installed dependencies by parsing lockfiles (package-lock.json, yarn.lock, requirements.txt, Pipfile.lock). Checks against curated blocklists, OSV.dev batch API (/v1/querybatch) for known CVEs, and Levenshtein-based typosquat detection against top-packages knowledge base. Offline fallback: blocklist + typosquat checks run without network, INFO finding notes skipped CVE check. OWASP: LLM03, ASI04, AST06, MCP04
  • Shared supply chain data module: scanners/lib/supply-chain-data.mjs extracts blocklists (NPM/PIP/Cargo/Gem), helper functions, and OSV.dev API calls shared between the hook (pre-install-supply-chain.mjs) and the new scanner
  • /security supply-check command — standalone dependency re-audit with focused output. CLI wrapper: node scanners/supply-chain-recheck-cli.mjs <path>
  • SCR prefix added to all 4 OWASP maps (LLM, ASI, AST, MCP) in severity.mjs
  • Supply chain scanner integrated into scan-orchestrator (10th scanner, runs before toxic-flow)
  • Test fixtures: tests/fixtures/supply-chain/ with compromised and clean lockfiles for npm, pip, yarn, Pipfile
  • 30 new tests for supply-chain-recheck scanner and shared module

Changed

  • pre-install-supply-chain.mjs hook refactored to import blocklists and helpers from shared supply-chain-data.mjs module (reduced duplication by ~160 lines)

[4.1.0] - 2026-04-03

Added

  • Reference configuration generator: scanners/reference-config-generator.mjs generates Grade A security configuration based on posture scanner gaps. Detects project type (plugin/monorepo/standalone). Templates in templates/reference-config/. CLI: node scanners/reference-config-generator.mjs [path] [--apply]. Library: import { generate } from './reference-config-generator.mjs'
  • /security harden command — runs posture scanner, identifies gaps, generates settings.json (deny-first), CLAUDE.md security section, and .gitignore additions. Supports --dry-run (default) and --apply (writes with backup). Post-apply verification re-runs posture scanner to confirm improvement
  • Reference config templates: settings-deny-first.json, claude-md-security-section.md, gitignore-security.txt
  • 23 new tests for reference-config-generator (grade-a, grade-f, apply mode, project type detection)

[4.0.0] - 2026-04-03

Added

  • Deterministic posture scanner: posture-scanner.mjs replaces the Opus-based posture-assessor-agent for /security posture. 10 categories assessed in <50ms (was ~6 min with agent). Scanner prefix PST. Standalone CLI: node scanners/posture-scanner.mjs [path] → JSON stdout. Categories: Deny-First, Secrets, Path Guarding, MCP Trust, Destructive Blocking, Sandbox, Human Review, Plugin Sources, Session Isolation, Cognitive State Security. Reuses scanForInjection() and gradeFromPassRate() from shared libraries. Grade A/B/C/D/F with risk score, risk band, and verdict
  • PST prefix added to all 4 OWASP maps (LLM, ASI, AST, MCP) in severity.mjs
  • Test fixtures: tests/fixtures/posture-scan/grade-a-project/ (Grade A) and grade-f-project/ (Grade F)
  • 41 new tests for posture scanner (interface, grade-a, grade-f)

Changed

  • /security posture now uses deterministic scanner via Bash instead of spawning posture-assessor-agent. Instant results, zero token cost
  • /security audit runs posture scanner first for instant category data, then agents for narrative and skill/MCP analysis
  • Posture-assessor-agent retained for full audit narrative only

[3.1.1] - 2026-04-03

Audit remediation: 6 findings fixed, global settings hardened.

[3.0.0] - 2026-04-03

Public release. 8 development sessions from v2.5 to v3.0.

Added

  • Toxic flow analysis (v2.7.0) — 8th orchestrated scanner (toxic-flow-analyzer.mjs, prefix TFA) detecting lethal trifecta patterns: untrusted input + sensitive data access + exfiltration sink. Post-processing correlator consuming output from all prior scanners. Direct, cross-component, and project-level detection with mitigation downgrades. OWASP: ASI01, ASI02, ASI05
  • Runtime session guard (v2.7.1) — PostToolUse hook monitoring tool call sequences for lethal trifecta forming during a session. Sliding window (20 calls), per-session JSONL state in /tmp/, advisory warning (never blocks). Auto-cleanup after 24h
  • MCP runtime inspection (v2.8.0) — Standalone scanner (mcp-live-inspect.mjs, prefix MCI) connecting to running MCP stdio servers via JSON-RPC 2.0. Fetches live tool/prompt/resource lists, scans descriptions for injection patterns, detects tool shadowing across servers. 10s timeout per server. New /security mcp-inspect command. /security mcp-audit --live flag for combined static + live analysis
  • Auto update notifications (v2.8.1) — UserPromptSubmit hook checking for newer plugin versions against the public Forgejo repo (max 1x/24h, cached in ~/.cache/llm-security/). Disable: LLM_SECURITY_UPDATE_CHECK=off
  • Report diffing & baseline (v2.9.0) — diff-engine.mjs library for finding fingerprinting, fuzzy line matching (±3), and diff categorization (new/resolved/unchanged/moved). Scan orchestrator gains --baseline and --save-baseline flags. Baselines stored per target hash in reports/baselines/. New /security diff command
  • Continuous scanning (v2.9.1) — /security watch [path] [--interval 6h] using built-in /loop for recurring diff scanning. watch-cron.mjs standalone script for system cron/launchd with multi-target config and exit codes
  • Skill signature registry (v2.9.2) — skill-registry.mjs library for SHA-256 fingerprinting of normalized skill content, scan result caching (7-day staleness), and pattern search. New /security registry command. /security scan checks registry before full scan for instant results on known fingerprints
  • OWASP Skills Top 10 (v2.6.0) — New knowledge file owasp-skills-top10.md (AST01-AST10) with skill-specific threat definitions and mitigations
  • MEDIUM injection patterns (v2.6.0) — ~15 new patterns: base64 payloads, leetspeak, homoglyphs, multi-language mixing, markdown/HTML comment injection
  • 4-framework OWASP mapping (v2.6.0) — Full coverage of LLM Top 10, Agentic AI Top 10 (ASI), Skills Top 10 (AST), MCP Top 10 in severity.mjs
  • Architecture diagram (mermaid) in README
  • CHANGELOG.md
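
The 20-call sliding window used by the runtime session guard can be sketched roughly as follows. The `leg` labels on each call record are an assumption; how the plugin classifies tool calls into input, access, and exfiltration legs is not described here.

```javascript
// Hypothetical sketch: the call-record shape ({ tool, leg }) is illustrative only.
const TRIFECTA_LEGS = ['input', 'access', 'exfil'];

// Return true when all three legs appear within the last `windowSize` calls.
function trifectaInWindow(calls, windowSize = 20) {
  const recent = calls.slice(-windowSize);
  const seen = new Set(recent.map((call) => call.leg).filter(Boolean));
  return TRIFECTA_LEGS.every((leg) => seen.has(leg));
}
```

A leg that happened more than 20 calls ago falls out of this window, which is the gap the later 100-call long-horizon window is meant to cover.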

Changed

  • Scan orchestrator now runs 8 scanners (was 7) with TFA running last
  • Agent prompts updated with ASI/AST/MCP OWASP references
  • scanForInjection() returns { found, severity, patterns } instead of boolean
  • Self-scan suppressions updated from ~150 to ~190 (TFA self-referential findings added)
  • Plugin description updated to reference all 4 OWASP frameworks

Fixed

  • package.json version sync with plugin.json

[2.5.0] - 2026-04-02

Added

  • Pre-extraction indirection layer for remote scan defense. Remote scans pre-extract structured evidence via content-extractor.mjs and strip injection patterns BEFORE LLM agents see the content

[2.4.0] - 2026-04-01

Added

  • GitHub repo URL support for scan and plugin-audit. Clone to temp dir via git-clone.mjs, scan locally, clean up. --branch <name> flag for non-default branches

[2.3.0] - 2026-04-01

Added

  • PostToolUse expanded to ALL tools (was Bash-only). Scans Read, WebFetch, MCP, and all other tool output for indirect prompt injection
  • LLM_SECURITY_INJECTION_MODE env var: block (default), warn, off
  • Complementary Tools section in README (parry-guard, Lasso, Snyk)
  • CLAUDE.md poisoning documented as known limitation

Changed

  • Short output skip (<100 chars) for PostToolUse performance

[2.2.0] - 2026-04-01

Added

  • UserPromptSubmit hook blocking prompt injection in user input
  • Obfuscation decoding: unicode-escape, hex-escape, URL-encoding, base64 normalization
  • Shared injection-patterns.mjs module (21 critical + 8 high patterns)
  • PostToolUse indirect injection scanning in tool output (LLM01)
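
As an illustration of the obfuscation decoding step, the sketch below normalizes unicode-escape, hex-escape, and percent-encoded sequences, and separately decodes base64-looking runs. Function names, the 16-character minimum, and the printable-ASCII filter are assumptions, not the plugin's actual rules.

```javascript
// Hypothetical sketch: names and heuristics are illustrative only.

// Decode \uXXXX, \xXX and %XX sequences (ASCII-range %XX only; multi-byte
// UTF-8 percent-encoding is out of scope for this sketch).
function decodeEscapes(text) {
  const fromHex = (_, hex) => String.fromCharCode(parseInt(hex, 16));
  return text
    .replace(/\\u([0-9a-fA-F]{4})/g, fromHex)
    .replace(/\\x([0-9a-fA-F]{2})/g, fromHex)
    .replace(/%([0-9a-fA-F]{2})/g, fromHex);
}

// Return the plaintext of base64-looking runs that decode to printable ASCII.
function decodeBase64Runs(text) {
  const decoded = [];
  for (const token of text.match(/[A-Za-z0-9+\/]{16,}={0,2}/g) ?? []) {
    if (token.length % 4 !== 0) continue;   // not a complete base64 string
    try {
      const plain = atob(token);
      if (/^[\x20-\x7e\s]+$/.test(plain)) decoded.push(plain);
    } catch {
      // atob rejected the token: ignore it.
    }
  }
  return decoded;
}
```

Both outputs would then be fed to the same injection patterns as the raw text, so an encoded payload is matched in its decoded form.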

Changed

  • LLM01 coverage 83% -> 95%, LLM05 80% -> 83%

[2.1.0] - 2026-04-01

Added

  • 383 tests (was 177): full hook coverage (66 tests), auto-cleaner coverage (140 tests)
  • HTTPS install URL under fromaitochitta org

Fixed

  • Auto-cleaner import guard
  • Solo project setup (CONTRIBUTING.md removed)

Changed

  • Model defaults set to sonnet

[2.0.0] - 2026-03-31

Added

  • Open-source release: MIT LICENSE, SECURITY.md
  • Test suite (node:test, 177 tests)
  • pre-write-pathguard.mjs hook (8 path categories)
  • .gitignore, .editorconfig

[1.4.0] - 2026-02-21

Added

  • Unified risk scoring formula (25/10/4/1 weights)
  • Score-based verdicts and risk bands (Low -> Extreme)
  • OWASP categorization and A-F grading
  • Single unified-report.md template replacing 9 separate templates
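
One plausible reading of the 25/10/4/1 weights is a weighted sum over finding counts per severity. The mapping to critical/high/medium/low and the plain-sum form are assumptions; the changelog gives only the four numbers.

```javascript
// Hypothetical sketch: severity mapping and weighted-sum form are assumptions.
const SEVERITY_WEIGHTS = { critical: 25, high: 10, medium: 4, low: 1 };

// Sum weight * count over the known severities; missing counts are treated as 0.
function riskScore(counts) {
  return Object.entries(SEVERITY_WEIGHTS).reduce(
    (sum, [severity, weight]) => sum + weight * (counts[severity] ?? 0),
    0,
  );
}
```

With these weights a single critical finding outweighs two high findings, which matches the intent of score-based verdicts that prioritize the worst issues.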

[1.3.0] - 2026-02-21

Added

  • /security clean command with 3-tier remediation (auto/semi-auto/manual)
  • auto-cleaner.mjs engine (16 fix operations, atomic writes, post-fix validation)
  • cleaner-agent for semi-auto proposals
  • --dry-run flag

[1.2.0] - 2026-02-19

Added

  • 7 deterministic Node.js scanners (unicode, entropy, permissions, dependencies, taint, git forensics, network)
  • /security deep-scan command and --deep flag
  • Synthesizer agent for scanner JSON interpretation
  • Shared scanner library (scanners/lib/)
  • Demo fixture with 85-finding security assessment

Changed

  • OWASP coverage: LLM01 70->85%, LLM02 90->95%, LLM03 80->90%, LLM06 85->95%

[1.1.0] - 2026-02-19

Added

  • /security plugin-audit command
  • /security mcp-audit command
  • /security pre-deploy command
  • 3 new report templates

Changed

  • OWASP coverage: LLM03 75% -> 80%

[1.0.0] - 2026-02-19

Added

  • Initial release
  • 4 agents: skill-scanner, mcp-scanner, posture-assessor, threat-modeler
  • 4 hooks: secret detection, destructive commands, supply chain, output verification
  • 6 knowledge files (2,771 lines)
  • 8 commands: security, scan, audit, posture, threat-model, plugin-audit, mcp-audit, pre-deploy
  • 7 report templates
  • OWASP LLM Top 10 + Agentic AI Top 10 coverage