# Changelog

All notable changes to the LLM Security Plugin are documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).

## [7.3.1] - 2026-05-01

Stabilization patch. No behavior changes. Sets the public stance, tightens
documentation, and removes coherence drift so forkers and downstream
organizations get a consistent starting point.

### Added

- `CONTRIBUTING.md` — public fork-and-own guide. Explains why PRs are not
  accepted on the upstream repo, how to fork well (rename plugin, change
  security contact, preserve LICENSE, re-establish trust), what is welcome
  via issues, and the bar for inline-diff suggestions the maintainer may
  apply directly.
- `README.md` "Project scope" section — public statement of stabilization
  mode (effective 2026-05-01) plus an out-of-scope table naming what is
  fork-and-own territory: web dashboard, fleet policy server, runtime
  prompt firewall, IDE LSP, compliance PDF/DOCX pack, enterprise ticketing
  connectors, multi-tenancy, ML-based detectors, marketplace UI,
  SSO/SCIM/RBAC. Each row points at the commercial alternative
  (Snyk, Lakera, Vanta, Splunk SOAR, parry-guard, etc.).
- `package.json`: `bugs.url` field, `CONTRIBUTING.md` / `SECURITY.md` /
  `CHANGELOG.md` added to the `files` whitelist so npm-published artifacts
  ship with full project documentation.

### Changed

- `SECURITY.md` rewritten. Supported-versions table moves from `5.1.x`
  (stale since v6.0.0) to current reality: 7.3.x active, 7.0–7.2 best-effort,
  < 7.0 EOL. Adds explicit best-effort solo-project response timeline (7
  days ack, 14 days triage, 30 days fix for High/Critical), expands scope
  list to cover `bin/llm-security.mjs`, and notes that out-of-scope
  vulnerabilities (e.g., adaptive ML-based bypass) get an explanatory
  response rather than being silently ignored.
- `README.md` "Feedback & contributing" section now links to
  `CONTRIBUTING.md` and the new "Project scope" section.
- `package.json` URL fields corrected to point at the
  `ktg-plugin-marketplace` monorepo (the canonical home for this plugin).
  `homepage` now deep-links to `plugins/llm-security/`, `repository.url`
  uses the marketplace repo with a `directory: "plugins/llm-security"`
  field (npm convention for monorepo plugins), and `bugs.url` routes to
  the marketplace issue tracker. Earlier values referenced a standalone
  `claude-code-llm-security` repo that was never published — the plugin
  is distributed via the marketplace mechanism, not as an independent
  package.
- `CLAUDE.md` "Public Repository" section replaced with a "Distribution"
  section that documents the marketplace install path and removes the
  stale standalone-repo references.
- Scanner `VERSION` constants synced to plugin version. Previously
  `dashboard-aggregator.mjs` and `posture-scanner.mjs` reported `6.0.0`
  in scan output and SARIF, mismatching the actual plugin version.
  All three standalone scanners (`dashboard-aggregator`, `posture-scanner`,
  `ide-extension-scanner`) now report `7.3.1`.

### Fixed

- `tests/hooks/pre-compact-scan.test.mjs` size-cap timing test ceiling
  raised from 500 ms to 1000 ms. The 500 ms hard cap was a flake source
  on Intel Mac and CI runners under load, while the design target
  (documented in `CLAUDE.md`) remains <500 ms. The test now catches
  order-of-magnitude regressions without breaking on hardware/CI noise.

### Notes

- This is the first patch on the stabilization line. Future 7.3.x
  releases will be limited to bug and security fixes and small
  knowledge-base refreshes that fit the existing deterministic
  architecture. v8.0.0 remains scheduled as the deprecation cleanup
  for the env vars and `riskScoreV1` function deprecated in v7.3.0;
  see "Project scope" in `README.md` for the longer-term direction.
- Wave E (additional attack-simulator scenarios mentioned in the v7.3.0
  changelog as "deferred to v7.3.1") is now deferred indefinitely.
  Coverage remains at 72 scenarios. Forkers who want broader red-team
  coverage are encouraged to extend `knowledge/attack-scenarios.json`.

## [7.3.0] - 2026-05-01

Batch C release. Closes 12 implementation tasks (E3, E8-E14, 8.4, 8.6,
8.7, 8.10) across four execution waves: Wave A (bash evasion + decoder),
Wave B (supply chain + workflow scanner), Wave C (MCP cumulative drift),
Wave D (code quality). Wave E (9 new attack-simulator scenarios for the
new defenses) deferred to v7.3.1 — the defenses themselves are unit-tested
per wave; the deferred work adds attack-simulator regression coverage on
top.

### Added

- **E8 — T7 process-substitution normalization** in
  `scanners/lib/bash-normalize.mjs`. Collapses `<(cmd)` and `>(cmd)`
  process-substitution wrappers so the inner command name is surfaced
  to downstream destructive-command name matchers in
  `pre-bash-destructive.mjs`. Defends against split-command evasion.
  Nested wrappers handled up to depth 3. Single-quoted literals
  masked before T7 runs to avoid corrupting string content.

- **E10 — T9 eval-via-variable normalization** in
  `scanners/lib/bash-normalize.mjs`. Substitutes one-level variable
  assignments before destructive-name matching. One-level forward-flow
  only: chained-var attacks intentionally not followed (documented
  limit). Bare-form, curly-form, and double-quoted forms supported;
  single-quoted literals preserved.

- **E9 — T8 base64-pipe-shell BLOCK rule** in
  `hooks/scripts/pre-bash-destructive.mjs`. Direct match on the
  base64-decode-pipe-into-shell loader idiom — blocks the
  encoded-payload runner pattern that bypasses static name-matching by
  delivering the destructive command as encoded text.

- **E3 — rot13 layer for hidden-imperative comment-block detection**
  in `scanners/lib/injection-patterns.mjs`. The decoder is bounded
  in length to keep accidental rot13-look-alike short strings out of
  scope. Base64/hex/URL/HTML decoding is already done by
  `normalizeForScan`; the rot13 pass is the only genuinely new layer.

- **E12 — `.gitattributes` filter/diff/merge driver advisory** in
  `scanners/lib/git-clone.mjs`. New `scanGitAttributes(repoDir)`
  exported helper plus post-clone integration in the `clone` CLI
  branch — surfaces filter, diff, and merge driver directives as
  MEDIUM advisories so downstream consumers see the supply-chain
  surface that survives even a sandboxed clone.

- **E13 — npm scope-hopping typosquat detection** in
  `hooks/scripts/pre-install-supply-chain.mjs`. New shared
  `NPM_OFFICIAL_SCOPES` export from `scanners/lib/supply-chain-data.mjs`.
  When an install targets `@<scope>/<name>` where `<scope>` is unknown
  but `<name>` matches a popular unscoped package, the hook emits a
  MEDIUM advisory. Allowlist of legitimate scopes drives suppression.
  Configurable via `policy.json` `supply_chain.allowed_scopes`.

- **E11 — workflow-injection scanner** (`scanners/workflow-scanner.mjs`).
  Scans `.github/workflows/*.{yml,yaml}` and `.forgejo/workflows/*.{yml,yaml}`
  for dangerous expression interpolations inside `run:` step blocks.
  23-field canonical blacklist (GHSL Security Lab 17 + GlueStack-class
  6) targeting attacker-controlled fields. Sink-restricted: only
  `run:` steps are shell sinks; `if:`, `with:`, `env:`, `name:`,
  `runs-on:` are evaluated by the runner's expression engine, not the
  shell, and are suppressed. Severity matrix: privileged triggers →
  HIGH; semi-privileged → MEDIUM; safe fields (numeric / hex /
  fixed-string) → INFO. State machine extracted to
  `scanners/lib/workflow-yaml-state.mjs` for unit-level testability.
  Re-interpolation tracking — env-block bindings sourced from
  blacklisted fields, then read back inside `run:`, are flagged at
  MEDIUM as the Appsmith GHSL-2024-277 stealth pattern. Auth-bypass
  detection — `(github|forgejo).actor` compared against bot
  identities in `if:` conditions flagged at MEDIUM (Synacktiv 2023
  Dependabot spoofing class). New `WFL` prefix in
  `scanners/lib/severity.mjs` OWASP map. Registered in
  `scanners/scan-orchestrator.mjs`.

- **E14 — MCP cumulative-drift baseline** in
  `scanners/lib/mcp-description-cache.mjs`. Sticky `baseline` slot per
  tool plus a 10-event rolling `history` array (FIFO). Cumulative
  drift = `levenshtein(current, baseline.description) / max(|current|,
  |baseline|)`; when ratio ≥ `mcp.cumulative_drift_threshold`
  (default 0.25), `post-mcp-verify.mjs` emits a MEDIUM
  `mcp-cumulative-drift` advisory independent of the existing
  per-update >10% drift signal — both fire independently. Slow-burn
  rug-pulls that keep each update under the per-update threshold but
  cumulatively diverge from baseline are now caught. Baseline survives
  the 7-day TTL purge so detection persists across the full window.
  New `/security mcp-baseline-reset` slash command (plus
  `scanners/mcp-baseline-reset.mjs` CLI: `--list`, `--target <tool>`,
  or no-args clear-all) lets the user acknowledge a legitimate MCP
  server upgrade. New `LLM_SECURITY_MCP_CACHE_FILE` env var overrides
  the cache path for end-to-end testing without polluting the user's
  real `~/.cache/llm-security/mcp-descriptions.json`. Migration logic
  in `loadCache()` seeds `baseline` from existing entries on first
  read post-upgrade.

- **8.7 — env-var deprecation warnings** in
  `scanners/lib/policy-loader.mjs`. New `getPolicyValueWithEnvWarn(section,
  key, envVarName, defaultValue)` helper. Env-var still wins per
  existing Preferences, but when BOTH the env-var AND the
  `policy.json` key are explicitly set, the helper emits a single
  per-process stderr deprecation line pointing to v8.0.0 removal.
  Module-scoped `Set` dedupes per env-var name across call-sites.
  `DEFAULT_POLICY` gains `trifecta.escalation_window: 5` (closes the
  gap where `LLM_SECURITY_ESCALATION_WINDOW` had no `policy.json`
  equivalent). Wired through 4 hook call-sites:
  `pre-prompt-inject-scan`, `post-session-guard` (×2), and
  `audit-trail`. Env-only vars without `policy.json` equivalents are
  unchanged.

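The E14 cumulative-drift ratio can be illustrated with a minimal sketch. This is not the code in `mcp-description-cache.mjs`; the function names `levenshtein`, `cumulativeDrift`, and `exceedsDrift` are hypothetical, and only the formula and the 0.25 default come from the entry above. Lengths are measured in UTF-16 code units, which is an assumption.

```javascript
// Hypothetical sketch of the E14 cumulative-drift ratio; not the plugin's code.

// Two-row dynamic-programming Levenshtein distance over UTF-16 code units.
function levenshtein(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost  // substitution (or match)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// drift = levenshtein(current, baseline) / max(|current|, |baseline|)
function cumulativeDrift(current, baseline) {
  const longest = Math.max(current.length, baseline.length);
  return longest === 0 ? 0 : levenshtein(current, baseline) / longest;
}

// 0.25 is the documented default of mcp.cumulative_drift_threshold.
function exceedsDrift(current, baseline, threshold = 0.25) {
  return cumulativeDrift(current, baseline) >= threshold;
}
```

Because the ratio is always taken against the sticky baseline rather than the previous description, a series of small edits accumulates toward the threshold instead of resetting on each update.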
### Changed

- **8.10 — CLAUDE.md hooks count corrected** from `## Hooks (8)` to
  `## Hooks (9)`. Adds `pre-compact-scan.mjs` row to the hooks table
  (PreCompact — transcript scan before context compaction). The hook
  itself shipped in v6.2.0 but the count and table row drifted. New
  `Hooks count consistency` `describe` block in
  `tests/lib/doc-consistency.test.mjs` parses `hooks/hooks.json`,
  reads the CLAUDE.md `## Hooks (\d+)` header and the table rows,
  and asserts all three counts agree — locks in the fix and prevents
  future drift.

### Documentation

- **8.4 — `riskScoreV1` annotated `@deprecated`** in
  `scanners/lib/severity.mjs`. JSDoc explicitly tags v7.0.0 as the
  introduction of the v2 model and v8.0.0 as the removal target for
  v1, so library consumers see the deprecation in IDE tooling and
  not just in release notes. The function remains exported and
  functional for users who relied on it.

- **8.6 — sandbox-architecture rationale** in
  `docs/security-hardening-guide.md` §7. Documents why
  `lib/git-clone.mjs` and `lib/vsix-sandbox.mjs` remain separate
  rather than being collapsed into a single shared sandbox helper.
  Brief `Preferences` explicitly rejected the consolidation as
  premature abstraction over safety-critical code; the rationale is
  recorded so future maintainers see the deliberate decision.

### Tests

- 1665+ → 1777 (Wave A-D cumulative; ~+112 tests). Includes new
  files (`tests/scanners/bash-normalize-t7-t9.test.mjs`,
  `tests/lib/git-clone-gitattributes.test.mjs`,
  `tests/scanners/workflow-scanner.test.mjs`,
  `tests/lib/workflow-yaml-state.test.mjs`,
  `tests/scanners/mcp-baseline-reset.test.mjs`) plus extensions to
  `tests/lib/injection-patterns.test.mjs`,
  `tests/hooks/pre-bash-destructive.test.mjs`,
  `tests/hooks/pre-install-supply-chain.test.mjs`,
  `tests/scanners/scan-orchestrator.test.mjs`,
  `tests/lib/mcp-description-cache.test.mjs`,
  `tests/hooks/post-mcp-verify.test.mjs`,
  `tests/lib/severity.test.mjs`,
  `tests/lib/policy-loader.test.mjs`,
  `tests/lib/doc-consistency.test.mjs`. One pre-existing
  size-cap timing flake at `tests/hooks/pre-compact-scan.test.mjs`
  passes in isolation, fails sporadically under full-suite load —
  unchanged across Wave A-D, not a Batch C blocker.

### Notes

- **Wave E deferred (red-team coverage).** The plan called for 9 new
  attack-simulator scenarios covering every Wave A-D defense. The
  work was deferred from v7.3.0 because two of the scenarios test
  scanners (workflow-scanner, git-clone `scanGitAttributes`) that
  don't fit the existing hook-spawn model used by attack-simulator
  and would have required a new `scanner_test` execution mode.
  Tracked for v7.3.1. Defenses are unit-tested per wave; this is
  regression coverage on top of unit coverage, not the primary
  safety net.

- **Hooks runtime behavior unchanged for existing setups.** Every
  Wave A-D addition is either purely additive (new advisories at
  MEDIUM) or layered before existing detection (T7/T9 normalize
  before existing destructive-name matching; rot13 inside the
  existing decoder loop; cumulative-drift independent of per-update
  drift). Users who set neither the new `policy.json` keys nor the
  new env-vars see identical behavior.

## [7.2.0] - 2026-04-29

Batch B release. Closes the remaining critical-review B-tier scanner
defects (B3, B5, B6, B7), lands the v7.2.0 evasion-arsenal hardening
patches (E1, E4, E5, E7, E15, E16, E17, E18), unifies the v1→v2
risk-score formula across documentation surfaces, and ships 8 new
red-team scenarios (64 → 72) plus a polyglot fixture for the entropy
two-stage pipeline.

### Added

- **B6 destructuring/spread taint propagation** (`scanners/taint-tracer.mjs`).
  `extractAssignedVariable` now recognises `const { secret: userInput } = req.body`
  and `const [input, ...rest] = process.argv` — destructured and spread
  bindings carry their tainted source into downstream usage.
  `extractAssignedVariable` exported for direct unit testing.
  `+19 tests`.

- **B7 token-overlap typosquat fallback** (`scanners/lib/string-utils.mjs`,
  `scanners/dep-auditor.mjs`, `scanners/supply-chain-recheck.mjs`).
  New `tokenize` / `tokenOverlap` helpers + `TYPOSQUAT_SUSPICIOUS_TOKENS`
  list catch typosquats that Levenshtein distance misses
  (e.g. `chalk-color-utility` vs `chalk`). `+21 tests`.

- **E15 `.claude/agents/*.md` memory-poisoning glob** (`scanners/memory-poisoning-scanner.mjs`).
  Agent definitions are now scanned alongside `CLAUDE.md` and rules.
  New fixture + `+3 tests`.

- **E1 hidden-Unicode coverage extended to PUA-A and PUA-B**
  (`scanners/lib/string-utils.mjs`). `containsUnicodeTags` now flags
  U+F0000–U+FFFFD (Supplementary Private Use Area-A) and U+100000–U+10FFFD
  (Supplementary Private Use Area-B) in addition to the U+E0000 Tag block.
  PUA characters do not decode to ASCII (they have no standard mapping)
  but their presence is suspicious enough to emit a HIGH advisory.
  `+21 tests`.

- **E16 homoglyph fold before pattern matching**
  (`scanners/lib/string-utils.mjs`, `scanners/lib/injection-patterns.mjs`).
  New `foldHomoglyphs` (NFKC + targeted Cyrillic/Greek → Latin map)
  runs before every pattern match in `scanForInjection`. Attacks like
  `ignоre previous instructions` (Cyrillic `о`) now trigger the same
  CRITICAL pattern as the Latin form. ASCII fast-path keeps the helper
  zero-cost on plain text. `+27 tests`.

- **E17 configurable escalation window + 20-call MEDIUM advisory**
  (`hooks/scripts/post-session-guard.mjs`). The
  `LLM_SECURITY_ESCALATION_WINDOW` env-var now overrides the primary
  escalation-after-input window (default 5). A secondary 20-call
  MEDIUM advisory catches slow-burn variants outside the primary
  window. `+5 tests`.

- **E4 markdown link-title injection scan** (`hooks/scripts/post-mcp-verify.mjs`).
  Every `[text](url "title")` title is HTML-entity-decoded and run
  through `scanForInjection`. Bypassed the existing HTML-tag-gated
  checks pre-E4. `+3 tests`.

- **E5 SVG `<desc> / <title> / <metadata> / <foreignObject>` extractor**
  (`hooks/scripts/post-mcp-verify.mjs`). Adversarial text inside SVG
  containers is invisible in the rendered image but parsed by an
  agent reading the source. `+3 tests`.

- **E7 generalized HTML comment scan** (`hooks/scripts/post-mcp-verify.mjs`).
  Pre-E7 the `<!-- AGENT|AI|HIDDEN -->` keyword-restricted CRITICAL
  pattern fired only on marked comments. Now every `<!-- ... -->`
  body is decoded and scanned. The keyword pattern still fires
  (defense-in-depth). `+3 tests`.

- **8 new red-team scenarios** (`knowledge/attack-scenarios.json`).
  UNI-007/008 (E1 PUA-A/PUA-B), UNI-009 (E16 Greek-Latin homoglyph
  fold blocks), MCP-005 (E4), MCP-006/007 (E5 desc/foreignObject),
  MCP-008 (E7), TRI-004 (E17 escalation-after-input).
  `attack-simulator.mjs` baseline: 64 → 72, 100 % pass.

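The E16 homoglyph fold described above can be sketched roughly as follows. This is an illustration only: the `HOMOGLYPHS` map and the name `foldHomoglyphsSketch` are hypothetical, and the real `foldHomoglyphs` map is likely larger. The sketch assumes the fold is NFKC normalization followed by a per-character lookup, with an ASCII-only fast path.

```javascript
// Hypothetical confusables map: a handful of Cyrillic/Greek letters that
// render like Latin ones. The plugin's actual map is not reproduced here.
const HOMOGLYPHS = new Map([
  ["\u0430", "a"], // Cyrillic а
  ["\u0435", "e"], // Cyrillic е
  ["\u043E", "o"], // Cyrillic о
  ["\u0440", "p"], // Cyrillic р
  ["\u0441", "c"], // Cyrillic с
  ["\u03BF", "o"], // Greek ο
]);

const ASCII_ONLY = /^[\x00-\x7F]*$/;

function foldHomoglyphsSketch(text) {
  // Fast path: pure-ASCII input cannot contain homoglyphs, so return it as-is.
  if (ASCII_ONLY.test(text)) return text;
  let out = "";
  // NFKC folds compatibility forms (e.g. fullwidth letters) to their base form;
  // the map then handles look-alikes that NFKC leaves untouched.
  for (const ch of text.normalize("NFKC")) {
    out += HOMOGLYPHS.get(ch) ?? ch;
  }
  return out;
}
```

Running pattern matching on the folded string rather than the raw input is what lets one Latin-script pattern cover its mixed-script variants.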
### Changed

- **B5 entropy two-stage pipeline** (`scanners/entropy-scanner.mjs`).
  New `classifyFileContext(absPath, lines)` returns
  `'shader-dominant' | 'markup-dominant' | 'code-dominant' | 'mixed'`,
  keyed off file extension with a content-density fallback for
  code-extension files (≥50 % sampled lines matching GLSL/inline-markup
  → downgrade to `mixed`). `isFalsePositive` now accepts the context
  and gates rules 11-13 (GLSL / CSS-in-JS / inline-markup
  line-proximity) on `context !== 'code-dominant'`. Polyglot `.ts`
  files with embedded GLSL blocks no longer suppress credentials
  adjacent to shader keywords (the v7.0.0 false-negative class).
  Conservative defaults preserve existing rule-11 / 12 / 13 behaviour
  for the single-line `.js` / `.jsx` test fixtures. New fixture
  `tests/fixtures/entropy/polyglot-ts-with-glsl.ts`. `+3 tests`.

- **E18 entropy rule 18 — markdown-image CDN-aware + secret pre-check**
  (`scanners/entropy-scanner.mjs`). Pre-E18, every
  `![alt](https://...)` line was suppressed regardless of host or query.
  Now suppression requires (host matches `MARKDOWN_IMAGE_CDN_HOSTS`
  allowlist) AND (no secret-shaped token in query). Non-CDN hosts and
  CDN hosts carrying `?token=…` / `?api_key=…` / AWS / GitHub / npm
  prefixes fall through to entropy classification. `+4 tests`.

- **v1 → v2 risk-formula constants unified across docs**
  (`commands/scan.md`, `commands/audit.md`, `agents/mcp-scanner-agent.md`,
  `agents/posture-assessor-agent.md`). The four files referenced the
  legacy v1 `score >= 61` / `score >= 21` / `Critical × 25` constants;
  authoritative implementation in `scanners/lib/severity.mjs` has been
  v2 (`BLOCK ≥65`, `WARNING ≥15`, severity-dominated log-scaled tiers)
  since v7.0.0. `tests/lib/doc-consistency.test.mjs` adds a guard so
  these surfaces cannot drift back. `+28 tests`.

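The E18 decision (suppress only when the host is allowlisted and the query carries no secret-shaped token) can be sketched as below. Everything here is an assumption made for illustration: the host list, the regexes, and the function name `shouldSuppressMarkdownImage` are hypothetical and do not reproduce `MARKDOWN_IMAGE_CDN_HOSTS` or the scanner's real secret patterns.

```javascript
// Hypothetical allowlist; the real MARKDOWN_IMAGE_CDN_HOSTS is not shown here.
const CDN_HOSTS = new Set(["cdn.example.com", "images.example.net"]);

// Illustrative secret shapes: token-like query parameters and a few
// well-known credential prefixes (AWS access key ID, GitHub PAT, npm token).
const SECRET_PARAM = /(^|[?&])(token|api_key)=/i;
const SECRET_PREFIX = /(AKIA[0-9A-Z]{16}|ghp_[A-Za-z0-9]{36}|npm_[A-Za-z0-9]{36})/;

// Returns true when a markdown-image line may be skipped by the entropy pass.
function shouldSuppressMarkdownImage(line) {
  const match = /!\[[^\]]*\]\((\S+?)(?:\s+"[^"]*")?\)/.exec(line);
  if (!match) return false; // not a markdown image line

  let url;
  try {
    url = new URL(match[1]);
  } catch {
    return false; // relative or malformed URL: let entropy classification decide
  }

  if (!CDN_HOSTS.has(url.hostname)) return false; // non-CDN host falls through
  const hasSecret = SECRET_PARAM.test(url.search) || SECRET_PREFIX.test(url.search);
  return !hasSecret; // CDN host with a secret-shaped query also falls through
}
```

Returning `false` means only that the line is handed on to normal entropy classification, not that it is reported.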
### Documentation

- **B3 `info` severity is scoring-inert** (`scanners/lib/severity.mjs` JSDoc,
  `CLAUDE.md`). Documents the long-standing implementation: `info`
  findings appear in OWASP aggregates but contribute zero to
  `risk_score`, `verdict`, and `riskBand`. `+1 anchor test`.

### Tests

- **1522 → 1665+** (Wave 1 +29, Wave 2 +43, Wave 3 +53, Wave 4 +9,
  Wave 5 +7, Wave 6 attack scenarios). All green except the
  documented `pre-compact-scan` perf-flake (passes 6/6 in isolation,
  fluctuates around the 500 ms ceiling under full-suite parallelism).
  `attack-simulator`: 64 → 72 scenarios, 100 % pass.

### Notes

- E15 (`.claude/agents/*.md` glob) and E18 (entropy rule 18 CDN
  allowlist) are scanner-only — they have unit / integration
  coverage in their respective scanner test files and no
  `attack-simulator.mjs` scenario.

## [7.1.1] - 2026-04-29

Patch release. Closes the narrative-coherence gap that survived v7.0.0:
the severity-dominated risk score corrected the numbers, but the agent
prompt continued to emit raw signals and walk them back as
"false positive" in prose, producing whiplash in the rendered report.
v7.1.1 makes severity assignment context-first at the prompt level and
adds a structural counter for suppressed signals.

### Fixed

- **Agent prompt context-first severity** (`agents/skill-scanner-agent.md`).
  New Step 2.5 mandates that every signal has exactly one disposition —
  suppressed (counted only) or reported (full finding) — with the split
  happening before severity is assigned. The phrases "false positive",
  "legitimate framework", and "no action required" are forbidden in
  finding-body text and reserved for the new `## Suppressed Signals`
  section. Verdict Logic section was also updated to reference v2 tiers
  and cutoffs from `severity.mjs` (BLOCK ≥65, WARNING ≥15) — replaces
  the stale v1 sum-and-cap formula that had been left in place after
  the v7.0.0 numeric overhaul.
- **Template v1 → v2 risk constants** (`templates/unified-report.md`).
  HTML-comment header at lines 55-66 now describes the v2 tiers and
  cutoffs the engine has been using since v7.0.0. Adds an
  `### Narrative Audit` block inside Executive Summary surfacing
  `summary.narrative_audit.suppressed_findings.{count, by_category}` for
  reviewer transparency. The block does NOT affect verdict computation.

### Added

- **`tests/scanners/skill-scanner-narrative.test.mjs`** — 11 assertions
  against `tests/fixtures/skill-scan/hyperframes-like/`. Covers
  deterministic content-extractor (exactly 1 HIGH HITL trap, ≥ 2
  framework env-var refs, has_injection true on any signal,
  has_critical_injection false), entropy scanner (calibration block
  present, ≤ 1 finding after suppression), inline co-monotonicity
  guard (`{ high: 1 }` → WARNING / High), and prompt-contract static
  assertions on `agents/skill-scanner-agent.md` and
  `templates/unified-report.md`.
- **`tests/fixtures/skill-scan/hyperframes-like/`** — synthetic skill
  with HTML5 canvas / CSS keyframes / inline SVG data URI noise plus
  exactly one genuine HITL trap signal. Committed (not gitignored).
  `.llm-security-ignore` uses the canonical `SCANNER:glob` format
  (`ENT:**/*.md`).

### Tests

- 1511 → 1522 tests (adds 11 new). Co-monotonicity sweep at
  `tests/lib/severity.test.mjs:252-303` unchanged and green.

### Why

Hyperframes.com re-test on 2026-04-19 produced `risk_score 20 / WARNING /
1 HIGH` numerically (correct after v7.0.0) but the agent listed 8
findings in prose and walked 6 back as "false positive". v7.1.1 closes
the structural gap that allowed this: severity is assigned ONCE,
context-first, and suppressed signals are categorical telemetry rather
than free-text walk-backs.

### Out of scope (flagged for Batch B)

- `commands/scan.md:113-114` retains the v1 risk formula and acts as a
  third source of truth alongside agent prompt and severity.mjs. Will
  be unified in v7.2.0.

## [7.1.0] - 2026-04-29

Patch release closing the highest-impact items from the v7.0.0 adversarial review
(`docs/critical-review-2026-04-20.md`, grade B-). Bug-fixes plus an honesty-sweep on
documentation language. No new features and no behavioral changes outside the listed
fixes.

### Fixed

- **Pathguard regex hole — `.env.*.*.*` could be written without blocking** (`hooks/scripts/pre-write-pathguard.mjs`). The old `ENV_PATTERNS` only matched a single dotted segment after `.env`, so `.env.production.local.backup`, `.env.prod.local.bak`, etc. slipped through. Replaced with `/[\\/]\.env(\.[A-Za-z0-9._-]+)*$/` covering arbitrary multi-segment suffixes. `.envrc` continues to be allowed. Commit `751f119`. (Critical-review B1.)
- **Distributed trifecta in BLOCK mode only warned** (`hooks/scripts/post-session-guard.mjs`). The previous block-gate required *both* `LLM_SECURITY_TRIFECTA_MODE=block` *and* a "concentrated" or "sensitive-path" qualifier, so a trifecta whose three legs landed on different MCP servers without a sensitive path was advisory-only. Removed the AND-gate; block mode now blocks any detected trifecta. Commit `36be963`. (Critical-review B2.)
- **JSDoc/CHANGELOG arithmetic for `riskScore({critical: 4})`** (`scanners/lib/severity.mjs:23`, `CHANGELOG.md` v7.0.0 tier description). The actual computation has always been `70 + log2(5)*10 = 93.22 → round → 93`; only the docs said `90`. Fixed; pin test added. (Critical-review B4.)

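The behaviour of the replacement pathguard pattern can be checked with a short sketch. The regex is the one quoted in the B1 entry above; the wrapper `isProtectedEnvPath` and the sample paths are hypothetical and are not taken from `pre-write-pathguard.mjs`.

```javascript
// The pattern quoted in the B1 entry above.
const ENV_PATTERN = /[\\/]\.env(\.[A-Za-z0-9._-]+)*$/;

// Hypothetical wrapper: true when the path ends in `.env` or `.env.<suffix...>`.
function isProtectedEnvPath(filePath) {
  return ENV_PATTERN.test(filePath);
}

// Illustrative sample paths (POSIX and Windows separators).
const samples = [
  "/app/.env",
  "/app/.env.production",
  "/app/.env.production.local.backup",
  "C:\\app\\.env.prod.local.bak",
  "/app/.envrc",
  "/app/src/env.js",
];

for (const p of samples) {
  console.log(`${isProtectedEnvPath(p) ? "BLOCK" : "ALLOW"}  ${p}`);
}
```

Note that the pattern requires a path separator before `.env`, so a bare relative `.env` would not match on its own; whether the hook resolves paths to absolute form first is not stated in this changelog.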
### Changed

- **Honesty-sweep on documentation language** (`CLAUDE.md`, `commands/ide-scan.md`, `knowledge/mitigation-matrix.md`, `docs/security-hardening-guide.md`). Critical-review §9 flagged a set of overclaim phrasings; rewritten while preserving accurate underlying claims:
  - "Trustworthy scoring (BREAKING)" → "Severity-dominated risk scoring (v2 model, BREAKING)"
  - "Context-aware entropy scanner" → "Rule-based entropy scanner with file-extension skip, 8 line-level suppression rules, and configurable policy"
  - "1487 tests" → "1511 unit and integration tests; mutation-testing coverage not published"
  - "Fully Schrems II compatible" → "Schrems II compatible in default offline mode. Optional OSV.dev enrichment is a separate compliance consideration"
  - "Rule of Two enforcement" → "Rule of Two detection (configurable; default warn; blocks on high-confidence trifectas in opt-in `block` mode)"
  - "Hardened ZIP extractor" → suffix " — no fuzz-testing results published to date"
  - "defense-in-depth" → preserved, but quantified in `docs/security-hardening-guide.md` §4: "three independent detection layers with documented bypass classes"
- **CaMeL claims toned down** (`hooks/scripts/post-session-guard.mjs:646`, `CLAUDE.md:184`). Implementation is opportunistic byte-matching of truncated output fingerprints (first 200 bytes, SHA-256/16-hex tag) — trivially bypassed by mutation, summarisation, or re-encoding. Renamed framing from "CaMeL-inspired data-flow tagging (SHA-256 provenance tracking)" to "output fingerprint matching (inspired by CaMeL but not a CaMeL capability-tracking implementation)". (Critical-review B8.)
- **Plugin version:** `7.0.0 → 7.1.0` across `package.json`, `.claude-plugin/plugin.json`, `scanners/ide-extension-scanner.mjs` (`VERSION`), README badge, CLAUDE.md header, marketplace root README. Test count `1487 → 1511` in marketplace root README.

### Tests

- **+8 tests for B1 pathguard** (`tests/hooks/pre-write-pathguard.test.mjs`): 6 multi-segment BLOCK + 1 `.envrc` ALLOW + 1 sentinel.
- **+1 test for B2 distributed trifecta** (`tests/hooks/post-session-guard.test.mjs`): three legs from different sources blocked under `block` mode.
- **+15 sweep tests + 1 anchor test for verdict/riskBand co-monotonicity** (`tests/lib/severity.test.mjs`): asserts `(verdict, riskBand)` agree under v7.0.0 contract for representative count vectors. Catches future drift between scoring tiers, verdict cutoffs, and riskBand cutoffs. Anchor test pins `riskScore({critical: 4}) === 93` so doc/code drift fails loudly.
- **Total: 1511 tests** (was 1487). All green.

### Why

- Pathguard and trifecta-block bugs were live security holes — both fixed at the
  hook level so users on the default install get the fix automatically.
- The honesty-sweep is a deliberate response to the critical-review CISO-perspective
  (§F): "Would a CISO install this?" — overclaim language was identified as a
  blocker for regulated environments. Toning it down does not weaken the actual
  defenses; it lets users trust the documentation.

## [7.0.0] - 2026-04-19

### BREAKING CHANGES
- **Risk-score formula rewritten** (`scanners/lib/severity.mjs`). The v1 sum-and-cap formula (`critical*25 + high*10 + medium*4 + low*1`, capped at 100) collapsed every non-trivial scan to 100/Extreme regardless of actual risk distribution. v2 is severity-dominated and log-scaled within tier:
  - Critical present → 70–95 (1=80, 2=86, 4=93, 10=95)
  - High only → 40–65 (1=48, 5=60, 17=65)
  - Medium only → 15–35 (1=20, 5=28, 50=33)
  - Low only → 1–11 (1=4, 10=11)
  - None → 0
  Verdict cutoffs realigned to new bands: `BLOCK` if critical ≥1 or score ≥65, `WARNING` if high ≥1 or score ≥15. Legacy v1 formula kept as `riskScoreV1()` for reference only. CI pipelines with `--fail-on` thresholds may need recalibration — see `docs/security-hardening-guide.md` §6.
- **Verdict/band cutoffs aligned for co-monotonicity.** Old cutoffs (BLOCK ≥61, WARNING ≥21) could produce "BLOCK / Medium band" or "ALLOW / High band" contradictions. New cutoffs (65, 15) are locked to the v2 `riskBand()` boundaries.

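The critical tier and the verdict cutoffs above can be reproduced with a small sketch. Only the critical-tier arithmetic is grounded: the 7.1.0 entry quotes `70 + log2(5)*10` for four criticals, which generalizes to `70 + log2(n + 1) * 10`, and the 95 ceiling comes from the tier range. The exact formulas for the High, Medium, and Low tiers are not stated in this changelog and are deliberately left out. The names `criticalTierScore` and `verdictSketch` are hypothetical and are not exports of `severity.mjs`.

```javascript
// Critical-tier score, reconstructed from the arithmetic quoted in this
// changelog: 70 + log2(n + 1) * 10, rounded, capped at the tier ceiling (95).
function criticalTierScore(criticalCount) {
  if (!Number.isInteger(criticalCount) || criticalCount < 1) {
    throw new RangeError("expects at least one critical finding");
  }
  return Math.min(95, Math.round(70 + Math.log2(criticalCount + 1) * 10));
}

// Verdict cutoffs as documented for v2: BLOCK if critical >= 1 or score >= 65,
// WARNING if high >= 1 or score >= 15, otherwise ALLOW.
function verdictSketch({ critical = 0, high = 0 } = {}, score = 0) {
  if (critical >= 1 || score >= 65) return "BLOCK";
  if (high >= 1 || score >= 15) return "WARNING";
  return "ALLOW";
}

for (const n of [1, 2, 4, 10]) {
  const score = criticalTierScore(n);
  console.log(`${n} critical -> score ${score}, verdict ${verdictSketch({ critical: n }, score)}`);
}
```

The logarithm is what keeps additional findings of the same severity from saturating the scale the way the v1 sum-and-cap formula did.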
### Added
|
||
- **Context-aware entropy scanner** (`scanners/entropy-scanner.mjs`). Skip-lists and line-level rules drastically reduce false positives in shader/CSS/HTML/SQL-heavy codebases:
|
||
- File-extension skip: `.glsl, .frag, .vert, .shader, .wgsl, .css, .scss, .sass, .less, .svg` + compound `.min.js, .min.css, .map`
|
||
- Line-level rules 11–18 in `isFalsePositive()`: GLSL keywords (`uniform`, `vec3`, `texture2D`...), CSS-in-JS templates (`styled.`), inline `<svg>` markup, ffmpeg `filter_complex` syntax, browser `User-Agent` strings, SQL DDL on dedicated lines (`^\s*(SELECT|INSERT|UPDATE|DELETE|CREATE|...)`), `throw new Error(\`…\`)` templates, markdown image syntax with external URLs (`` — common in JSON content indexes)
|
||
- Scanner envelope gains `calibration` block: `files_skipped_by_extension`, `files_skipped_by_path`, effective `thresholds`, and `policy_source` (`'default' | 'policy.json'`)
|
||
- **Policy-driven entropy configuration** — `.llm-security/policy.json` `entropy` section accepts:
  - `thresholds.{critical,high,medium}.{entropy,minLen}` — override defaults per project
  - `suppress_extensions: string[]` — additional file extensions to skip
  - `suppress_line_patterns: string[]` — user-defined regexes for line suppression
  - `suppress_paths: string[]` — substring match against `relPath` to skip entire paths (e.g., `"vendored/"`)
- **DEP typosquat allowlist expansion** (`knowledge/typosquat-allowlist.json`). 22 npm + 5 PyPI entries for short-name modern tools that tripped Levenshtein detection on nearly every real codebase:
  - npm: `knip`, `oxlint`, `tsx`, `nx`, `rimraf`, `glob`, `tar`, `zod`, `ky`, `ow`, `esm`, `ip`, `qs`, `url`, `prettier`, `vitest`, `vite`, `rollup`, `swc`, `turbo`, `bun`, `deno`
  - PyPI: `uv`, `ruff`, `rich`, `typer`, `anyio`
- **Synthesizer "Scan Calibration" section** (`agents/deep-scan-synthesizer-agent.md`). Heuristic: omit if <5% files skipped, flag prominently if >80% skipped by path (signals over-aggressive user policy). Agent instructed to NEVER override scanner verdict with narrative opinion.
- **26 new unit tests** (`tests/scanners/entropy-context.test.mjs`): A. File-extension skip (4), B. Line-level rules 11–18 (10), C. Policy overrides (3); plus expanded `tests/lib/severity.test.mjs` with v2 scoring/band/verdict tables (70 tests total, was 52). **Total: 1487 tests (was 1461).**
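
To show why short tool names trip Levenshtein-based typosquat detection and how an allowlist suppresses them, here is a small sketch. The `typosquatCandidate` helper, the `popular` list, and the distance limit of 2 are illustrative assumptions, not the plugin's actual data or API; only the allowlisted names come from the entry above.

```javascript
// Classic dynamic-programming Levenshtein distance (two-row variant).
function levenshtein(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical check: flag names within `maxDistance` of a popular package
// unless the name is itself popular or explicitly allowlisted.
function typosquatCandidate(name, popular, allowlist, maxDistance = 2) {
  if (allowlist.has(name) || popular.includes(name)) return null;
  return popular.find((p) => levenshtein(name, p) <= maxDistance) ?? null;
}

const popular = ['tsc', 'next', 'globby'];        // illustrative only
const allowlist = new Set(['tsx', 'nx', 'glob']); // subset of the entries above

console.log(typosquatCandidate('tsx', popular, allowlist));   // null (allowlisted)
console.log(typosquatCandidate('tsx', popular, new Set()));   // tsc (distance 1)
console.log(typosquatCandidate('nextt', popular, allowlist)); // next
```

With names of only two or three letters, almost any other short name is within distance 2, which is why an explicit allowlist is a more practical fix than tightening the threshold.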
### Changed
- `tests/lib/output.test.mjs:243` — "1 critical = score 80" under v2 (was 25 under v1).
- `scanners/lib/file-discovery.mjs` — `TEXT_EXTENSIONS` now includes `.sass` and GPU shader source extensions (`.glsl, .frag, .vert, .shader, .wgsl`) so these files are discovered and explicitly counted as skipped by the entropy scanner instead of invisibly filtered out.
- Plugin version: `6.6.0 → 7.0.0` across `package.json`, `.claude-plugin/plugin.json`, `scanners/ide-extension-scanner.mjs` (`VERSION`), README badge, CLAUDE.md header, marketplace root README.
### Why
- **Real-world scan on `hyperframes.com` produced `BLOCK / Extreme / 100` with ~70% noise** (shader strings, CSS gradients, bundled JS, Levenshtein false positives). A scanner that cries "extreme" on every project destroys its own credibility — users learn to ignore findings, so genuine threats slip past.
- **Trustworthiness comes from calibration, not from detecting everything.** v7.0.0 accepts that some detection heuristics are noisy in context (entropy on shaders, typosquat on 2–3 letter tool names) and gives users both built-in suppression and policy-driven override controls.
- **Verdict/score/band co-monotonicity fixed.** A user can now correctly reason: "HIGH band → WARNING verdict" without reading the source. The v1 cutoffs allowed a mid-High score (42) to produce ALLOW and a low-Medium score (22) to produce WARNING.
## [6.6.0] - 2026-04-18
### Added
- **JetBrains/IntelliJ plugin scanning.** `/security ide-scan` extends beyond VS Code forks to cover the JetBrains IDE family: IntelliJ IDEA, PyCharm, GoLand, WebStorm, RubyMine, PhpStorm, CLion, DataGrip, RustRover, Rider, Aqua, Writerside, Android Studio. Fleet and Toolbox are intentionally excluded (different plugin model, out of scope)
- **OS-aware JetBrains plugin discovery** in `lib/ide-extension-discovery.mjs` — macOS `~/Library/Application Support/JetBrains/<IDE><version>/plugins/`, Windows `%APPDATA%\JetBrains\...`, Linux `~/.config/JetBrains/...`. Regex excludes Fleet/Toolbox
- **Zero-dep `META-INF/plugin.xml` + `META-INF/MANIFEST.MF` parsers** in `lib/ide-extension-parser-jb.mjs` with nested-jar extraction for the common `<plugin-root>/lib/*.jar → META-INF/plugin.xml` layout
- **7 JetBrains-specific checks** in `runJetBrainsChecks`: `checkThemeWithCodeJB`, `checkBroadActivationJB` (`application-components`), `checkPremainClassJB` (HIGH — javaagent retransform), `checkNativeBinariesJB`, `checkDependsChainJB` (long mandatory `<depends>` = supply-chain pressure), `checkTyposquatJB` (Levenshtein vs top JetBrains plugins), `checkShadedJarsJB` (advisory — many bundled jars)
- **JetBrains Marketplace URL fetch.** Supports `https://plugins.jetbrains.com/plugin/<numericId>-<slug>` (metadata resolves numericId → xmlId, then downloads) and `https://plugins.jetbrains.com/plugin/download?pluginId=<xmlId>[&version=<v>]` (direct download). Host allowlist: `plugins.jetbrains.com` only
- **`fetchJetBrainsPlugin`** in `lib/vsix-fetch.mjs` with the same safety envelope as VSIX fetch (50 MB cap, 30 s timeout, SHA-256, manual redirect host whitelist)
- **`lib/jetbrains-fetch-worker.mjs`** — sub-process worker mirroring the VSIX worker's JSON-line IPC. Shares the sandbox primitives through parameterized `buildSandboxedWorker(dirs, workerPath)`
- **`.kt`, `.groovy`, `.scala`** added to `scanners/taint-tracer.mjs` `CODE_EXTENSIONS` so Kotlin/Groovy/Scala plugin sources are covered by taint analysis
- **Knowledge additions:** `knowledge/jetbrains-marketplace-api-notes.md`, expanded `knowledge/ide-extension-threat-patterns.md` with JetBrains sections, seeded `knowledge/top-jetbrains-plugins.json` (no longer a stub) with `loadJetBrainsBlocklist` helper
- **8 new test files / suites** covering JetBrains data, parsers, discovery, checks, URL fetch (unit + integration), end-to-end scan against a real JetBrains-layout fixture tree, plus a deterministic fixture-jar builder (`tests/helpers/build-jetbrains-fixtures.mjs`) that produces byte-identical reproducible jars. Total: 1461 tests (was 1352)
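
The two JetBrains Marketplace URL shapes and the single-host allowlist described above can be sketched as a small classifier. This is an assumption-laden illustration: `classifyJetBrainsUrl` and its return shape are hypothetical, the HTTPS requirement is assumed by analogy with the VSIX fetcher, and the plugin ID in the usage line is made up.

```javascript
// Hypothetical classifier for the two URL shapes listed above.
// Returns null for anything that is not HTTPS on plugins.jetbrains.com.
function classifyJetBrainsUrl(input) {
  let url;
  try {
    url = new URL(input);
  } catch {
    return null; // not a parseable URL
  }
  if (url.protocol !== 'https:' || url.hostname !== 'plugins.jetbrains.com') return null;

  // Direct download: /plugin/download?pluginId=<xmlId>[&version=<v>]
  if (url.pathname === '/plugin/download') {
    const xmlId = url.searchParams.get('pluginId');
    if (!xmlId) return null;
    return { kind: 'download', xmlId, version: url.searchParams.get('version') };
  }

  // Plugin page: /plugin/<numericId>-<slug>
  const page = /^\/plugin\/(\d+)-([^/]+)\/?$/.exec(url.pathname);
  if (page) return { kind: 'page', numericId: Number(page[1]), slug: page[2] };

  return null;
}

console.log(classifyJetBrainsUrl('https://plugins.jetbrains.com/plugin/12345-example'));
```

Comparing `hostname` exactly, rather than testing for a substring, avoids accepting look-alike hosts such as `plugins.jetbrains.com.evil.example`.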
### Changed
- `buildSandboxedWorker(dirs)` → `buildSandboxedWorker(dirs, workerPath)` — parameterized so the same sandbox wrapper is reused for VSIX and JetBrains workers instead of copying the primitives a third time
- `/security ide-scan` command description updated to reflect the JetBrains branch; "JetBrains is a v1.1 stub" wording removed
- `CLAUDE.md` and plugin README updated: scanner bullet rewritten to document the JetBrains branch, the seven JB-specific checks, and the new knowledge files
- Plugin version: 6.5.0 → 6.6.0 across `package.json`, `.claude-plugin/plugin.json`, `scanners/ide-extension-scanner.mjs` (`VERSION`), README badge, CLAUDE.md header, marketplace root README
- `tests/scanners/git.test.mjs` — loosened `findings.length` caps (were too tight for organic repo growth; baseline already exceeded them)
### Why
- Parity with the VS Code branch: organizations running IntelliJ-family IDEs get the same pre-install and installed-plugin coverage Koi-style supply-chain attacks now target across both platforms
- Reuse of `lib/vsix-sandbox.mjs` honors the user-memory rule "don't copy a third sandbox" — one set of primitives, two workers, same kernel-enforced FS confinement
- JetBrains-specific checks target the platform's real attack surface: `Premain-Class` javaagents (class retransform at JVM startup), `application-components` (global lifecycle hooks), nested-jar shading (dependency opacity), and typosquat on `com.intellij.*` / `org.jetbrains.*` namespaces
## [6.5.0] - 2026-04-17
### Added
- **OS sandbox for `/security ide-scan <url>`.** VSIX fetch + extract now runs in a sub-process wrapped by `sandbox-exec` (macOS) or `bwrap` (Linux), reusing the same primitives proven by the `git clone` sandbox introduced in v5.1. Defense-in-depth: even if `zip-extract.mjs` has an undiscovered bypass, the kernel refuses any write outside the per-scan temp directory
- **`scanners/lib/vsix-fetch-worker.mjs`** — Sub-process worker. Argv: `--url <url> --tmpdir <writable-dir>`. Emits a single JSON line on stdout (`{ok, sha256, size, finalUrl, source, extRoot}` or `{ok:false, error, code?}`). Exit 0 on success, 1 on failure. Silent on stderr
- **`scanners/lib/vsix-sandbox.mjs`** — Wrapper. Exports `buildSandboxProfile`, `buildBwrapArgs`, `buildSandboxedWorker(tmpDir, args)`, `runVsixWorker(url, tmpDir, opts)`. 35 s timeout, 1 MB stdout cap, deterministic JSON-line protocol
- **`scan(url, { useSandbox })` option.** Default `true` for CLI invocations; tests pass `false` to keep `globalThis.fetch` mocking working (mocks do not cross process boundaries). When sandbox unavailable on the platform (e.g., Windows), a warning is added to `meta.warnings` and the scan still completes via the in-process fallback
- **`meta.source.sandbox`** — New envelope field: `'sandbox-exec' | 'bwrap' | 'none' | 'in-process'`. Tells the report which protection layer was actually active
- **8 new tests** in `tests/scanners/vsix-sandbox.test.mjs` covering profile generation per platform, worker arg construction, and live worker exit behavior on invalid URLs (no network required)
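
The worker's single-JSON-line protocol and the 1 MB stdout cap described above suggest a defensive parser on the parent side. The following sketch is an assumption about how such a parser could look; `parseWorkerOutput` and its error strings are hypothetical and are not exports of `vsix-sandbox.mjs`.

```javascript
const MAX_STDOUT_BYTES = 1024 * 1024; // 1 MB cap, as described above

// Hypothetical parser for the worker's single-line JSON result.
// Any protocol violation is folded into the documented failure shape.
function parseWorkerOutput(stdout) {
  if (Buffer.byteLength(stdout, 'utf8') > MAX_STDOUT_BYTES) {
    return { ok: false, error: 'stdout exceeded cap' };
  }
  const lines = stdout.split('\n').filter((line) => line.trim() !== '');
  if (lines.length !== 1) {
    return { ok: false, error: `expected 1 JSON line, got ${lines.length}` };
  }
  try {
    const msg = JSON.parse(lines[0]);
    if (typeof msg !== 'object' || msg === null || typeof msg.ok !== 'boolean') {
      return { ok: false, error: 'missing boolean "ok" field' };
    }
    return msg;
  } catch {
    return { ok: false, error: 'invalid JSON' };
  }
}

console.log(parseWorkerOutput('{"ok":false,"error":"bad url","code":"E_URL"}\n'));
```

Treating malformed output as `{ ok: false }` keeps the caller on one code path whether the worker failed cleanly or was killed by the sandbox mid-write.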
### Changed
- `fetchAndExtractVsixUrl` in `ide-extension-scanner.mjs` is now sandbox-aware (`useSandbox` option, default `true`). Existing in-process logic preserved as fallback path
- Version bump: 6.4.0 → 6.5.0 across all files
### Why
- Aligns the IDE-scan URL pipeline with the same defense-in-depth posture as the GitHub clone pipeline — kernel-enforced FS confinement instead of in-process validation alone
- VSIX is untrusted bytes from a third-party registry; even with hardened parsing, an OS sandbox is the right blast-radius constraint for filesystem writes
## [6.4.0] - 2026-04-17
### Added
- **`/security ide-scan <url>` — pre-install verification.** The IDE extension scanner now accepts URLs as targets and fetches the VSIX before scanning. Supported sources:
  - VS Code Marketplace: `https://marketplace.visualstudio.com/items?itemName=publisher.name`
  - OpenVSX: `https://open-vsx.org/extension/publisher/name[/version]`
  - Direct VSIX download: `https://example.com/path/foo.vsix` (HTTPS only)
- **`scanners/lib/vsix-fetch.mjs`** — HTTPS-only fetcher with 50 MB compressed cap, 30 s total timeout, SHA-256 streamed during download, manual redirect handling with per-source host whitelist (Marketplace gallerycdn, OpenVSX blob storage). No npm dependencies — uses Node 18+ `fetch`
- **`scanners/lib/zip-extract.mjs`** — Zero-dependency ZIP parser + safe extractor. Rejects: zip-slip via `..` paths, POSIX absolute paths, Windows drive letters, NUL bytes, encrypted entries, ZIP64, multi-disk archives, unsupported compression methods, symlink entries (Unix `0xA000` mode bits in `external_attr`). Caps: 10 000 entries, 500 MB uncompressed total, 100× expansion ratio (sum-uncomp / sum-comp), depth 20. STORE + DEFLATE only
- **Envelope `meta.source`** — When invoked with a URL, the scan envelope's `meta.source` field carries `{ type: "url", kind, url, finalUrl, sha256, size, publisher, name, version, requestedUrl }` so reports can attribute findings to the upstream artifact
- **`knowledge/marketplace-api-notes.md`** — Reference notes for the (undocumented but stable) Marketplace direct-download endpoint and the (officially documented) OpenVSX endpoints used by `vsix-fetch.mjs`
- **48 new tests** across `tests/scanners/zip-extract.test.mjs` (validateEntryName / isSymlink / extractToDir happy + adversarial), `tests/scanners/vsix-fetch.test.mjs` (detectUrlType / isAllowedHost / readBodyCapped), `tests/scanners/ide-extension-url.test.mjs` (URL flow integration with `global.fetch` mock — Marketplace, OpenVSX, direct VSIX, malformed VSIX, zip-slip VSIX, network failure, unsupported URL, GitHub URL). 1344 tests total (was 1296). Test helper: `tests/lib/build-zip.mjs` builds adversarial ZIPs that real `zip` tools refuse to emit
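
Several of the `zip-extract.mjs` rejection rules listed above are simple enough to illustrate. The sketch below is a hypothetical re-implementation of the entry-name and symlink checks only; the function names are assumptions and do not necessarily match the `validateEntryName` / `isSymlink` signatures exercised by the tests.

```javascript
// Hypothetical re-implementation of the name checks listed above.
// Returns a rejection reason, or null when the entry name looks safe.
function rejectEntryName(name) {
  if (name.includes('\0')) return 'NUL byte';
  if (name.startsWith('/')) return 'POSIX absolute path';
  if (/^[A-Za-z]:/.test(name)) return 'Windows drive letter';
  // Treat both separators as path boundaries when looking for ".." segments.
  if (name.split(/[\\/]/).includes('..')) return 'zip-slip (..) path';
  return null;
}

// ZIP stores Unix mode bits in the high 16 bits of external_attr.
// The file-type mask is 0xF000, and 0xA000 marks a symbolic link.
function isSymlinkEntry(externalAttr) {
  return ((externalAttr >>> 16) & 0xf000) === 0xa000;
}

console.log(rejectEntryName('../../etc/passwd'));       // zip-slip (..) path
console.log(rejectEntryName('extension/package.json')); // null
console.log(isSymlinkEntry((0o120777 << 16) >>> 0));    // true
```

Checking path segments rather than searching for the substring `..` keeps legitimate names such as `notes..md` from being rejected.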
### Changed
- `scanners/ide-extension-scanner.mjs` early-detects URL targets and routes through fetch + extract → temp dir → existing single-target scan path. Temp directory cleaned in `try/finally` regardless of success/error/abort
- CLI help text in `bin/llm-security.mjs` and `commands/ide-scan.md` updated with URL examples and security model
- Version bump: 6.3.0 → 6.4.0 across all files
### Not supported (intentional)
- GitHub repo URLs — would require `npm install` + `vsce package` build step. Use the Marketplace, OpenVSX, or a direct `.vsix` URL instead
- VSIX `.signature.p7s` verification — deferred to v6.5.0 (requires X.509 / PKCS#7 parsing)
- ZIP64 archives — real-world VSIX never approaches the 4 GB threshold
## [6.3.0] - 2026-04-17
### Added
- **IDE extension prescan** — New `/security ide-scan` command and `scanners/ide-extension-scanner.mjs` (prefix IDE) discover and audit installed VS Code extensions across 6 roots (`~/.vscode/extensions`, `~/.vscode-insiders/extensions`, `~/.cursor/extensions`, `~/.windsurf/extensions`, `~/.vscode-oss/extensions`, `~/.vscode-server/extensions`, plus Linux `code-server`). OS-aware discovery via `scanners/lib/ide-extension-discovery.mjs`. Manifest parsing via `scanners/lib/ide-extension-parser.mjs`. Data loading via `scanners/lib/ide-extension-data.mjs`. JetBrains discovery is a v1.1 stub.
- **7 IDE-specific detection categories** — Blocklist match (CRITICAL), theme-with-code (HIGH, Material Theme pattern), sideload `.vsix` (HIGH unsigned / MEDIUM signed), broad activation `*` / `onStartupFinished` (MEDIUM/LOW, suppressed for top-100 exact matches), Levenshtein typosquat ≤2 vs top-100 (HIGH distance-1 / MEDIUM distance-2 against top-50), extension-pack expansion ≥3 (MEDIUM), dangerous `vscode:uninstall` hooks referencing `child_process`/`curl`/`wget`/`rm`/`powershell` (HIGH/LOW)
- **Per-extension scanner orchestration** — Each discovered extension runs through UNI, ENT, NET, TNT, MEM, SCR scanners with bounded concurrency (default 4). MEM gets a filtered file list (README.md / CHANGELOG.md / package.json) to catch prompt-injection in marketplace-rendered text
- **New knowledge files** — `knowledge/ide-extension-threat-patterns.md` (10 categories with 2024-2026 case studies from Koi Security — GlassWorm, WhiteCobra, TigerJack, Material Theme, VS Code Cryptojacking, MaliciousCorgi), `knowledge/top-vscode-extensions.json` (top ~100 Marketplace IDs + blocklist), `knowledge/top-jetbrains-plugins.json` (stub)
- **CLI integration** — `bin/llm-security.mjs` gains `ide-scan` subcommand with passthrough flags
- 22 new tests in `tests/scanners/ide-extension-scanner.test.mjs` (fixtures under `tests/fixtures/ide-extensions/`). 1296 tests total (was 1274)
### Changed
- Version bump: 6.2.0 → 6.3.0 across all files
## [6.2.0] - 2026-04-17
### Added
- **Bash-normalize T5 + T6** — `scanners/lib/bash-normalize.mjs` now collapses `${IFS}` word-splitting (T5) and ANSI-C hex quoting `$'\xHH'` (T6) before the denylist gate runs. Defense-in-depth layer complementing the Claude Code 2.1.98+ harness fixes. 4 new unit tests in `tests/scanners/bash-normalize.test.mjs`
- **PreCompact hook** — `hooks/scripts/pre-compact-scan.mjs` scans the transcript tail (default 500 KB) for injection patterns before Claude Code compacts context. Prevents poisoned summaries from surviving into the next turn. Modes: `block` / `warn` / `off` via `LLM_SECURITY_PRECOMPACT_MODE`. 6 new tests in `tests/hooks/pre-compact-scan.test.mjs`. Brings total hooks to 9
- **Security hardening guide** — `docs/security-hardening-guide.md` documents environment variables (`CLAUDE_CODE_EFFORT_LEVEL`, `ENABLE_PROMPT_CACHING_1H`, `CLAUDE_CODE_SCRIPT_CAPS`, all `LLM_SECURITY_*` modes), sandboxing (`sandbox-exec` / `bwrap` / fallback), T1-T6 normalization table, Opus 4.7 system card §5.2.1 + §6.3.1.1 alignment, baseline production recommendations
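
The T5 and T6 normalizations mentioned above can be illustrated with two small string rewrites. This is a simplified sketch under stated assumptions, not the code in `scanners/lib/bash-normalize.mjs`: the function names are hypothetical, `$IFS` is collapsed to a single space, and only `\xHH` escapes inside `$'...'` are decoded.

```javascript
// Hypothetical sketch of T5: ${IFS} / $IFS used as a stand-in for whitespace.
function collapseIfs(cmd) {
  return cmd.replace(/\$\{IFS\}|\$IFS\b/g, ' ');
}

// Hypothetical sketch of T6: decode \xHH escapes inside $'...' quoting.
// Strings containing other escapes are left untouched by this simplified pattern.
function decodeAnsiCHex(cmd) {
  return cmd.replace(/\$'((?:\\x[0-9a-fA-F]{2}|[^'\\])*)'/g, (_, body) =>
    body.replace(/\\x([0-9a-fA-F]{2})/g, (_m, hex) =>
      String.fromCharCode(parseInt(hex, 16))));
}

// Run both passes so a downstream denylist sees the effective command.
function normalize(cmd) {
  return decodeAnsiCHex(collapseIfs(cmd));
}

console.log(normalize("$'\\x72\\x6d'${IFS}-rf${IFS}/tmp/x")); // rm -rf /tmp/x
```

Normalizing before matching means the denylist patterns can stay simple: they only need to recognize the plain form of a command, not every quoting trick that produces it.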
### Changed
- **Agent refactor for Opus 4.7 literal instruction following** — `agents/skill-scanner-agent.md` and `agents/mcp-scanner-agent.md` reframe stacked CANNOT/MUST NOT imperatives in favor of tool-level enforcement via `tools:` frontmatter. New Step 0 "Generaliseringsgrense" (generalization boundary) blocks (cite evidence path:line, mark speculation as speculation) and "Parallell Read-strategi" (parallel Read strategy) notes (prefer parallel Read calls for independent file reads)
- **Defense Philosophy linked to Opus 4.7 system card** — `CLAUDE.md` §Defense Philosophy now cites Opus 4.7 system card §5.2.1 (multi-layer defenses) and §6.3.1.1 (instruction hierarchy → tool-level enforcement)
- Version bump: 6.1.0 → 6.2.0 across all files
## [6.1.0] - 2026-04-10
### Added
- **`--fail-on <severity>` flag** — CI-friendly exit codes: exit 1 when any finding at or above the specified severity exists (critical/high/medium/low). Configurable via `policy.json` `ci.failOn`
- **`--compact` output mode** — One-liner per finding format (`[SEVERITY] scanner: title (file:line)`), reduces CI log noise. Configurable via `policy.json` `ci.compact`
- **CI/CD pipeline templates** — Ready-to-use templates in `ci/`: GitHub Actions (`github-action.yml`), Azure DevOps (`azure-pipelines.yml`), GitLab CI (`gitlab-ci.yml`) with SARIF upload, Node 18 setup
- **CI/CD integration guide** — `docs/ci-cd-guide.md` with 5-minute setup per platform, Schrems II/NSM compliance documentation, exit code reference
- **npm publish preparation** — `files` whitelist in `package.json` (only `bin/` + `scanners/`), `.npmignore` safety net, `homepage` field
- **Policy `ci` section** — New `ci: { failOn, compact }` section in `.llm-security/policy.json` for distributable CI configuration
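
The `--fail-on` exit-code rule and the `--compact` line format described above can be sketched as two pure functions. The finding object shape (`severity`, `scanner`, `title`, `file`, `line`) and the behavior for a missing threshold are assumptions for illustration; the sample finding is made up.

```javascript
const SEVERITY_ORDER = ['low', 'medium', 'high', 'critical'];

// Exit 1 when any finding is at or above the --fail-on severity.
function exitCodeFor(findings, failOn) {
  const floor = SEVERITY_ORDER.indexOf(failOn);
  if (floor === -1) return 0; // assumption: no (or unknown) threshold never fails the build
  const hit = findings.some((f) => SEVERITY_ORDER.indexOf(f.severity) >= floor);
  return hit ? 1 : 0;
}

// One line per finding: [SEVERITY] scanner: title (file:line)
function compactLine(f) {
  return `[${f.severity.toUpperCase()}] ${f.scanner}: ${f.title} (${f.file}:${f.line})`;
}

const sample = [
  { severity: 'high', scanner: 'ENT', title: 'High-entropy string', file: 'src/config.js', line: 12 },
];

console.log(compactLine(sample[0]));          // [HIGH] ENT: High-entropy string (src/config.js:12)
console.log(exitCodeFor(sample, 'medium'));   // 1
console.log(exitCodeFor(sample, 'critical')); // 0
```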
### Changed
- Version bump: 6.0.0 → 6.1.0 across all files
## [6.0.0] - 2026-04-10
### Added
- **Compliance mapping** — `knowledge/compliance-mapping.md` maps plugin capabilities to EU AI Act (Art. 9, 15, 17), NIST AI RMF (Map, Measure, Manage, Govern), ISO 42001 (Annex A), and MITRE ATLAS techniques (AML.T IDs)
- **Norwegian regulatory context** — `knowledge/norwegian-context.md` covers Datatilsynet (DPIA for AI), NSM (basic security principles), and Digitaliseringsdirektoratet guidance
- **SARIF 2.1.0 output** — `scanners/lib/sarif-formatter.mjs` converts scan output to OASIS SARIF standard format. Use `--format sarif` with scan/deep-scan commands
- **Structured audit trail** — `scanners/lib/audit-trail.mjs` writes JSONL audit events with ISO 8601 timestamps, OWASP category tags, and SIEM-ready schema. Configurable via `LLM_SECURITY_AUDIT_*` env vars
- **AI-BOM generator** — `scanners/ai-bom-generator.mjs` + `scanners/lib/bom-builder.mjs` produce CycloneDX 1.6 Bills of Materials for AI components (models, MCP servers, plugins, knowledge, hooks)
- **Policy-as-code** — `scanners/lib/policy-loader.mjs` reads `.llm-security/policy.json` for distributable hook configuration. Integrated into all 8 hooks. Env vars always take precedence
- **Standalone CLI** — `bin/llm-security.mjs` provides `npx llm-security` entry point. Subcommands: `scan`, `deep-scan`, `posture`, `audit-bom`, `benchmark`
- **Posture compliance categories** — 3 new posture categories (14: EU AI Act, 15: NIST AI RMF, 16: ISO 42001). Advisory only — do not affect Grade A threshold
- **Attack simulator benchmark mode** — `--benchmark` flag outputs structured pass/fail metrics for CI integration
### Changed
- Version bump: 5.1.0 → 6.0.0 across all files
- Knowledge base expanded from 13 to 15 files
- Scanner count: 15 → 16 (AI-BOM generator added)
- Posture scanner: 13 → 16 categories
- All hooks now read policy from `.llm-security/policy.json` (backward-compatible — defaults match hardcoded values)
## [5.1.0] - 2026-04-07
### Added
- **Sandboxed remote cloning** — `git clone` for remote scans is now hardened with two defense layers:
  1. Git config flags: `core.hooksPath=/dev/null`, `core.symlinks=false`, `core.fsmonitor=false`, all LFS filter drivers disabled, `protocol.file.allow=never`, `transfer.fsckObjects=true`. Environment: `GIT_CONFIG_NOSYSTEM=1`, `GIT_CONFIG_GLOBAL=/dev/null`, `GIT_ATTR_NOSYSTEM=1`, `GIT_TERMINAL_PROMPT=0`
  2. OS-level filesystem sandbox: macOS `sandbox-exec` and Linux `bubblewrap` (bwrap) restrict file writes to only the specific temp directory. Even if `.gitattributes` filter drivers bypass git config, they cannot write outside the clone dir. bwrap probe-tests availability before use (graceful fallback on Ubuntu 24.04+ where AppArmor blocks it). Graceful fallback on Windows (git config flags only, WARN logged)
- **Post-clone size check** — Repos exceeding 100MB after clone are rejected and cleaned up
- **UUID-unique evidence filenames** — `fs-utils.mjs tmppath` now generates unique filenames with `crypto.randomUUID()` suffix, preventing race conditions between concurrent scans
- **Evidence file cleanup** — `scan.md` and `plugin-audit.md` now clean up evidence files (content-extract, plugin-extract) after scanning
- **Cleanup guarantee** — Both `scan.md` and `plugin-audit.md` have explicit cleanup guarantee: temp dir + evidence file are removed even if scan fails or errors
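
The first hardening layer above (git config flags plus environment) maps naturally onto an argument-vector builder. The sketch below is hypothetical and not the code in `scanners/lib/git-clone.mjs`: `buildCloneCommand` is an invented name, the LFS filter overrides are omitted because their exact keys are not listed here, and the `--` separator before the URL is an added precaution of this example.

```javascript
// Hypothetical builder for the hardened clone invocation described above.
function buildCloneCommand(repoUrl, destDir) {
  const config = [
    'core.hooksPath=/dev/null',
    'core.symlinks=false',
    'core.fsmonitor=false',
    'protocol.file.allow=never',
    'transfer.fsckObjects=true',
  ];
  // Each setting becomes a pair of arguments: -c key=value
  const args = config.flatMap((kv) => ['-c', kv]);
  // "--" stops option parsing so a URL starting with "-" is not read as a flag.
  args.push('clone', '--', repoUrl, destDir);

  const env = {
    GIT_CONFIG_NOSYSTEM: '1',
    GIT_CONFIG_GLOBAL: '/dev/null',
    GIT_ATTR_NOSYSTEM: '1',
    GIT_TERMINAL_PROMPT: '0',
  };
  return { file: 'git', args, env };
}

const cmd = buildCloneCommand('https://example.com/repo.git', '/tmp/scan-clone');
console.log(cmd.file, cmd.args.join(' '));
```

When spawning the process, the returned `env` would typically be merged over the existing environment (for `PATH`) and the arguments passed as an array rather than through a shell, so the repository URL is never shell-interpreted.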
### Changed
- `scanners/lib/git-clone.mjs` — complete rewrite of clone command with sandbox wrapping
- `scanners/lib/fs-utils.mjs` — tmppath uses `crypto.randomUUID()` for unique names
## [5.0.0] - 2026-04-06
### Added
- **Prompt Injection Hardening (v5.0)** — 8-session defense-in-depth overhaul driven by 7 research papers (2025-2026). Defense philosophy: broader detection + increased attack cost + longer monitoring windows + architectural constraints + honest documentation
- **MEDIUM advisory wiring** — `pre-prompt-inject-scan.mjs` emits advisory for MEDIUM-severity obfuscation signals (leetspeak, homoglyphs, zero-width, multi-language). Never blocks. `post-mcp-verify.mjs` includes MEDIUM in injection scan advisory
- **Unicode Tag steganography** — `string-utils.mjs` decodes U+E0001-E007F (invisible ASCII encoding). CRITICAL if decoded content matches injection patterns, HIGH for bare presence. Integrated into `normalizeForScan()` pipeline
- **BIDI override stripping** — Removes directional override characters before injection scanning
- **Bash expansion normalization** — New `bash-normalize.mjs` strips `${}`, empty quotes, backslash splits before command matching. Applied in `pre-bash-destructive.mjs` and `pre-install-supply-chain.mjs`
- **Rule of Two enforcement** — `post-session-guard.mjs` gains `LLM_SECURITY_TRIFECTA_MODE=block|warn|off` (default: warn). Block mode exits with code 2 for MCP-concentrated trifecta or sensitive path + exfiltration
- **100-call long-horizon monitoring** — Extended window alongside 20-call sliding window. Slow-burn trifecta detection (legs >50 calls apart = MEDIUM). Behavioral drift via Jensen-Shannon divergence on tool-class distribution
- **HITL trap detection** — HIGH patterns for approval urgency, summary suppression, scope minimization. MEDIUM for cognitive load (injection buried in verbose output)
- **Sub-agent delegation tracking** — `post-session-guard.mjs` tracks Task/Agent tool usage. Escalation-after-input advisory when delegation occurs within 5 calls of untrusted input (DeepMind Agent Traps category 4)
- **Natural language indirection** — MEDIUM patterns for "fetch this URL and execute", "send this data to", "read ~/.ssh". Strict false-positive tests for benign phrasing
- **Hybrid attack patterns** — P2SQL (SQL keywords in injection text), recursive injection (injection containing injection), XSS in agent context (`<script>`, `javascript:`, `onerror=`)
- **CaMeL-inspired data flow tagging** — SHA-256 provenance tracking in `post-session-guard.mjs`. Hash of tool output → match against subsequent tool input. Linked data flows elevate trifecta severity
- **Adaptive red-team** — `attack-simulator.mjs --adaptive` runs 5 mutation rounds per passing scenario: homoglyph substitution, encoding wrapping, zero-width injection, case alternation, synonym substitution. Rules in `knowledge/attack-mutations.json`
- **Knowledge base expansion** — `prompt-injection-research-2025-2026.md` (7 papers), `deepmind-agent-traps.md` (6 categories, 43 techniques), `attack-mutations.json` (synonym tables). Attack scenarios expanded from 38 to 64 across 12 categories
- **Posture scanner expanded to 13 categories** — New: Prompt Injection Hardening (cat 11), Rule of Two (cat 12), Long-Horizon Monitoring (cat 13). Checks for MEDIUM advisory, Unicode Tag detection, bash normalization, TRIFECTA_MODE, behavioral drift
- **Defense Philosophy section** in CLAUDE.md — honest documentation of what v5.0 can and cannot do, based on joint paper findings (95-100% ASR against all tested defenses)
- 8 new posture scanner tests (49 total for posture)
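
Of the v5.0 techniques above, Unicode Tag steganography is easy to demonstrate because tag code points U+E0020 to U+E007E mirror printable ASCII at a fixed offset of 0xE0000. The sketch below is an illustration only; `decodeUnicodeTags` and its return shape are assumptions and are not the `string-utils.mjs` API.

```javascript
// Unicode Tag block: U+E0020..U+E007E mirror printable ASCII 0x20..0x7E.
// Other tag code points (e.g. U+E0001, U+E007F) carry no ASCII payload.
function decodeUnicodeTags(text) {
  let hidden = '';
  let tagCount = 0;
  for (const ch of text) { // for...of iterates by code point, not UTF-16 unit
    const cp = ch.codePointAt(0);
    if (cp >= 0xe0001 && cp <= 0xe007f) {
      tagCount++;
      if (cp >= 0xe0020 && cp <= 0xe007e) hidden += String.fromCodePoint(cp - 0xe0000);
    }
  }
  return { tagCount, hidden };
}

// Build a sample by shifting ASCII into the tag range.
const payload = [...'ignore']
  .map((c) => String.fromCodePoint(c.codePointAt(0) + 0xe0000))
  .join('');

console.log(decodeUnicodeTags('hello' + payload)); // { tagCount: 6, hidden: 'ignore' }
```

A scanner can then treat a nonzero `tagCount` as the "bare presence" signal and run its injection patterns over `hidden` to decide whether to escalate.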
### Changed
- Posture scanner version updated to 5.0.0
- Dashboard aggregator version updated to 5.0.0
- Red-team scenarios expanded from 38 to 64 across 12 categories
- Knowledge files count: 10 → 13
## [4.5.1] - 2026-04-04
### Fixed
- **Cross-platform support (Windows/Linux).** Replaced all Unix-only patterns: `fileURLToPath()` instead of `import.meta.url.replace('file://', '')`, `path.dirname()` instead of `lastIndexOf('/')`, native `fetch()` instead of `curl` subprocess (Node 18+), removed `2>/dev/null` from shell commands, fixed tilde expansion regex for Windows backslash paths. 11 files changed, 782 tests pass.
## [4.5.0] - 2026-04-04
### Added
- **Attack simulation / red-team mode** — `scanners/attack-simulator.mjs` runs 38 crafted attack scenarios across 7 categories against the plugin's own hooks. Data-driven: scenarios defined in `knowledge/attack-scenarios.json`, payloads assembled at runtime via fragment concatenation (avoids triggering hooks on source file). Categories: secrets (7), destructive (8), supply-chain (4), prompt-injection (6), pathguard (6), mcp-output (4), session-trifecta (3). CLI: `node scanners/attack-simulator.mjs [--category <name>] [--json] [--verbose]`. Library: `import { loadScenarios, runScenario, resolvePayloads }`
- **`/security red-team` command** — attack simulation with category filter (`--category secrets|destructive|...`). Narrative report with per-category breakdown and defense score
- **`knowledge/attack-scenarios.json`** — 38 red-team scenarios with placeholder payloads (`{{MARKER}}` syntax), resolved at runtime to actual attack strings
- 31 new tests for attack simulator (unit + integration + CLI)
## [4.4.0] - 2026-04-03
### Added
- **Cross-project security dashboard** — `scanners/dashboard-aggregator.mjs` discovers all Claude Code projects under ~/ (depth 3) and ~/.claude/plugins/, runs posture-scanner on each, aggregates results. Machine grade = weakest link across all projects. Cache in `~/.cache/llm-security/dashboard-latest.json` (24h staleness). CLI: `node scanners/dashboard-aggregator.mjs [--no-cache] [--max-depth N]`. Library: `import { aggregate, discoverProjects }`
- **`/security dashboard` command** — machine-wide security overview with per-project grade table, sorted by grade (worst first). Shows cache status, total findings, and recommendations based on machine grade
- 16 new tests for dashboard aggregator (discovery, aggregation, caching, grade logic)
## [4.3.0] - 2026-04-03
### Added
- **MCP description drift detection** — `scanners/lib/mcp-description-cache.mjs` caches MCP tool descriptions in `~/.cache/llm-security/mcp-descriptions.json` with 7-day TTL. Compares via Levenshtein distance — >10% change triggers advisory (OWASP MCP05 rug-pull). `extractMcpServer()` exported for server attribution
- **MCP-concentrated trifecta** — `post-session-guard.mjs` now detects when all 3 lethal trifecta legs (input + access + exfil) originate from the same MCP server, elevating severity. Single compromised server pattern
- **Cumulative data volume tracking** — `post-session-guard.mjs` tracks total output bytes per session, warns at 100KB (LOW), 500KB (MEDIUM), 1MB (HIGH) thresholds (OWASP ASI02)
- **Per-MCP-tool volume tracking** — `post-mcp-verify.mjs` tracks cumulative output per MCP tool, warns when a single tool exceeds 100KB (OWASP ASI02, MCP03)
- **MCP drift integration in post-mcp-verify** — checks MCP tool descriptions on every invocation against cached baseline, advisory on significant drift
- 35 new tests: 16 for mcp-description-cache, 5 for post-mcp-verify drift/volume, 14 for post-session-guard MCP features
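
The cumulative data volume thresholds above (100KB LOW, 500KB MEDIUM, 1MB HIGH) amount to a simple tiered lookup. The sketch below assumes 1 KB = 1024 bytes and uses a hypothetical `volumeSeverity` helper; the hook's real byte convention and function names may differ.

```javascript
// Thresholds as listed above, checked from highest to lowest.
// Assumes KB = 1024 bytes; the plugin may use a different convention.
const VOLUME_THRESHOLDS = [
  { bytes: 1024 * 1024, severity: 'HIGH' },
  { bytes: 500 * 1024, severity: 'MEDIUM' },
  { bytes: 100 * 1024, severity: 'LOW' },
];

// Returns the highest tier reached by the session's cumulative output, or null.
function volumeSeverity(totalBytes) {
  const hit = VOLUME_THRESHOLDS.find((t) => totalBytes >= t.bytes);
  return hit ? hit.severity : null;
}

console.log(volumeSeverity(50 * 1024));       // null
console.log(volumeSeverity(600 * 1024));      // MEDIUM
console.log(volumeSeverity(2 * 1024 * 1024)); // HIGH
```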
## [4.2.0] - 2026-04-03
### Added
- **Supply chain re-check scanner** — `scanners/supply-chain-recheck.mjs` (prefix SCR) periodically re-audits installed dependencies by parsing lockfiles (package-lock.json, yarn.lock, requirements.txt, Pipfile.lock). Checks against curated blocklists, OSV.dev batch API (`/v1/querybatch`) for known CVEs, and Levenshtein-based typosquat detection against top-packages knowledge base. Offline fallback: blocklist + typosquat checks run without network, INFO finding notes skipped CVE check. OWASP: LLM03, ASI04, AST06, MCP04
- **Shared supply chain data module** — `scanners/lib/supply-chain-data.mjs` extracts blocklists (NPM/PIP/Cargo/Gem), helper functions, and OSV.dev API calls shared between the hook (`pre-install-supply-chain.mjs`) and the new scanner
- **`/security supply-check` command** — standalone dependency re-audit with focused output. CLI wrapper: `node scanners/supply-chain-recheck-cli.mjs <path>`
- SCR prefix added to all 4 OWASP maps (LLM, ASI, AST, MCP) in severity.mjs
- Supply chain scanner integrated into scan-orchestrator (10th scanner, runs before toxic-flow)
- Test fixtures: `tests/fixtures/supply-chain/` with compromised and clean lockfiles for npm, pip, yarn, Pipfile
- 30 new tests for supply-chain-recheck scanner and shared module
### Changed
- `pre-install-supply-chain.mjs` hook refactored to import blocklists and helpers from shared `supply-chain-data.mjs` module (reduced duplication by ~160 lines)
## [4.1.0] - 2026-04-03
### Added
- **Reference configuration generator** — `scanners/reference-config-generator.mjs` generates Grade A security configuration based on posture scanner gaps. Detects project type (plugin/monorepo/standalone). Templates in `templates/reference-config/`. CLI: `node scanners/reference-config-generator.mjs [path] [--apply]`. Library: `import { generate } from './reference-config-generator.mjs'`
- **`/security harden` command** — runs posture scanner, identifies gaps, generates settings.json (deny-first), CLAUDE.md security section, and .gitignore additions. Supports `--dry-run` (default) and `--apply` (writes with backup). Post-apply verification re-runs posture scanner to confirm improvement
- Reference config templates: `settings-deny-first.json`, `claude-md-security-section.md`, `gitignore-security.txt`
- 23 new tests for reference-config-generator (grade-a, grade-f, apply mode, project type detection)
## [4.0.0] - 2026-04-03
### Added
- **Deterministic posture scanner** — `posture-scanner.mjs` replaces the Opus-based posture-assessor-agent for `/security posture`. 10 categories assessed in <50ms (was ~6 min with agent). Scanner prefix PST. Standalone CLI: `node scanners/posture-scanner.mjs [path]` → JSON stdout. Categories: Deny-First, Secrets, Path Guarding, MCP Trust, Destructive Blocking, Sandbox, Human Review, Plugin Sources, Session Isolation, Cognitive State Security. Reuses `scanForInjection()` and `gradeFromPassRate()` from shared libraries. Grade A/B/C/D/F with risk score, risk band, and verdict
- PST prefix added to all 4 OWASP maps (LLM, ASI, AST, MCP) in severity.mjs
- Test fixtures: `tests/fixtures/posture-scan/grade-a-project/` (Grade A) and `grade-f-project/` (Grade F)
|
||
- 41 new tests for posture scanner (interface, grade-a, grade-f)
|
||
|
||
### Changed
|
||
- `/security posture` now uses deterministic scanner via Bash instead of spawning posture-assessor-agent. Instant results, zero token cost
|
||
- `/security audit` runs posture scanner first for instant category data, then agents for narrative and skill/MCP analysis
|
||
- Posture-assessor-agent retained for full audit narrative only
|
||
|
## [3.1.1] - 2026-04-03

Audit remediation: 6 findings fixed, global settings hardened.

## [3.0.0] - 2026-04-03

Public release. 8 development sessions from v2.5 to v3.0.

### Added
- **Toxic flow analysis** (v2.7.0) — 8th orchestrated scanner (`toxic-flow-analyzer.mjs`, prefix TFA) detecting lethal trifecta patterns: untrusted input + sensitive data access + exfiltration sink. Post-processing correlator consuming output from all prior scanners. Direct, cross-component, and project-level detection with mitigation downgrades. OWASP: ASI01, ASI02, ASI05
- **Runtime session guard** (v2.7.1) — PostToolUse hook monitoring tool call sequences for lethal trifecta forming during a session. Sliding window (20 calls), per-session JSONL state in `/tmp/`, advisory warning (never blocks). Auto-cleanup after 24h
- **MCP runtime inspection** (v2.8.0) — Standalone scanner (`mcp-live-inspect.mjs`, prefix MCI) connecting to running MCP stdio servers via JSON-RPC 2.0. Fetches live tool/prompt/resource lists, scans descriptions for injection patterns, detects tool shadowing across servers. 10s timeout per server. New `/security mcp-inspect` command. `/security mcp-audit --live` flag for combined static + live analysis
- **Auto update notifications** (v2.8.1) — UserPromptSubmit hook checking for newer plugin versions against the public Forgejo repo (at most once per 24h, cached in `~/.cache/llm-security/`). Disable: `LLM_SECURITY_UPDATE_CHECK=off`
- **Report diffing & baseline** (v2.9.0) — `diff-engine.mjs` library for finding fingerprinting, fuzzy line matching (±3 lines), and diff categorization (new/resolved/unchanged/moved). Scan orchestrator gains `--baseline` and `--save-baseline` flags. Baselines stored per target hash in `reports/baselines/`. New `/security diff` command
- **Continuous scanning** (v2.9.1) — `/security watch [path] [--interval 6h]` using built-in /loop for recurring diff scanning. `watch-cron.mjs` standalone script for system cron/launchd with multi-target config and exit codes
- **Skill signature registry** (v2.9.2) — `skill-registry.mjs` library for SHA-256 fingerprinting of normalized skill content, scan result caching (7-day staleness), and pattern search. New `/security registry` command. `/security scan` checks registry before full scan for instant results on known fingerprints
- **OWASP Skills Top 10** (v2.6.0) — New knowledge file `owasp-skills-top10.md` (AST01-AST10) with skill-specific threat definitions and mitigations
- **MEDIUM injection patterns** (v2.6.0) — ~15 new patterns: base64 payloads, leetspeak, homoglyphs, multi-language mixing, markdown/HTML comment injection
- **4-framework OWASP mapping** (v2.6.0) — Full coverage of LLM Top 10, Agentic AI Top 10 (ASI), Skills Top 10 (AST), MCP Top 10 in severity.mjs
- Architecture diagram (mermaid) in README
- CHANGELOG.md

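The toxic flow and session guard entries above both rest on the "lethal trifecta": untrusted input, sensitive data access, and an exfiltration sink appearing together. The sliding-window idea can be sketched roughly as follows. This is a minimal illustration, not the plugin's implementation: `createWindow`, the `LEGS` table, and the tool-to-leg mapping are all hypothetical; only the window size of 20 and the three legs come from the entries above.

```javascript
// Hypothetical sketch: not the plugin's actual session guard.
const WINDOW_SIZE = 20; // from the changelog entry: sliding window of 20 calls

// Illustrative mapping from tool names to trifecta legs (an assumption).
const LEGS = {
  WebFetch: 'untrusted_input',
  Read: 'sensitive_access',
  Bash: 'exfil_sink',
};

function createWindow(size = WINDOW_SIZE) {
  const calls = [];
  return {
    // Record one tool call and report whether all three legs are present
    // among the calls currently inside the window.
    record(toolName) {
      calls.push(toolName);
      if (calls.length > size) calls.shift(); // drop the oldest call
      const seen = new Set(calls.map((t) => LEGS[t]).filter(Boolean));
      return { trifecta: seen.size === 3, legs: [...seen].sort() };
    },
  };
}
```

A guard built this way would only warn when `trifecta` is true, matching the advisory (never blocking) behavior described in the entry.
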
### Changed
- Scan orchestrator now runs 8 scanners (was 7) with TFA running last
- Agent prompts updated with ASI/AST/MCP OWASP references
- `scanForInjection()` returns `{ found, severity, patterns }` instead of boolean
- Self-scan suppressions updated from ~150 to ~190 (TFA self-referential findings added)
- Plugin description updated to reference all 4 OWASP frameworks

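The `scanForInjection()` return-type change matters to callers because an object is always truthy, so a check written against the old boolean would pass for every input. The sketch below shows the adjusted call pattern. The function body is a stand-in with one toy pattern, and the value types of `severity` and `patterns` are assumptions; only the three field names come from the entry above.

```javascript
// Stand-in for the real scanForInjection(); the single pattern and the
// value types of the fields are assumptions made for illustration.
function scanForInjection(text) {
  const rules = [
    {
      name: 'ignore-previous-instructions',
      severity: 'critical',
      re: /ignore (all )?previous instructions/i,
    },
  ];
  const hits = rules.filter((r) => r.re.test(text));
  return {
    found: hits.length > 0,
    severity: hits.length > 0 ? hits[0].severity : null,
    patterns: hits.map((r) => r.name),
  };
}

// With a boolean return, `if (scanForInjection(text))` was enough.
// With the object return, callers have to read `.found` explicitly.
function shouldFlag(text) {
  const result = scanForInjection(text);
  return result.found;
}
```
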
### Fixed
- package.json version sync with plugin.json

## [2.5.0] - 2026-04-02

### Added
- Pre-extraction indirection layer for remote scan defense. Remote scans pre-extract structured evidence via `content-extractor.mjs` and strip injection patterns BEFORE LLM agents see the content

## [2.4.0] - 2026-04-01

### Added
- GitHub repo URL support for `scan` and `plugin-audit`. Clone to temp dir via `git-clone.mjs`, scan locally, clean up. `--branch <name>` flag for non-default branches

## [2.3.0] - 2026-04-01

### Added
- PostToolUse expanded to ALL tools (was Bash-only). Scans Read, WebFetch, MCP, and all other tool output for indirect prompt injection
- `LLM_SECURITY_INJECTION_MODE` env var: `block` (default), `warn`, `off`
- Complementary Tools section in README (parry-guard, Lasso, Snyk)
- CLAUDE.md poisoning documented as known limitation

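A hook could resolve `LLM_SECURITY_INJECTION_MODE` along the lines of the sketch below. The helper name `resolveInjectionMode` is hypothetical, and the trimming, case-folding, and fallback to `block` for unrecognized values are assumptions; the changelog only states the variable name, its three values, and the default.

```javascript
// Hypothetical helper: resolves the injection mode from an environment
// object. Only the variable name, its three values, and the default come
// from the changelog; the handling of unknown values is an assumption.
const VALID_MODES = ['block', 'warn', 'off'];

function resolveInjectionMode(env) {
  const raw = (env.LLM_SECURITY_INJECTION_MODE || '').trim().toLowerCase();
  return VALID_MODES.includes(raw) ? raw : 'block'; // documented default
}
```

In a Node.js hook this would be called as `resolveInjectionMode(process.env)`. Falling back to `block` keeps a typo in the variable from silently disabling the check.
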
### Changed
- Short output skip (<100 chars) for PostToolUse performance

## [2.2.0] - 2026-04-01

### Added
- UserPromptSubmit hook blocking prompt injection in user input
- Obfuscation decoding: unicode-escape, hex-escape, URL-encoding, base64 normalization
- Shared `injection-patterns.mjs` module (21 critical + 8 high patterns)
- PostToolUse indirect injection scanning in tool output (LLM01)

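The obfuscation decoding entry above describes normalizing encoded text before injection patterns are matched. A rough sketch of that idea follows, covering unicode-escape, hex-escape, and URL-encoding only; base64 normalization is left out. The function name `decodeObfuscation` and the order of the steps are assumptions, not the plugin's actual decoder.

```javascript
// Illustrative normalizer, not the plugin's decoder. Each step undoes one
// of the encodings named in the changelog before patterns are matched.
function decodeObfuscation(text) {
  let out = text;
  // \u0041 style unicode escapes
  out = out.replace(/\\u([0-9a-fA-F]{4})/g, (_, hex) =>
    String.fromCharCode(parseInt(hex, 16)));
  // \x41 style hex escapes
  out = out.replace(/\\x([0-9a-fA-F]{2})/g, (_, hex) =>
    String.fromCharCode(parseInt(hex, 16)));
  // %41 style URL encoding; malformed sequences leave the text untouched
  try {
    out = decodeURIComponent(out);
  } catch {
    // not valid percent-encoding: keep `out` as it is
  }
  return out;
}
```

Decoding first means a single pattern such as "ignore previous instructions" can match all three encoded spellings of the same phrase.
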
### Changed
- LLM01 coverage 83% -> 95%, LLM05 80% -> 83%

## [2.1.0] - 2026-04-01

### Added
- 383 tests (was 177): full hook coverage (66 tests), auto-cleaner coverage (140 tests)
- HTTPS install URL under fromaitochitta org

### Fixed
- Auto-cleaner import guard
- Solo project setup (CONTRIBUTING.md removed)

### Changed
- Model defaults set to sonnet

## [2.0.0] - 2026-03-31

### Added
- Open-source release: MIT LICENSE, SECURITY.md
- Test suite (`node:test`, 177 tests)
- `pre-write-pathguard.mjs` hook (8 path categories)
- `.gitignore`, `.editorconfig`

## [1.4.0] - 2026-02-21

### Added
- Unified risk scoring formula (25/10/4/1 weights)
- Score-based verdicts and risk bands (Low -> Extreme)
- OWASP categorization and A-F grading
- Single `unified-report.md` template replacing 9 separate templates

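The 25/10/4/1 weighting can be read as a weighted sum over finding counts. The sketch below assumes the four weights map to critical, high, medium, and low severities in that order, which the changelog does not state. Any capping, normalization, or band thresholds are also not given here, so they are not modelled. `riskScore` and `WEIGHTS` are hypothetical names.

```javascript
// Sketch of a weighted risk score. The 25/10/4/1 weights come from the
// changelog; mapping them to critical/high/medium/low is an assumption.
const WEIGHTS = { critical: 25, high: 10, medium: 4, low: 1 };

function riskScore(counts) {
  // Sum weight * count per severity; missing severities count as zero.
  return Object.entries(WEIGHTS).reduce(
    (sum, [severity, weight]) => sum + weight * (counts[severity] || 0),
    0,
  );
}
```

For example, `riskScore({ critical: 1, high: 2, medium: 3, low: 4 })` evaluates to 25 + 20 + 12 + 4 = 61.
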
## [1.3.0] - 2026-02-21

### Added
- `/security clean` command with 3-tier remediation (auto/semi-auto/manual)
- `auto-cleaner.mjs` engine (16 fix operations, atomic writes, post-fix validation)
- `cleaner-agent` for semi-auto proposals
- `--dry-run` flag

## [1.2.0] - 2026-02-19

### Added
- 7 deterministic Node.js scanners (unicode, entropy, permissions, dependencies, taint, git forensics, network)
- `/security deep-scan` command and `--deep` flag
- Synthesizer agent for scanner JSON interpretation
- Shared scanner library (`scanners/lib/`)
- Demo fixture with 85-finding security assessment

### Changed
- OWASP coverage: LLM01 70% -> 85%, LLM02 90% -> 95%, LLM03 80% -> 90%, LLM06 85% -> 95%

## [1.1.0] - 2026-02-19

### Added
- `/security plugin-audit` command
- `/security mcp-audit` command
- `/security pre-deploy` command
- 3 new report templates

### Changed
- OWASP coverage: LLM03 75% -> 80%

## [1.0.0] - 2026-02-19

### Added
- Initial release
- 4 agents: skill-scanner, mcp-scanner, posture-assessor, threat-modeler
- 4 hooks: secret detection, destructive commands, supply chain, output verification
- 6 knowledge files (2,771 lines)
- 8 commands: security, scan, audit, posture, threat-model, plugin-audit, mcp-audit, pre-deploy
- 7 report templates
- OWASP LLM Top 10 + Agentic AI Top 10 coverage