Batch C release. Closes 12 implementation tasks (E3, E8-E14, 8.4, 8.6,
8.7, 8.10) across four execution waves: A (bash + decoder), B (supply
chain + workflow scanner), C (MCP cumulative drift), D (code quality).
Wave E (9 new attack-simulator scenarios for the new defenses) deferred
to v7.3.1 — defenses are unit-tested per wave; the deferred work adds
attack-simulator regression coverage on top, not the primary safety net.
Tests: 1665+ → 1777 (Wave A-D cumulative, +112).
Version sync targets touched:
- package.json
- .claude-plugin/plugin.json
- CLAUDE.md (header)
- README.md (badge + new release-history row)
- scanners/ide-extension-scanner.mjs (VERSION constant)
- ../../README.md (marketplace root plugin entry)
- CHANGELOG.md (new [7.3.0] section per Keep a Changelog, all 12 task
IDs covered individually under Added/Changed/Documentation/Tests/Notes)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Documents the v7.1.1 narrative-coherence patch in CLAUDE.md (mini-block
appended after the v7.0.0 paragraph) and CHANGELOG.md (new [7.1.1]
section per Keep a Changelog convention, placed above [7.1.0]).
Plan: .claude/plans/ultraplan-2026-04-29-report-coherence.md
Brief: .claude/ultraplan-spec-2026-04-29-report-coherence.md
Verification gates passed:
- npm test: 1522/1522 (was 1511; +11 from new narrative test)
- node --test tests/lib/severity.test.mjs: 86/86 (co-monotonicity sweep
at lines 252-303 unchanged and green)
- node --test tests/scanners/skill-scanner-narrative.test.mjs: 11/11
- Orchestrator against fixture: WARNING / 48 / 1 HIGH (HITL trap caught
correctly, no whiplash)
- SARIF inline check via toSARIF import: sarif-version 2.1.0, runs: 1
- Zero remaining v1 cutoffs in agent + template
Out of scope but flagged for Batch B (deferred to v7.2.0):
- commands/scan.md:113-114 retains v1 risk formula
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Closes A2 of v7.1.0 critical-review patch (docs/critical-review-2026-04-20.md):
- B4 (severity JSDoc): 4 critical = 93, not 90. Fixed in scanners/lib/severity.mjs:23
and CHANGELOG.md v7.0.0 tier description. The actual computation has always been
93 (70 + log2(5)*10 = 93.22 → round); only the docs were wrong.
- §5.4 co-monotonicity: new sweep test in tests/lib/severity.test.mjs over 15
representative count vectors. Asserts that (verdict, riskBand) agree under the
v7.0.0 contract for every case — catches future drift between riskScore tiers,
verdict cutoffs, and riskBand cutoffs. Includes a B4 anchor test (riskScore
{critical: 4} === 93) so doc/code drift fails loudly.
- B8 (CaMeL claims toned down): post-session-guard.mjs:646 comment block and
CLAUDE.md:184 Defense Philosophy bullet now describe the implementation
honestly — opportunistic byte-matching of truncated output fingerprints
(first 200 bytes, SHA-256/16-hex), not semantic data-flow tracking.
Trivially bypassed by mutation, summarisation, or re-encoding. Inspired by
CaMeL (DeepMind 2025), but not a CaMeL capability-tracking implementation.
Tests: 1495 → 1511 (+16: 15 sweep cases + 1 B4 anchor). All green.
E2E verification against content-heavy repo (`content-claude-code`) revealed
413 entropy findings (8 HIGH / 405 MEDIUM) from markdown image CDN URLs in
JSON content indexes — e.g., ``.
These are legitimate content-repo artifacts, not credentials. The 40-char
hash segment in the CDN URL trips Shannon entropy (H=5.29 over 300 chars),
and rule 13 (inline <svg>) doesn't match since there's no literal `<svg>`
tag — the `.svg` is just a URL path suffix.
Added rule 18 `MARKDOWN_IMAGE = /!\[[^\]]*\]\(\s*https?:\/\//` — matches
`` / ``. Line-level (not string-level) so URL
is not over-specific.
E2E impact on `content-claude-code`:
- Before: BLOCK / 65 / 8H 437M 0L
- After: WARNING / 56 / 3H 427M 0L
Hyperframes unchanged: BLOCK / 80 / 1C 4H 92M — real CRITICAL SQL-injection
and HIGH findings still detected.
Tests: 2 new (positive + negative fixture) bringing entropy-context to 26,
total suite 1485 → 1487.
Docs updated to "rules 11-18" and "8 new line-suppression rules".
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Final commit in the trustworthy-scoring series. Bundles verdict cutoff
alignment, the last suite of tests, and all documentation touch-points
that quote version numbers or describe v7.0.0 behaviour.
Verdict/band co-monotonicity
- `scanners/lib/severity.mjs` — verdict cutoffs moved from 61/21 to 65/15
so `BLOCK >= 65`, `WARNING >= 15` locks onto the v2 riskBand() boundaries.
Prevents "BLOCK / Medium band" contradictions under the v2 formula.
Scanner hardening (bug fixes from v7.0.0 testing)
- `scanners/entropy-scanner.mjs` — `policy_source` now uses
`existsSync('.llm-security/policy.json')` instead of value-based check.
Old heuristic always reported 'policy.json' because DEFAULT_POLICY now
carries an `entropy.thresholds` section.
- `scanners/lib/file-discovery.mjs` — `.sass` and GPU shader extensions
(`.glsl, .frag, .vert, .shader, .wgsl`) added to TEXT_EXTENSIONS. Without
this, shader files were invisible to file-discovery, so they were never
counted as skipped by the entropy-scanner extension filter.
Tests
- `tests/scanners/entropy-context.test.mjs` (new, 24 tests) — A. File-ext
skip (4), B. Line-level rules 11-17 (8), C. Policy overrides (3).
Fixtures generate 80-char base64 payloads at runtime via
`crypto.randomBytes` to dodge the plugin's own pre-edit credential hook
on the test source.
- `tests/lib/severity.test.mjs` — rewritten with v2 scoring table (70
tests total, was 52).
- `tests/lib/output.test.mjs:243` — "1 critical = score 80" under v2
(was 25 under v1).
- Full suite: 1485/1485 green (was 1461).
Docs
- `CHANGELOG.md` — v7.0.0 entry with BREAKING CHANGES section.
- `README.md` (plugin + marketplace root) — version badge, history table,
plugin-card version string, test count.
- `CLAUDE.md` — header version, "v7.0.0 — Trustworthy scoring" summary
paragraph at the top.
- `docs/security-hardening-guide.md` — new section 6 "Calibration & false
positives" documenting v2 formula, context-aware entropy scanner,
typosquat allowlist, and §6.4 tuning workflow. Existing "Recommended
baseline" section renumbered to §7.
Version bump
- `6.6.0 -> 7.0.0` across package.json, .claude-plugin/plugin.json,
scanners/ide-extension-scanner.mjs VERSION const, README badge,
CLAUDE.md header, marketplace root README card.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
VSIX fetch + extract for URL targets now runs in a sub-process wrapped by
sandbox-exec (macOS) or bwrap (Linux), reusing the same primitives proven
by the v5.1 git-clone sandbox. Defense-in-depth — even if our own
zip-extract.mjs ever has a bypass, the kernel refuses any write outside
the per-scan temp directory.
New files:
- scanners/lib/vsix-fetch-worker.mjs — sub-process worker. Argv: --url
--tmpdir; emits one JSON line on stdout (ok/sha256/size/source/extRoot
or ok:false/error/code). Silent on stderr. Exit 0/1.
- scanners/lib/vsix-sandbox.mjs — wrapper. Exports buildSandboxProfile,
buildBwrapArgs, buildSandboxedWorker, runVsixWorker. 35s timeout, 1 MB
stdout cap.
Changes:
- scanners/ide-extension-scanner.mjs: fetchAndExtractVsixUrl is now
sandbox-aware (useSandbox option, default true). In-process logic
preserved as fallback. New meta.source.sandbox field:
'sandbox-exec' | 'bwrap' | 'none' | 'in-process'.
- scan(target, { useSandbox }) defaults to true; tests pass false because
globalThis.fetch mocks do not cross process boundaries.
- Windows fallback: in-process with meta.warnings advisory.
Tests:
- 8 new tests in tests/scanners/vsix-sandbox.test.mjs (per-platform
profile generation, worker arg construction, live worker exit
behavior on invalid URLs — no network).
- Existing URL tests updated to opt out of sandbox (useSandbox: false).
- 1344 → 1352 tests, all green.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Pre-installation verification of VS Code extensions via URL — fetch a remote
VSIX, extract it in a hardened sandbox, and run the existing IDE scanner
pipeline against it. No npm dependencies.
Sources:
- VS Code Marketplace (publisher.gallery.vsassets.io direct download)
- OpenVSX (open-vsx.org official API)
- Direct .vsix HTTPS URLs
Defenses:
- HTTPS-only, TLS verified, manual redirect with per-source host whitelist
- 30s total timeout via AbortController
- 50MB compressed cap, 500MB uncompressed, 100x expansion ratio
- Zero-dep ZIP extractor: zip-slip, absolute paths, drive letters, NUL bytes,
symlinks (Unix mode 0xA000), depth limits, ZIP64 rejected, encrypted rejected
- SHA-256 streamed during fetch, surfaced in meta.source
- Temp dir cleanup in all paths (try/finally)
Files:
- scanners/lib/vsix-fetch.mjs (HTTPS fetcher, host whitelist, streaming SHA-256)
- scanners/lib/zip-extract.mjs (zero-dep parser with hardening caps)
- knowledge/marketplace-api-notes.md (endpoint reference)
- 3 test files (48 tests added: vsix-fetch, zip-extract, ide-extension-url)
Tests: 1296 → 1344 (all green).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Add threshold-based exit codes (--fail-on <severity>) and compact
output mode (--compact) to scan-orchestrator and CLI. Pipeline
templates for GitHub Actions, Azure DevOps, GitLab CI with SARIF
upload. CI/CD guide with Schrems II/NSM compliance documentation.
npm publish preparation (files whitelist, .npmignore). Policy ci
section for distributable CI defaults. Version 6.1.0.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Harden git clone attack surface for remote scans with defense-in-depth:
Layer 1 (all platforms): 8 git config flags disable hooks, symlinks,
filter/smudge drivers, fsmonitor, local file protocol. 4 env vars
isolate from system/user git config and block interactive prompts.
Layer 2 (OS sandbox): macOS sandbox-exec and Linux bubblewrap (bwrap)
restrict file writes to only the specific temp directory. bwrap
probe-tests availability before use. Graceful fallback on Windows
and Ubuntu 24.04+ (git config hardening only).
Additional: post-clone 100MB size check, UUID-unique evidence filenames,
evidence file cleanup, cleanup guarantee in scan/plugin-audit commands.
32 new tests (1147 total).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>