The plugin lives in ktg-plugin-marketplace and is distributed via the
Claude Code marketplace mechanism. There is no standalone
open/claude-code-llm-security repo; references to it were aspirational
and never realized.
- package.json: homepage now deep-links to plugins/llm-security/ in the
marketplace; repository.url uses the marketplace repo with directory
field (npm convention for monorepo plugins); bugs.url routes to
marketplace issue tracker.
- CLAUDE.md: "Public Repository" section replaced with "Distribution"
section documenting the marketplace install path.
- CONTRIBUTING.md: issue tracker URL points at marketplace issues with
[llm-security] prefix convention.
- CHANGELOG.md: v7.3.1 entry rewritten to reflect actual change
(URLs corrected to marketplace, not "fixed from one wrong URL to
another wrong URL").
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Replace count-based pass-rate with severity-weighted penalty:
- penalty = sum(count[s] * WEIGHTS[s])
- maxBudget = max(10, findingCount * 4)
- passRate = max(0, 100 - penalty / maxBudget * 100)
A few lows no longer crater an area's grade; a single high or critical
consumes a large fraction of budget. Mirrors the operator intuition that
severity, not count, is the signal.
BREAKING (intentional): scoring semantics differ from v4 for non-clean
configs. Add scoringVersion: 'v5' to the returned struct so consumers
can detect the version. baseline-all-a remains all-A (no critical/high
on that fixture).
Tests: +6 cases for severity weighting; existing "many findings" test
updated to use highs (where v5 still drops the grade as expected).
Promote WEIGHTS const to named export with Object.freeze for downstream
use in scoring.mjs (severity-weighted scoreByArea, F3).
Tests: +2 cases asserting WEIGHTS shape.
No behavior changes. Sets the public stance, tightens documentation, and
removes coherence drift so anyone forking or downloading the plugin gets
a consistent starting point.
Added:
- CONTRIBUTING.md — public fork-and-own guide. Why PRs are not accepted,
how to fork well, what is welcome via issues.
- README "Project scope" section — out-of-scope table naming what is
fork-and-own territory (web dashboard, fleet policy, runtime firewall,
IDE LSP, compliance pack, ticketing, multi-tenancy, ML detectors,
marketplace UI, SSO/SCIM/RBAC) with commercial alternatives.
- package.json: bugs.url, CONTRIBUTING/SECURITY/CHANGELOG in files
whitelist for npm publishing.
Changed:
- SECURITY.md rewritten. Supported-versions table from stale 5.1.x to
current reality (7.3.x active, 7.0-7.2 best-effort, <7.0 EOL).
Best-effort solo response timeline. Scope expanded to bin/.
- Scanner VERSION constants synced to plugin version. Was 6.0.0 in
dashboard-aggregator and posture-scanner.
- package.json repository.url corrected from fromaitochitta/ to open/.
- README "Feedback & contributing" links to CONTRIBUTING.md.
Fixed:
- pre-compact-scan size-cap timing test ceiling raised 500ms -> 1000ms.
Was a flake on Intel Mac and CI under load. Design target unchanged
(<500ms, documented in CLAUDE.md).
Notes:
- First patch on the stabilization line (post-2026-05-01).
- Wave E attack-simulator scenarios deferred indefinitely; coverage
remains at 72.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Step 7 of v2.0 plan. Registers SessionStart, Stop, and statusLine hooks.
Note: statusLine top-level placement in hooks.json is an open assumption
(brief Assumption 1) — verified to be valid JSON syntax; live smoke-test
required to confirm Claude Code loads it from this location vs requiring
settings.json placement.
Step 6 of v2.0 plan. SessionStart hook fires on source: resume or
source: compact, walks up to 3 levels searching for
NEXT-SESSION-*.local.md, injects content via additionalContext, and
archives the file (rename to *.archived.local.md) to prevent stale-load
in later sessions. 9 tests cover sources, multi-level search,
topic-slug variants, archive filtering, malformed payload.
Step 4 of v2.0 plan. statusLine hook reads context_window.used_percentage
from stdin payload and prints display-only hint at 60% / 70%. NEVER runs
git (research/03 — statusLine scripts can be cancelled mid-flight, unsafe
for side effects). 9 tests cover thresholds, null payload, malformed JSON.
Includes hook-helper.mjs copied from llm-security as test infrastructure.
Step 1 of v2.0 plan. Hard cut from commands/ to skills/ per Anthropic
recommendation for new plugins. Frontmatter sets disable-model-invocation:
true and pins model: claude-sonnet-4-6. Docs (README, CLAUDE.md, root
README) deferred to Step 9 per plan.
Batch C release. Closes 12 implementation tasks (E3, E8-E14, 8.4, 8.6,
8.7, 8.10) across four execution waves: A (bash + decoder), B (supply
chain + workflow scanner), C (MCP cumulative drift), D (code quality).
Wave E (9 new attack-simulator scenarios for the new defenses) deferred
to v7.3.1 — defenses are unit-tested per wave; the deferred work adds
attack-simulator regression coverage on top, not the primary safety net.
Tests: 1665+ → 1777 (Wave A-D cumulative, +112).
Version sync targets touched:
- package.json
- .claude-plugin/plugin.json
- CLAUDE.md (header)
- README.md (badge + new release-history row)
- scanners/ide-extension-scanner.mjs (VERSION constant)
- ../../README.md (marketplace root plugin entry)
- CHANGELOG.md (new [7.3.0] section per Keep a Changelog, all 12 task
IDs covered individually under Added/Changed/Documentation/Tests/Notes)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- package.json med node:test runner og scripts (test, simulate), zero deps
- settings.json: fjern vestigial exploration- og agentTeam-blokker (verifisert leset av ingen kode via grep)
- docs/: commit subagent-delegation-audit.md og ultraexecute-v2-observations-from-config-audit-v4.md (begge real arkitektur-notater)
- docs/: arkiver ultra-suite-brief_2.md som _archive- (var paste fra annet plugin-arbeid, irrelevant her)
- tests/helpers/hook-helper.mjs kopiert fra llm-security m/ provenance-kommentar
Forberedelse for Spor 1 (lib/-moduler), Spor 2 (HANDOVER-CONTRACTS + PreCompact-hook), Spor 3 (bug-fixes + CC-features).
Plan: ~/.claude/plans/det-neste-vi-gj-r-eventual-adleman.md
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Extract `/ultra-cc-architect-local` and `/ultra-skill-author-local` plus all 7
supporting agents, the `cc-architect-catalog` skill (13 files), the
`ngram-overlap.mjs` IP-hygiene script, and the skill-factory test fixtures
from `ultraplan-local` v2.4.0 into a new `ultra-cc-architect` plugin v0.1.0.
Why: ultraplan-local had drifted into containing two distinct domains — a
universal planning pipeline (brief → research → plan → execute) and a
Claude-Code-specific architecture phase. Keeping them together forced users
to inherit an unfinished CC-feature catalog (~11 seeds) when they only
wanted the planning pipeline, and locked the catalog and the pipeline into
the same release cadence. The architect was already optional and decoupled
at the code level — only one filesystem touchpoint remained
(auto-discovery of `architecture/overview.md`), which already handles
absence gracefully.
Plugin manifests:
- ultraplan-local: 2.4.0 → 3.0.0 (description + keywords updated)
- ultra-cc-architect: new at 0.1.0 (pre-release; catalog is thin, Fase 2/3
of skill-factory unbuilt, decision-layer empty, fallback list still
needed)
What stays in ultraplan-local: brief/research/plan/execute commands, all
19 planning agents, security hooks, plan auto-discovery of
`architecture/overview.md` (filesystem-level contract, not code-level).
What moved (28 files via git mv, R100 — full history preserved):
- 2 commands, 8 agents, 1 skill catalog (13 files), 2 scripts, 8 fixtures
Documentation updates: plugin CLAUDE.md and README.md for both plugins,
root README.md (added ultra-cc-architect section, updated ultraplan-local
section), root CLAUDE.md (added ultra-cc-architect to repo-struktur),
marketplace.json (registered ultra-cc-architect), ultraplan-local
CHANGELOG.md (v3.0.0 entry with migration guidance).
Test verification: ngram-overlap.test.mjs passes 23/23 from new location.
Memory updated: feedback_no_architect_until_v3.md now points at the new
plugin and reframes the threshold around catalog maturity rather than an
ultraplan-local milestone.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Wave C step C3: closes E14 with the user-facing reset command.
After a legitimate MCP server upgrade the sticky baseline (added in C1)
becomes a stale "what the tool used to say" anchor and every subsequent
post-mcp-verify advisory will re-flag the change. /security mcp-baseline-reset
lets the user acknowledge the upgrade so the next call seeds a fresh
baseline.
New files:
- scanners/mcp-baseline-reset.mjs — small CLI wrapper around clearBaseline /
listBaselines. Modes: --list (read-only), --target <name>, no-args (all).
Outputs JSON summary on stdout. Exit 0 always (idempotent).
- commands/mcp-baseline-reset.md — dispatcher following mcp-inspect.md
shape. Frontmatter: name=security:mcp-baseline-reset, sonnet model,
Read/Bash/AskUserQuestion tools. 4-step body (list -> confirm scope
-> execute -> confirm result).
- tests/scanners/mcp-baseline-reset.test.mjs — 10 CLI tests across
--list, --target, clear-all, idempotency, history preservation, and
bare-positional sugar.
Updated:
- commands/security.md — new row in commands table after mcp-inspect.
- CLAUDE.md — new commands-table row + new v7.3.0 narrative section
describing the baseline schema, cumulative-drift detection, reset
semantics, and the LLM_SECURITY_MCP_CACHE_FILE override.
- Plugin README.md — new MCP-baseline-reset row in commands table,
scanner count 12 standalone -> 13 standalone, new "MCP Description
Drift (E14, v7.3.0)" subsection explaining the sticky baseline,
cumulative threshold, reset semantics, and env-var override.
- Root marketplace README.md — scanner count 22 -> 23 (10 orchestrated +
13 standalone), command count 19 -> 20, test count 1511 -> 1768.
Wave C complete: 1738 -> 1768 tests (+30 across C1/C2/C3). Per plan,
Wave C does NOT bump the plugin version — that lands at the wave-bundle
release. The advisory text in post-mcp-verify already references the
new command path so the user has a ready remediation step.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Wave C step C2: surface the cumulative-drift signal from
checkDescriptionDrift() (added in C1) as a separate MEDIUM advisory
with finding category mcp-cumulative-drift. Independent of the existing
per-update drift advisory — a slow-burn rug-pull that keeps each update
below the 10% per-update threshold but cumulatively drifts >=25% from
the sticky baseline now triggers the new advisory without ever crossing
the per-update bar.
The advisory references /security mcp-baseline-reset (added in C3) so
the user knows how to acknowledge a legitimate MCP server upgrade.
CLAUDE.md updates:
- post-mcp-verify hooks-table row mentions per-update + cumulative drift
- mcp-description-cache lib bullet documents baseline schema, history,
cumulative threshold policy key, and LLM_SECURITY_MCP_CACHE_FILE
override.
Tests: 2 new hook tests using LLM_SECURITY_MCP_CACHE_FILE for cache
isolation. Existing 68 still pass; total 70.
Plugin README and root marketplace README updates land in C3 alongside
the new /security mcp-baseline-reset slash command (combined Wave-C
doc update per plan §"Wave C — Touch" list).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Wave C step C1: extend the MCP description cache schema with a sticky
baseline slot per tool and a rolling history array (last 10 drift events).
Cumulative drift = levenshtein(current, baseline) / max(|current|, |baseline|);
emits a separate signal when ratio >= mcp.cumulative_drift_threshold
(default 0.25). Per-update drift logic and threshold unchanged.
- loadCache(): TTL purge now skips entries with a baseline, preserving
cumulative-drift detection across the 7-day window. v7.2.0 entries
(no history field) are migrated on read by seeding baseline from the
current description and adding an empty history array. Entries with
history but no baseline (post-clearBaseline) are NOT re-seeded.
- checkDescriptionDrift(): when an entry exists with history but no
baseline (i.e. baseline was cleared), the next call re-seeds baseline
from the incoming description so the legitimate next version becomes
the new baseline.
- clearBaseline(toolName?): removes baseline for one tool or all tools.
Preserves description / firstSeen / lastSeen / history.
- listBaselines(): read-only listing for the upcoming reset CLI.
- LLM_SECURITY_MCP_CACHE_FILE env var override for end-to-end testing.
- New policy key mcp.cumulative_drift_threshold (default 0.25).
Tests: 23 new unit tests; existing 10 still pass.
Docs deferred: CLAUDE.md update lands in C3 alongside the new
/security mcp-baseline-reset command. C2 adds the hooks-table footer
note. Combined wave docs match plan §"Wave C — Touch" list.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Closes E11. Three new pieces, plus integration:
1. Re-interpolation detector (Appsmith GHSL-2024-277 stealth pattern).
The scanner now collects env: bindings (key -> source-expression
text) by walking parsed events whose parentChain includes 'env',
then for each `${{ env.<KEY> }}` inside run:, re-injects MEDIUM
if the binding source matches the 23-field blacklist. This
catches the pattern where developers apply env-indirection but
then re-interpolate the env var in run:, which cancels the
mitigation (template substitution happens before shell parsing).
2. Auth-bypass category (Synacktiv 2023 Dependabot spoofing).
Detects `if: ${{ github.actor == 'dependabot[bot]' }}` and
variants. MEDIUM, owasp: 'LLM06' (Excessive Agency). Distinct
from injection — same expression syntax, different threat class.
Recommendation steers users to `github.event.pull_request.user.login`.
3. severity.mjs OWASP map registration. WFL prefix added to all
four maps:
- OWASP_MAP['WFL'] = ['LLM02', 'LLM06']
- OWASP_AGENTIC_MAP['WFL'] = ['ASI04']
- OWASP_SKILLS_MAP['WFL'] = []
- OWASP_MCP_MAP['WFL'] = []
Empty arrays for skills/MCP are explicit, not omitted — keeps
`Object.keys(OWASP_MAP)` symmetric across maps.
4. scan-orchestrator.mjs registration. workflowScan added between
supply-chain and toxic-flow (toxic-flow correlates after primaries).
Verified via integration: orchestrator emits 9 WFL findings on
tests/fixtures/workflows/.
Bug fix: extractTriggers in workflow-yaml-state.mjs was collecting
sub-properties (`branches:`, `types:`) as triggers. Now tracks the
first nested indent level and ignores anything deeper.
Tests:
- 6 new cases in tests/scanners/workflow-scanner.test.mjs:
re-interp TP, no-double-count, auth-bypass TP, auth-bypass FP
(startsWith head_ref is not auth-bypass), OWASP map shape,
orchestrator import + SCANNERS array entry.
- 2 new fixtures: tp-reinterpolation.yml, auth-bypass-dependabot.yml.
- Existing 14 scanner tests + 15 state-machine tests unchanged.
Test count: 1732 -> 1738 (+6). Wave B total: +53 over baseline 1685.
Pre-compact-scan flake unchanged (passes in isolation).
Adds a scope-hopping detector to the npm install gate. When a user
installs `@<scope>/<unscoped>`, the hook now emits a MEDIUM warning
on stderr (exit 0, never blocks) if:
- `<unscoped>` matches a popular npm package (POPULAR_NPM, ~80
names from knowledge/top-packages.json), AND
- `<scope>` is not on NPM_OFFICIAL_SCOPES (built-in 22 entries) or
on policy.json `supply_chain.allowed_scopes`.
Why: an attacker publishing `@evilcorp/lodash` cannot squat the bare
`lodash` name, but they can register an unrelated scope and rely on
typo or copy-paste to trick installs. NPM_OFFICIAL_SCOPES anchors the
known-good scopes (@types, @reduxjs, @nestjs, …) so legitimate
installs stay silent.
Implementation:
- `scanners/lib/supply-chain-data.mjs`: exports POPULAR_NPM,
NPM_OFFICIAL_SCOPES, and `checkScopeHop(name, extraAllowedScopes)` —
pure function, no policy/network dependency, fully unit-testable.
- `knowledge/typosquat-allowlist.json`: mirrors NPM_OFFICIAL_SCOPES as
`npm_official_scopes`. A doc-consistency assertion ensures the two
lists never drift.
- `hooks/scripts/pre-install-supply-chain.mjs`: imports checkScopeHop,
reads `supply_chain.allowed_scopes` from policy, and pushes a
warning before existing compromised/audit checks.
Tests:
- 9 new cases in tests/hooks/pre-install-supply-chain.test.mjs:
TP @evilcorp/lodash, TP @attacker/express, allowlist @types,
allowlist @reduxjs, allowlist @modelcontextprotocol, FP unscoped
name not in top-100, bare unscoped name, policy override, defensive
non-string input, NPM_OFFICIAL_SCOPES <-> typosquat-allowlist.json
consistency.
Adds scanGitAttributes(repoDir) — pure function that parses
.gitattributes after a sandboxed clone and returns the
{filter,diff,merge} driver entries that would run on checkout. The
clone CLI prints each entry as a "MEDIUM" stderr advisory followed by
a recommendation to verify the smudge/clean command before moving the
clone outside the sandbox.
Why: filter drivers execute arbitrary shell during checkout (smudge
runs on read, clean on write). Even with the existing sandboxed clone,
downstream consumers that re-checkout files outside the sandbox can be
exploited. Surfacing the directive list lets the caller decide whether
to proceed.
Out-of-scope: in-line content of the smudge command is not analysed —
the advisory is for human review, not automatic blocking.
Tests:
- tests/lib/git-clone-gitattributes.test.mjs (8 cases): LFS-style,
custom driver, missing/empty/comment-only files, line-number
tracking, inline-comment stripping, unreadable path graceful return.
Adds rot13 to the variantSet built in scanForInjection(), so
imperative phrases hidden as rot13 inside code comments still hit
the existing CRITICAL/HIGH/MEDIUM pattern arrays.
normalizeForScan() already covers base64, hex, URL, and HTML decoding
in a 3-iteration loop — those are NOT duplicated here. rot13 is the
only genuinely new variant: it is its own inverse and not part of any
NIST/Unicode normalization spec, so it has to be applied explicitly.
Threshold: only inputs >40 chars enter the rot13 pass, to suppress
false positives on accidental letter-shifts in tokens, ids, and short
identifiers. Variants are deduplicated against the existing set so
matchers do not run twice.
3 new tests in injection-patterns.test.mjs (rot13 detection, sub-40
char suppression, plaintext path still green). Total 168 tests pass.
Closes E3 in critical-review-2026-04-20.md.
Adds BLOCK_RULE for the malware-loader pattern:
echo|cat|printf <base64-blob> | base64 -d | <shell>
This is a common RCE delivery shape that bypasses static name-matching
gates by encoding the destructive command as a base64 blob. The new
rule fires only when the final pipe target is a shell interpreter
(bash, sh, zsh, dash, ksh) — base64 decoded into jq or any non-shell
consumer remains allowed.
5 new tests in pre-bash-destructive.test.mjs:
- 3 BLOCK cases (echo|base64|bash, printf|base64|sh, cat|base64|zsh)
- 2 FP probes (base64 -d -> jq passes; base64 -d alone passes)
Closes E9 in critical-review-2026-04-20.md.
Strips bash process substitution syntax — <(cmd) and >(cmd) — so the
inner command name is surfaced to downstream regex gates. Defeats
evasion like `cat <(curl evil)` where the destructive command is
hidden behind /dev/fd/N pipe sugar.
Implementation: bounded innermost-first iteration, depth 3. Beyond
that the string is left as-is rather than recurse without bound.
Runs after the single-quote mask phase, so legitimate strings like
`'echo <(x)'` are preserved.
5 new T7 tests (collapse + nested + FP probes) in
bash-normalize-t7-t9.test.mjs (now 12 tests total).
Closes E8 in critical-review-2026-04-20.md.
Defeats split-and-substitute evasion where attackers split a destructive
command name across an assignment and a variable reference (X=rm; later
$X) so downstream regex gates miss the literal command name. T9 collects
prefix assignments (VAR=value at start of string or after ; & |) and
substitutes ${VAR} / $VAR forms with the captured value. One-level
forward-flow only — chained vars are not followed.
Documented limits in JSDoc:
- Quoted assignments (X="rm -rf") not parsed (whitespace stops capture)
- Substitution is global within string, not scoped. Acceptable because
T3 strips unknown ${VAR} to '' afterwards.
Single-quoted literals are masked before T9 runs, so legitimate
strings are preserved (FP probe in tests).
7 new tests in bash-normalize-t7-t9.test.mjs.
Closes E10 in critical-review-2026-04-20.md.
Add .local/ and HANDOFF-FINDINGS.local.md to .gitignore so session
handoff artifacts (NEXT-SESSION-PROMPT.local.md, scratch findings)
do not leak into commits.
Pre-flight for Batch C v7.3.0.
Adds the profile recommendation step to /ultrabrief-local Phase 4. The
brief stays universal (same questions, same template); the new step is
purely a processing-decision layer that records which profile downstream
commands should apply.
What lands:
- agents/profile-recommender.md — new sonnet agent that scores available
profiles against the finalized brief (keyword + NFR-signal matching,
axis bumps, hallucination gate that forbids inventing profile names).
Emits a fenced JSON block with ranked entries.
- templates/ultrabrief-template.md — frontmatter gains
recommended_profile, profile_match, profile_rationale (default values
applied when only `default` is available — true at M1).
- commands/ultrabrief-local.md — Phase 4 gains Step 4h with explicit
branches: short-circuit when only `default` exists; AskUserQuestion
confirmation when top score ≥ 0.7; explicit fallback message when below
threshold; manual selection sub-question on user override. Persists the
three frontmatter fields to brief.md after user confirmation. JSON
parser failure falls back to `default` with `profile_match: fallback`
rather than blocking — silent fallback is the worst outcome, but a
*visible* fallback is acceptable.
- scripts/profile-loader.mjs — adds selectRecommendation(ranked, opts) +
RECOMMENDATION_THRESHOLD=0.7 export. Single source of truth for the
threshold logic so the command spec and the helper agree.
- scripts/profile-loader.test.mjs — 10 new tests for selectRecommendation
(default-only, empty/malformed input, above/below threshold, custom
threshold, max-by-score, missing fields). Total now 36/36.
- README.md / CLAUDE.md / marketplace landing — docs reflect M0 + M1
shipped, M2 + M3 still pending.
In practice nothing changes for users at M1 because only `default` is
available — Step 4h takes the short-circuit path and writes
`profile_match: default-only`. M2 ships the additional profiles that
make the recommender meaningful.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Introduces a profile-loader infrastructure for runtime-instantiable
ultraplan variants (depth × domain × goal axes). M0 ships only the
`default` profile, which mirrors the current hardcoded Phase 5/9 agent
set — so existing flows are unaffected.
What lands:
- profiles/default.yaml — schema v1, lists current 8 exploration agents
+ 2 review agents, captures today's adversarial regime
- scripts/profile-loader.mjs — null-deps Node loader with limited-subset
YAML parser, listProfiles(), loadProfile(), validateProfile() that
cross-checks every referenced agent exists in agents/
- scripts/profile-loader.test.mjs — 26 node:test cases (parser, validation,
loader, integration with built-in default.yaml)
- commands/ultraplan-local.md — Phase 1 gains a "Resolve the profile"
step (--profile flag → brief.recommended_profile → default fallback)
and prints profile + source in the mode report. Phase 5/9 unchanged.
- README.md, CLAUDE.md, marketplace README — documentation of the M0
foundation, the universal-brief design principle, and the M1/M2/M3
milestones to come.
M1 (next) wires profile recommendation into ultrabrief Phase 4. M2
ships the additional built-in profiles (quick, bugfix, feature, refactor,
security-deep, research-heavy) and replaces the hardcoded Phase 5 agent
table with profile-driven selection. M3 adds user-extensible profiles.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The v7.0.0 entropy-scanner rule 18 suppressed every line whose pattern
matched  — regardless of the URL host or what the URL
carried. A markdown image URL pointing at a non-CDN host (or carrying a
secret-shaped token in its query string) would therefore mask a real
high-entropy credential.
Refactor:
* MARKDOWN_IMAGE now captures the full URL (was a host-only prefix
matcher), so rule 18 can inspect host and query.
* MARKDOWN_IMAGE_CDN_HOSTS allowlist constant covers cdn./images./
media./assets./static./*.cdn./*.amazonaws.com/{s3,cloudfront}/
*.cloudflare./*.fastly./*.akamaized./raw.githubusercontent.com/
*.imgix.net/*.cloudinary.com/.
* MARKDOWN_IMAGE_QUERY_SECRET catches secret-shaped query keys
(token, key, secret, password, api_key, access_token, auth) plus
well-known provider prefixes (AKIA, Bearer, sk_live_, ghp_, ghs_,
ghu_, gho_, ghr_, npm_).
* Rule 18 now suppresses iff (host matches CDN allowlist) AND
(query has no secret-shaped token). Anything else falls through
to entropy classification.
+4 tests in tests/scanners/entropy-context.test.mjs (29 → 33).
Existing rule 18 fixture (cdn.example.com, no secret query) still
suppresses, so no regression on the legitimate path.
Refs: Batch B Wave 5 / Step 13 / v7.2.0
critical-review-2026-04-20.md §E18
The v7.0.0 entropy-scanner ran rules 11-13 (GLSL/CSS-in-JS/inline-markup
line-proximity suppressions) for every line regardless of file type. A
polyglot `.ts` file with an embedded fragment-shader template literal
could therefore mask a real high-entropy credential when the credential
literal happened to share a line with a GLSL keyword. Critical-review
B5 documented the false-negative class.
Refactor:
* New `classifyFileContext(absPath, lines)` returns
`'shader-dominant' | 'markup-dominant' | 'code-dominant' | 'mixed'`,
keyed off file extension with a content-density fallback for
code-extension files (≥50% of sampled non-blank lines matching
GLSL/inline-markup → downgrade to `mixed`).
* `isFalsePositive(str, line, absPath, context)` gates rules 11-13
on `context !== 'code-dominant'`. Rules 1-10 and 14-19 still run
unconditionally, so URL/path/test-fixture/ffmpeg/UA/SQL/error-
template suppression behaves identically.
* `scanFileContent` computes `fileContext` once per file and threads
it through every per-string suppression check.
Conservative defaults to keep the regression surface minimal:
* Files with `<5` sampled non-blank lines fall back to `mixed`
(preserves the existing rule-11/12/13 behaviour for the single-
line .js fixtures used by entropy-context.test.mjs).
* Unknown extensions fall back to `mixed`.
* Code-extension files densely populated with shader/markup
content fall back to `mixed`.
Net effect: a `.ts` file with an embedded GLSL block but mostly TS
code on the surrounding lines now surfaces credentials that the
v7.0.0 line-proximity heuristic suppressed. Pure shader/markup
files are unaffected (extension skip / mixed default).
New fixture: tests/fixtures/entropy/polyglot-ts-with-glsl.ts (with
runtime placeholder so it does not commit a high-entropy literal).
+3 tests in tests/scanners/entropy-context.test.mjs (26 → 29).
Existing entropy.test.mjs and entropy-context.test.mjs all remain
green. Full suite 1658 → 1661.
Refs: Batch B Wave 5 / Step 12 / v7.2.0
critical-review-2026-04-20.md §B5
The existing CRITICAL pattern in injection-patterns.mjs only fires when
a comment body contains AGENT/AI/HIDDEN markers. Adversaries can drop
the marker and still hide instructions inside <!-- ... --> for any
agent that reads page source. This generalizes the comment scan: every
comment body is HTML-entity-decoded and run through the full
injection rule set. The existing keyword-restricted pattern still
fires (defense-in-depth).
Emits at the strongest tier with category html-comment-injection.
+3 tests (65 → 68).
Refs: Batch B Wave 4 / Step 11 / v7.2.0
SVG containers carry text that is invisible in the rendered image but
fully parsed by an agent reading the source. <desc>, <title>,
<metadata>, and <foreignObject> are all valid surfaces for adversarial
injection.
Adds a per-element extractor inside the existing HTML-tag gate, gated
on /<svg[\s>]/i so it only fires for actual SVG content. Inner text is
HTML-entity-decoded then run through scanForInjection. Emits at the
strongest tier with category svg-element-injection.
+3 tests (62 → 65).
Refs: Batch B Wave 4 / Step 10 / v7.2.0
Adversarial payloads in markdown link title attributes (rendered as
tooltips, parsed by agents) bypassed the existing HTML-content checks
which gated on `<tag>` presence. Pattern: [text](url "title").
Adds linkTitleRegex extraction to the HTML-content block, runs each
captured title through scanForInjection, emits at the strongest tier
encountered with category markdown-link-title-injection.
+3 tests (62 → 62 in post-mcp-verify.test.mjs file, was 59).
Refs: Batch B Wave 4 / Step 9 / v7.2.0
Two follow-up fixes after E16 + E17 landed:
1. foldHomoglyphs ASCII fast-path
- scanForInjection calls foldHomoglyphs on every scan (raw + normalized).
- Pre-fix: NFKC normalization runs unconditionally, even on pure
ASCII inputs where it's a no-op.
- Result: benchmark.test.mjs timed out at 120s on the full suite.
- Fix: charCodeAt sweep for >=128, short-circuit return s when
all ASCII. NFKC and HOMOGLYPH_MAP iteration only run when
non-ASCII chars are present (the actual attack case).
- Verified: benchmark.test.mjs passes within timeout.
2. Attack-scenario UNI-003 expectation
- Pre-E16: "Homoglyph Cyrillic-Latin mixing" payload triggered only
a MEDIUM "obfuscation present" advisory (exit 0, stdout match
"MEDIUM").
- Post-E16: the same payload is folded to Latin BEFORE pattern
matching, so it now matches CRITICAL "ignore previous instructions"
and blocks (exit 2).
- This is the intended v7.2.0 behavior — not a regression. Updated
expectation: exit_code 2, stdout_match "block". Renamed scenario
to "now blocked via E16 fold, v7.2.0".
Suite: pre-compact-scan flake remains (perf-budget under load,
passes isolated). All other tests green.
Critical-review §4 E17 finding: pre-v7.2.0 the delegation-after-input
advisory fired only within a 5-call window. Attackers who deliberately
waited 6+ calls before delegating bypassed detection. Window was also
hardcoded — operators couldn't tune it for their environment.
Two coordinated changes:
1. LLM_SECURITY_ESCALATION_WINDOW env var (primary window override)
- parseInt(env) || getPolicyValue('trifecta', 'escalation_window', 5)
- Mirrors the established pattern from
LLM_SECURITY_TRIFECTA_MODE et al.
- Setting env=3 narrows; env=8 expands.
2. Secondary 20-call MEDIUM advisory (slow-burn variant)
- DELEGATION_ESCALATION_WINDOW_MEDIUM = 20 (hardcoded — same value
for all operators; tunable in a future patch if needed)
- checkEscalationAfterInput now returns `tier: 'primary'|'secondary'|null`
- formatEscalationWarning emits a different message for secondary —
mentions "slow-burn", references env-var, distinct from the
primary "DeepMind Category 4" framing
Hook reads max(WINDOW_SIZE, secondary+5) entries to cover the wider
window. Existing duplicate-suppression (`escalation_warning` state
entry) covers both tiers. Audit-trail event captures `tier` field.
Tests: +5 cases in tests/hooks/post-session-guard.test.mjs:
- secondary window catches 9-call distance (slow-burn)
- secondary boundary at exactly 20 calls
- primary regression guard (1-call distance)
- env=3 narrows primary (4-call distance becomes secondary)
- env=8 expands primary (7-call distance stays primary)
Updated existing test "does NOT trigger when input_source is >5 calls
ago" — now requires >20 calls (secondary window catches 6-20).
Suite: 1644 → 1672 (+28 from new tests + extended scope). All green.
CLAUDE.md hooks table updated to document both windows and the env var.
Critical-review §4 E16 finding: pre-v7.2.0 homoglyph normalization fired
ONLY for the MEDIUM-advisory "obfuscation present" signal. Pattern
matchers in scanForInjection compared against raw + decoded variants
only — they did NOT compare against a fold-normalized variant. As a
result, "ignоre previous instructions" (Cyrillic о, U+043E) bypassed
the CRITICAL "ignore previous" pattern.
Two coordinated edits:
scanners/lib/string-utils.mjs
- Adds HOMOGLYPH_MAP (frozen) — surgical Cyrillic/Greek → Latin map.
~25 entries focused on injection-vocabulary letters
(a, e, o, c, p, x, y, i, j, s, l, A, E, O, C, P, X, Y, T).
- Adds foldHomoglyphs(s) — pipeline: NFKC → apply HOMOGLYPH_MAP.
NFKC handles Mathematical Alphanumeric (U+1D400 block), fullwidth
Latin (U+FF21 block), ligatures, width variants.
Excluded by design from HOMOGLYPH_MAP:
- Latin Extended (æ, ø, å, é, è, ñ, ü, ö, ä, ç, ß, þ, ð) — legitimate
Norwegian/German/French/Spanish letters. Map them and we false-positive
on every non-English source file.
- Greek letters not visually overlapping (β, γ, δ, ...)
- Cyrillic letters not visually overlapping (б, г, д, ж, ...)
scanners/lib/injection-patterns.mjs
- scanForInjection now builds a 4-variant set: raw, normalized,
folded(raw), folded(normalized). Set deduplication skips redundant
identical variants. Existing dedup-by-label (seenLabels Set) prevents
double-counts when the same pattern matches in multiple variants.
- foldHomoglyphs added to the imports.
Tests: +27 cases in tests/lib/string-utils-homoglyph.test.mjs:
- 6 Cyrillic → Latin (lowercase, uppercase, multiple substitutions,
Palochka U+04CF)
- 3 Greek → Latin
- 2 NFKC normalization (Math Bold, Fullwidth)
- 8 preserves-non-confusable (Norwegian æøå, German umlauts, French
accents, Spanish ñ, emoji, CJK, Arabic/Hebrew)
- 3 edge cases (empty, null/undefined, idempotency)
- 5 scanForInjection integration (Cyrillic ignore, Cyrillic Assistant,
Norwegian non-trigger, benign "ignore" comment, mixed Cyrillic+Greek)
Test-development found: U+1D5DC is "I" not "A" (test pin caught my
codepoint mistake — fixed during dev).
Suite: 1617 → 1644 (+27). All green.