Commit graph

119 commits

Author SHA1 Message Date
92fb0087fa feat(llm-security): add toxic-agent-demo example for TFA scanner [skip-docs]
Single-component lethal-trifecta walkthrough that drives
scanners/toxic-flow-analyzer.mjs against a deliberately
misconfigured fixture plugin. The fixture agent declares
tools: [Bash, Read, WebFetch], which alone covers all three
trifecta legs (input surface + data access + exfil sink). No
hooks/hooks.json is shipped, so TFA's mitigation logic finds
no active guards and emits a CRITICAL "Lethal trifecta:"
finding without downgrade.
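
A minimal sketch of the leg-coverage check described above. The
tool-to-leg mapping and the function name are assumptions made for
illustration; the actual classification inside toxic-flow-analyzer.mjs
is not shown in this log.

```javascript
// Hypothetical mapping: which trifecta leg each tool is assumed to cover.
const LEG_BY_TOOL = new Map([
  ['WebFetch', 'input-surface'],
  ['Read', 'data-access'],
  ['Bash', 'exfil-sink'],
]);
const ALL_LEGS = ['input-surface', 'data-access', 'exfil-sink'];

// Returns whether a single component's tool list covers all three legs.
// `hasHookGuards` stands in for "hooks/hooks.json ships active guards".
function classifyTrifecta(tools, hasHookGuards) {
  const covered = new Set(tools.map((t) => LEG_BY_TOOL.get(t)).filter(Boolean));
  const missing = ALL_LEGS.filter((leg) => !covered.has(leg));
  if (missing.length > 0) return { trifecta: false, missing };
  // With no guards the finding keeps its base severity (no downgrade).
  return { trifecta: true, baseSeverity: 'CRITICAL', downgraded: hasHookGuards, missing: [] };
}
```

With the fixture's tools and no hooks, classifyTrifecta(['Bash', 'Read',
'WebFetch'], false) reports a trifecta that is not downgraded.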

Plugin marker is plugin.fixture.json (recognised by isPlugin())
rather than .claude-plugin/plugin.json — the latter is blocked
by the plugin's own pre-write-pathguard hook, and
plugin.fixture.json exists in isPlugin() specifically so
example fixtures can self-mark without touching guarded paths.

Three independent assertions (3/3 must pass): direct trifecta
present and CRITICAL; finding mentions the exfil-helper
component; description confirms "no hook guards detected"
(proves the mitigation path stayed inactive). expected-findings.md
documents the contract.

OWASP / framework mapping: ASI01, ASI02, ASI05, LLM01, LLM02, LLM06.

Docs updated: plugin README "Other runnable examples", plugin
CLAUDE.md "Examples" table, CHANGELOG [Unreleased] Added.
[skip-docs] is appropriate because examples don't change what
the plugin appears to cover outwardly, so the marketplace root
README is unaffected.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-05 15:15:04 +02:00
ca5a8cec67 feat(llm-security): add 3 more runnable threat examples [skip-docs]
Three new self-contained, runnable threat demonstrations under
examples/, continuing the batch started in 583a78c. Each example
has README.md + run-*.mjs + expected-findings.md and uses
state-isolation discipline so the user's real cache/state files
are never polluted.

- examples/supply-chain-attack/ — two-layer demonstration:
  pre-install-supply-chain (PreToolUse) blocks compromised
  event-stream version 3.3.6 and emits a scope-hop advisory for
  the @evilcorp scope; dep-auditor (DEP scanner, offline) flags
  5 typosquat dependencies plus a curl-piped install-script
  vector in the fixture package.json. Maps to LLM03/LLM05/ASI04.

- examples/poisoned-claude-md/ — all 6 memory-poisoning detectors
  fire on a deliberately poisoned CLAUDE.md plus a fixture
  agent file under .claude/agents (E15/v7.2.0 surface):
  detectInjection, detectShellCommands, detectSuspiciousUrls,
  detectCredentialPaths, detectPermissionExpansion,
  detectEncodedPayloads. No agent runtime needed — scanner
  imported directly. Maps to LLM01/LLM06/ASI04.

- examples/bash-evasion-gallery/ — one disguised variant per
  T1 through T9 evasion technique fed through pre-bash-destructive,
  verified BLOCK after bash-normalize strips the evasion. T8
  base64-pipe-shell uses its own BLOCK_RULE. The canonical
  destructive form uses a path token rather than the bare slash
  (regex word-boundary requires it). Source-string fragmentation
  pattern reused from the e2e attack-chain test. Maps to
  LLM06/ASI01/LLM01.

Plugin README "Other runnable examples" section + plugin
CLAUDE.md "Examples" table + CHANGELOG Unreleased/Added
all updated. Marketplace root README unchanged
([skip-docs] for marketplace-level gate — plugin's outward
coverage is unchanged, only demonstrations were added).
2026-05-05 15:01:20 +02:00
583a78c6cc feat(llm-security): add lethal-trifecta + mcp-rug-pull example contents [skip-docs]
Companion to 8df5d5c (which only carried the doc updates — the example
directories themselves were left out of staging by mistake). This
commit adds the actual example directories:

- examples/lethal-trifecta-walkthrough/{README.md, run-trifecta.mjs,
  expected-findings.md}
- examples/mcp-rug-pull/{README.md, run-rug-pull.mjs,
  expected-findings.md}

Plus plugin CLAUDE.md "Examples (runnable demonstrations)" section
with a 4-row table covering malicious-skill-demo, prompt-injection-
showcase, lethal-trifecta-walkthrough, and mcp-rug-pull plus the
state-isolation discipline notes.

Marketplace root README unchanged since plugin's outward coverage
is unchanged ([skip-docs] covers the marketplace-level gate).
2026-05-05 14:45:39 +02:00
8df5d5c70e feat(llm-security): add lethal-trifecta + mcp-rug-pull examples [skip-docs]
Two new self-contained, runnable threat demonstrations under examples/:

- lethal-trifecta-walkthrough/ — feeds 5 hook calls (WebFetch, Read .env,
  Bash curl POST + suppression follow-ups) into post-session-guard and
  verifies the Rule-of-Two advisory fires exactly on leg 3. State
  isolated via run-script PID so /tmp/llm-security-session-*.jsonl is
  not polluted. Exercises post-session-guard; maps to ASI01/ASI02, LLM01/LLM02.

- mcp-rug-pull/ — mutates an MCP tool description across 8 stages.
  Each per-update change stays below 10% Levenshtein drift; cumulative drift reaches 32.2% by stage
  7 — proves the v7.3.0 (E14) mcp-cumulative-drift MEDIUM advisory
  catches slow-burn rug-pulls that the per-update detection would
  miss. Uses LLM_SECURITY_MCP_CACHE_FILE to isolate cache. Exercises
  post-mcp-verify and mcp-description-cache.mjs; maps to OWASP MCP05/LLM03/ASI04.

Each example: README.md + run-*.mjs + expected-findings.md.
Plugin README "Other runnable examples" section + CHANGELOG
[Unreleased] Added bullets + plugin CLAUDE.md "Examples" section
all updated in this commit. Marketplace root README unchanged
since plugin's outward coverage is unchanged ([skip-docs]
covers the marketplace-level gate).
2026-05-05 14:45:15 +02:00
f835777c1e test(llm-security): add e2e suite proving framework works as coordinated system
Three new files in tests/e2e/ (45 tests, 1777 -> 1822):

- attack-chain.test.mjs (17): full hook stack against attack payloads in
  sequence -- prompt injection at the gate; T1/T5/T8 bash evasions;
  pathguard on .env / .ssh; secrets hook on AWS-shaped keys and PEM
  headers; markdown link-title and HTML-comment poisoning in tool
  output; trifecta accumulation over a single session with dedup on
  the next benign call.

- multi-session.test.mjs (9): state persistence across simulated
  session boundaries. Uses the fact that a hook child's process.ppid
  equals the test runner's process.pid, so writing the session state
  file directly simulates "previous session" history. Covers slow-burn
  trifecta (legs spread >50 calls), MCP cumulative description drift
  via LLM_SECURITY_MCP_CACHE_FILE override, and pre-compact transcript
  poisoning in warn / block / clean / missing-file modes.

- scan-pipeline.test.mjs (19): scan-orchestrator + all 10 scanners +
  toxic-flow correlator against poisoned-project (BLOCK / 95 / Extreme)
  and grade-a-project (WARNING / 48 / High). Asserts envelope shape,
  verdict, risk_score, severity counts, OWASP coverage, scanner
  enumeration, and a narrative-coherence cross-check that the BLOCK
  scan strictly outranks the WARNING scan along every axis.

Test files build credential-shaped payloads at runtime via concatenation
so they contain no literal matches for the pre-edit-secrets regexes
(memory rule feedback_secrets_hook_test_fixtures.md).
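
A small sketch of the runtime-concatenation idea. The helper names and
the key-shape regex are assumptions for illustration, not the fixtures
or the pre-edit-secrets patterns actually used by the suite.

```javascript
// Build credential-shaped strings from fragments so this source file
// never contains a contiguous match for a secrets regex.
function buildAwsShapedKey() {
  const prefix = ['AK', 'IA'].join('');
  const body = 'EXAMPLE' + 'KEY' + '0'.repeat(6); // 16 uppercase/digit chars
  return prefix + body;
}

function buildPemHeader() {
  return '-----BEGIN ' + 'RSA ' + 'PRIVATE KEY-----';
}

// Hypothetical detector shape, used only to show the payload still matches.
const KEY_SHAPE = new RegExp('^' + 'AK' + 'IA' + '[0-9A-Z]{16}$');
console.log(KEY_SHAPE.test(buildAwsShapedKey())); // true
```

The assembled value matches at runtime, while a literal-text scan of
the test file finds nothing to flag.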

Doc updates in same commit per marketplace policy:
- CLAUDE.md header: 1777+ -> 1822+ tests, mentions tests/e2e/
- README.md badge tests-1777 -> tests-1822, body text updated
- CHANGELOG.md: new [Unreleased] Added section describing scope

No version bump. No behavior changes outside tests/.
2026-05-05 12:06:57 +02:00
490d4eddc6 docs: introduce GOVERNANCE.md and unify fork-and-own blurb
Establish a single governance document at marketplace root and copy
it into each of the 9 plugins so every plugin folder remains 100%
self-contained. Replace the inconsistent provocative blurb across
all READMEs with a uniform fork-and-own paragraph that links to
the local GOVERNANCE.md.

[skip-docs]

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-03 14:57:00 +02:00
Kjell Tore Guttormsen
9ea5a2e6c6 chore(privacy): scrub real-org references from plugin internals (phase 2)
Same bulk replacement applied to plugin-internal KB, examples, fixtures,
tests, and docs. Real organization names, persona names, internal system
identifiers, and domain-specific terms replaced with fictional generic
public-sector entity (DDT) and generic terminology.

Scope:
- okr/ — examples, governance, framework, integrations, sources
- ms-ai-architect/ — KB references (engineering, governance, security,
  infrastructure, advisor), tests/fixtures, agents, docs
- linkedin-thought-leadership/ — voice samples, network-builder,
  examples (genericized identifying headlines to "[your organization]")
- llm-security/ — research notes, scan report

Manual genericization beyond bulk replace:
- okr SKILL.md "Primary user / Domain" — generic Norwegian public sector
- linkedin-voice SKILL.md headline placeholder
- network-builder.md headline placeholder
- high-engagement-posts.md voice sample employer line + hashtag

Phase 3 (factual-attribution review) remains: a few KB files attribute
publicly known transport-sector docs/datasets (e.g. håndbok V440, NVDB)
to the fictional DDT after bulk replace. Needs manual semantic review
to either remove or restore correct citation without re-introducing
affiliation references.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-03 04:28:15 +02:00
Kjell Tore Guttormsen
8ca391fdb2 fix(llm-security): correct distribution URLs to marketplace path
The plugin lives in ktg-plugin-marketplace and is distributed via the
Claude Code marketplace mechanism. There is no standalone
open/claude-code-llm-security repo; references to it were aspirational
and never realized.

- package.json: homepage now deep-links to plugins/llm-security/ in the
  marketplace; repository.url uses the marketplace repo with directory
  field (npm convention for monorepo plugins); bugs.url routes to
  marketplace issue tracker.
- CLAUDE.md: "Public Repository" section replaced with "Distribution"
  section documenting the marketplace install path.
- CONTRIBUTING.md: issue tracker URL points at marketplace issues with
  [llm-security] prefix convention.
- CHANGELOG.md: v7.3.1 entry rewritten to reflect actual change
  (URLs corrected to marketplace, not "fixed from one wrong URL to
  another wrong URL").

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-01 06:20:54 +02:00
Kjell Tore Guttormsen
62a9335772 chore(llm-security): v7.3.1 — stabilization patch for forkers and downstream users
No behavior changes. Sets the public stance, tightens documentation, and
removes coherence drift so anyone forking or downloading the plugin gets
a consistent starting point.

Added:
- CONTRIBUTING.md — public fork-and-own guide. Why PRs are not accepted,
  how to fork well, what is welcome via issues.
- README "Project scope" section — out-of-scope table naming what is
  fork-and-own territory (web dashboard, fleet policy, runtime firewall,
  IDE LSP, compliance pack, ticketing, multi-tenancy, ML detectors,
  marketplace UI, SSO/SCIM/RBAC) with commercial alternatives.
- package.json: bugs.url, CONTRIBUTING/SECURITY/CHANGELOG in files
  whitelist for npm publishing.

Changed:
- SECURITY.md rewritten. Supported-versions table from stale 5.1.x to
  current reality (7.3.x active, 7.0-7.2 best-effort, <7.0 EOL).
  Best-effort solo response timeline. Scope expanded to bin/.
- Scanner VERSION constants synced to plugin version. Was 6.0.0 in
  dashboard-aggregator and posture-scanner.
- package.json repository.url corrected from fromaitochitta/ to open/.
- README "Feedback & contributing" links to CONTRIBUTING.md.

Fixed:
- pre-compact-scan size-cap timing test ceiling raised 500ms -> 1000ms.
  Was a flake on Intel Mac and CI under load. Design target unchanged
  (<500ms, documented in CLAUDE.md).

Notes:
- First patch on the stabilization line (post-2026-05-01).
- Wave E attack-simulator scenarios deferred indefinitely; coverage
  remains at 72.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-01 06:14:03 +02:00
Kjell Tore Guttormsen
7219a5fe20 docs(readme): total overhaul for v7.3.0
Rewrites README.md from 919 → 484 lines (47% reduction). Modernized
structure, all counts updated to v7.3.0 reality (commands 19→20,
scanners 22→23, knowledge 19→22, tests 1665→1777), trimmed Version
History to last 3 versions with link to CHANGELOG.md.

Structural changes:
- Removed dated "Prompt Injection Showcase (v5.0)" section
- Removed verbose Directory Structure tree (file paths discoverable
  from CLAUDE.md and the file system itself)
- Collapsed Knowledge Base 18-row table into 5-category summary
- Merged "Architecture" mermaid + "What's inside" into single layered
  overview
- Tightened Compliance & Governance, OWASP Coverage, Workflow Examples
  to essentials only
- Added explicit v7.3.0 sections inline:
  - npm scope-hop typosquat in supply-chain hook (E13)
  - workflow-scanner WFL row in Scanners (E11)
  - .gitattributes post-clone advisory in remote scanning table (E12)
  - MCP cumulative-drift baseline + reset in Output verification + own subsection (E14)
  - rot13 + T7-T9 bash-normalize in Prompt injection + Destructive commands hooks (E3/E8/E9/E10)
  - env-var deprecation runway in Compliance & Governance (8.7)
  - Hook count corrected to 9 throughout (8.10)
- New badges: commands-20, scanners-23, knowledge-22, tests-1777

Content preserved (load-bearing):
- AI-generated disclosure
- "no PRs accepted" framing
- Sandbox defense-in-depth tables
- OWASP coverage matrix
- Defense philosophy section
- Self-scan + malicious-skill-demo references
- Recommended-combo with parry-guard

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-01 05:37:42 +02:00
Kjell Tore Guttormsen
c4183b8b4d chore(release): bump to v7.3.0
Batch C release. Closes 12 implementation tasks (E3, E8-E14, 8.4, 8.6,
8.7, 8.10) across four execution waves: A (bash + decoder), B (supply
chain + workflow scanner), C (MCP cumulative drift), D (code quality).

Wave E (9 new attack-simulator scenarios for the new defenses) deferred
to v7.3.1 — defenses are unit-tested per wave; the deferred work adds
attack-simulator regression coverage on top, not the primary safety net.

Tests: 1665+ → 1777 (Wave A-D cumulative, +112).

Version sync targets touched:
- package.json
- .claude-plugin/plugin.json
- CLAUDE.md (header)
- README.md (badge + new release-history row)
- scanners/ide-extension-scanner.mjs (VERSION constant)
- ../../README.md (marketplace root plugin entry)
- CHANGELOG.md (new [7.3.0] section per Keep a Changelog, all 12 task
  IDs covered individually under Added/Changed/Documentation/Tests/Notes)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-01 05:28:45 +02:00
Kjell Tore Guttormsen
97c5c9d934 docs(claude-md): 8.10 — fix hooks count + add doc-consistency test for hook-table sync 2026-04-30 17:12:49 +02:00
Kjell Tore Guttormsen
ba5f2b64ad feat(policy-loader): 8.7 — env-var deprecation warnings (v8.0.0 removal) 2026-04-30 17:11:07 +02:00
Kjell Tore Guttormsen
e8ea75fe6b docs(hardening-guide): 8.6 — sandbox-architecture rationale (no code consolidation) 2026-04-30 16:55:45 +02:00
Kjell Tore Guttormsen
2b7329151c docs(severity): 8.4 — @deprecated annotation on riskScoreV1 2026-04-30 16:54:37 +02:00
Kjell Tore Guttormsen
001df2ebe8 feat(commands): E14 part 3 — /security mcp-baseline-reset slash command
Wave C step C3: closes E14 with the user-facing reset command.

After a legitimate MCP server upgrade the sticky baseline (added in C1)
becomes a stale "what the tool used to say" anchor and every subsequent
post-mcp-verify advisory will re-flag the change. /security mcp-baseline-reset
lets the user acknowledge the upgrade so the next call seeds a fresh
baseline.

New files:
- scanners/mcp-baseline-reset.mjs — small CLI wrapper around clearBaseline /
  listBaselines. Modes: --list (read-only), --target <name>, no-args (all).
  Outputs JSON summary on stdout. Exit 0 always (idempotent).
- commands/mcp-baseline-reset.md — dispatcher following mcp-inspect.md
  shape. Frontmatter: name=security:mcp-baseline-reset, sonnet model,
  Read/Bash/AskUserQuestion tools. 4-step body (list -> confirm scope
  -> execute -> confirm result).
- tests/scanners/mcp-baseline-reset.test.mjs — 10 CLI tests across
  --list, --target, clear-all, idempotency, history preservation, and
  bare-positional sugar.

Updated:
- commands/security.md — new row in commands table after mcp-inspect.
- CLAUDE.md — new commands-table row + new v7.3.0 narrative section
  describing the baseline schema, cumulative-drift detection, reset
  semantics, and the LLM_SECURITY_MCP_CACHE_FILE override.
- Plugin README.md — new MCP-baseline-reset row in commands table,
  scanner count 12 standalone -> 13 standalone, new "MCP Description
  Drift (E14, v7.3.0)" subsection explaining the sticky baseline,
  cumulative threshold, reset semantics, and env-var override.
- Root marketplace README.md — scanner count 22 -> 23 (10 orchestrated +
  13 standalone), command count 19 -> 20, test count 1511 -> 1768.

Wave C complete: 1738 -> 1768 tests (+30 across C1/C2/C3). Per plan,
Wave C does NOT bump the plugin version — that lands at the wave-bundle
release. The advisory text in post-mcp-verify already references the
new command path so the user has a ready remediation step.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-30 16:49:01 +02:00
Kjell Tore Guttormsen
427b68eca9 feat(post-mcp-verify): E14 part 2 — cumulative-drift MEDIUM advisory [skip-docs]
Wave C step C2: surface the cumulative-drift signal from
checkDescriptionDrift() (added in C1) as a separate MEDIUM advisory
with finding category mcp-cumulative-drift. Independent of the existing
per-update drift advisory — a slow-burn rug-pull that keeps each update
below the 10% per-update threshold but cumulatively drifts >=25% from
the sticky baseline now triggers the new advisory without ever crossing
the per-update bar.

The advisory references /security mcp-baseline-reset (added in C3) so
the user knows how to acknowledge a legitimate MCP server upgrade.

CLAUDE.md updates:
- post-mcp-verify hooks-table row mentions per-update + cumulative drift
- mcp-description-cache lib bullet documents baseline schema, history,
  cumulative threshold policy key, and LLM_SECURITY_MCP_CACHE_FILE
  override.

Tests: 2 new hook tests using LLM_SECURITY_MCP_CACHE_FILE for cache
isolation. Existing 68 still pass; total 70.

Plugin README and root marketplace README updates land in C3 alongside
the new /security mcp-baseline-reset slash command (combined Wave-C
doc update per plan §"Wave C — Touch" list).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-30 16:40:52 +02:00
Kjell Tore Guttormsen
eaac830300 feat(mcp-description-cache): E14 part 1 — baseline + history schema (cumulative drift) [skip-docs]
Wave C step C1: extend the MCP description cache schema with a sticky
baseline slot per tool and a rolling history array (last 10 drift events).
Cumulative drift = levenshtein(current, baseline) / max(|current|, |baseline|);
emits a separate signal when ratio >= mcp.cumulative_drift_threshold
(default 0.25). Per-update drift logic and threshold unchanged.
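
The ratio above can be sketched as follows. This is an illustrative
re-implementation, not the code in mcp-description-cache.mjs; the
function names are assumptions, and only the formula and the 0.25
default come from this commit.

```javascript
// Classic single-row Levenshtein edit distance.
function levenshtein(a, b) {
  const row = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diag = row[0]; // value of d[i-1][j-1]
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j]; // d[i-1][j], saved before overwrite
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      row[j] = Math.min(above + 1, row[j - 1] + 1, diag + cost);
      diag = above;
    }
  }
  return row[b.length];
}

// Cumulative drift = levenshtein(current, baseline) / max(|current|, |baseline|).
function cumulativeDrift(current, baseline, threshold = 0.25) {
  const longest = Math.max(current.length, baseline.length);
  const ratio = longest === 0 ? 0 : levenshtein(current, baseline) / longest;
  return { ratio, flagged: ratio >= threshold };
}
```

Because the comparison is always against the sticky baseline, a series
of small edits accumulates in the ratio even when each individual
update stays under the per-update threshold.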

- loadCache(): TTL purge now skips entries with a baseline, preserving
  cumulative-drift detection across the 7-day window. v7.2.0 entries
  (no history field) are migrated on read by seeding baseline from the
  current description and adding an empty history array. Entries with
  history but no baseline (post-clearBaseline) are NOT re-seeded.
- checkDescriptionDrift(): when an entry exists with history but no
  baseline (i.e. baseline was cleared), the next call re-seeds baseline
  from the incoming description so the legitimate next version becomes
  the new baseline.
- clearBaseline(toolName?): removes baseline for one tool or all tools.
  Preserves description / firstSeen / lastSeen / history.
- listBaselines(): read-only listing for the upcoming reset CLI.
- LLM_SECURITY_MCP_CACHE_FILE env var override for end-to-end testing.
- New policy key mcp.cumulative_drift_threshold (default 0.25).

Tests: 23 new unit tests; existing 10 still pass.

Docs deferred: CLAUDE.md update lands in C3 alongside the new
/security mcp-baseline-reset command. C2 adds the hooks-table footer
note. Combined wave docs match plan §"Wave C — Touch" list.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-30 16:37:33 +02:00
Kjell Tore Guttormsen
ede37219a3 feat(workflow-scanner): E11 part 2 — re-interpolation + auth-bypass + WFL prefix + orchestrator
Closes E11. Three new pieces, plus integration:

1. Re-interpolation detector (Appsmith GHSL-2024-277 stealth pattern).
   The scanner now collects env: bindings (key -> source-expression
   text) by walking parsed events whose parentChain includes 'env',
   then for each `${{ env.<KEY> }}` inside run:, emits MEDIUM
   if the binding source matches the 23-field blacklist. This
   catches the pattern where developers apply env-indirection but
   then re-interpolate the env var in run:, which cancels the
   mitigation (template substitution happens before shell parsing).

2. Auth-bypass category (Synacktiv 2023 Dependabot spoofing).
   Detects `if: ${{ github.actor == 'dependabot[bot]' }}` and
   variants. MEDIUM, owasp: 'LLM06' (Excessive Agency). Distinct
   from injection — same expression syntax, different threat class.
   Recommendation steers users to `github.event.pull_request.user.login`.

3. severity.mjs OWASP map registration. WFL prefix added to all
   four maps:
   - OWASP_MAP['WFL'] = ['LLM02', 'LLM06']
   - OWASP_AGENTIC_MAP['WFL'] = ['ASI04']
   - OWASP_SKILLS_MAP['WFL'] = []
   - OWASP_MCP_MAP['WFL'] = []
   Empty arrays for skills/MCP are explicit, not omitted — keeps
   `Object.keys(OWASP_MAP)` symmetric across maps.

4. scan-orchestrator.mjs registration. workflowScan added between
   supply-chain and toxic-flow (toxic-flow correlates after primaries).
   Verified via integration: orchestrator emits 9 WFL findings on
   tests/fixtures/workflows/.
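
The re-interpolation check from item 1 can be sketched roughly like
this. The function name, the Map-based binding shape, and the one-entry
field list are assumptions for illustration; the scanner's real data
structures are not shown here.

```javascript
// envBindings: Map of env key -> source-expression text collected from env: blocks.
// runText: the shell text of a run: step.
// dangerousFields: stand-in for the 23-field blacklist.
function findReinterpolations(envBindings, runText, dangerousFields) {
  const hits = [];
  const envRef = /\$\{\{\s*env\.([A-Za-z_][A-Za-z0-9_]*)\s*\}\}/g;
  for (const [, key] of runText.matchAll(envRef)) {
    const source = envBindings.get(key) ?? '';
    const field = dangerousFields.find((f) => source.includes(f));
    // Template substitution happens before shell parsing, so the
    // attacker-controlled value lands in the script text again.
    if (field) hits.push({ key, field, severity: 'MEDIUM' });
  }
  return hits;
}
```

A plain shell reference such as "$TITLE" is not matched, which is the
point: only the `${{ env.TITLE }}` form cancels the env-indirection
mitigation.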

Bug fix: extractTriggers in workflow-yaml-state.mjs was collecting
sub-properties (`branches:`, `types:`) as triggers. Now tracks the
first nested indent level and ignores anything deeper.

Tests:
- 6 new cases in tests/scanners/workflow-scanner.test.mjs:
  re-interp TP, no-double-count, auth-bypass TP, auth-bypass FP
  (startsWith head_ref is not auth-bypass), OWASP map shape,
  orchestrator import + SCANNERS array entry.
- 2 new fixtures: tp-reinterpolation.yml, auth-bypass-dependabot.yml.
- Existing 14 scanner tests + 15 state-machine tests unchanged.

Test count: 1732 -> 1738 (+6). Wave B total: +53 over baseline 1685.
Pre-compact-scan flake unchanged (passes in isolation).
2026-04-30 15:57:10 +02:00
Kjell Tore Guttormsen
c31d4b1718 feat(workflow-scanner): E11 part 1 — core file-walk + 23-field blacklist + sink-restriction
Adds a deterministic GitHub Actions / Forgejo Actions injection
scanner. Detects ${{ <dangerous-field> }} interpolations inside
`run:` step blocks under privileged or semi-privileged triggers.
Sink-restricted: `if:` / `with:` / `env:` (block-level) are
evaluated by the runner expression engine, not the shell, so they
are NOT injection sinks and are suppressed at parser level.
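
A rough sketch of the sink restriction, assuming a much simpler
line-based walk than the real state machine. The function name and the
three-entry field list are illustrative stand-ins for the scanner and
its 23-field blacklist; trigger handling is left out.

```javascript
// Stand-in for the blacklist; all three are real GitHub context fields.
const DANGEROUS_FIELDS = ['github.head_ref', 'github.event.comment.body', 'github.event.issue.body'];
const EXPRESSION = /\$\{\{\s*([^}]+?)\s*\}\}/g;

// Reports dangerous interpolations only when they appear in shell text:
// a single-line run: value or the body of a run: | / run: > block scalar.
function findRunInjections(yamlText) {
  const findings = [];
  let runKeyIndent = -1; // column of the run: key while inside a block scalar
  yamlText.split('\n').forEach((line, idx) => {
    const indent = line.search(/\S/);
    if (indent === -1) return; // blank line
    let shellText = null;
    if (runKeyIndent >= 0 && indent > runKeyIndent) {
      shellText = line; // still inside the block scalar
    } else {
      runKeyIndent = -1;
      const m = line.match(/^\s*(?:-\s+)?run:\s*(.*)$/);
      if (m && /^[|>][+-]?\s*$/.test(m[1])) runKeyIndent = line.indexOf('run:');
      else if (m) shellText = m[1];
    }
    if (shellText === null) return; // if: / with: / env: lines end up here
    for (const [, field] of shellText.matchAll(EXPRESSION)) {
      if (DANGEROUS_FIELDS.includes(field)) findings.push({ line: idx + 1, field });
    }
  });
  return findings;
}
```

Expressions under if:, with:, or env: never reach the matching loop, so
they produce no findings in this sketch.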

Why: workflow expression injection is the most prevalent SAST class
on GitHub (CodeQL preview: 800K+ findings across 158K repos). The
graduated severity matrix (HIGH for pull_request_target / discussion
/ workflow_run; MEDIUM for pull_request / workflow_dispatch) is the
community-converged calibration target — uniform HIGH causes alert
fatigue.
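
The graduated matrix can be expressed as a small lookup. The function
name and the "highest trigger wins" rule for workflows with several
triggers are assumptions; only the five trigger-to-severity pairs come
from this commit.

```javascript
const TRIGGER_SEVERITY = new Map([
  ['pull_request_target', 'HIGH'],
  ['discussion', 'HIGH'],
  ['workflow_run', 'HIGH'],
  ['pull_request', 'MEDIUM'],
  ['workflow_dispatch', 'MEDIUM'],
]);

// Assumption: a workflow with several triggers takes the highest severity.
function severityForTriggers(triggers) {
  const levels = triggers.map((t) => TRIGGER_SEVERITY.get(t)).filter(Boolean);
  if (levels.includes('HIGH')) return 'HIGH';
  if (levels.includes('MEDIUM')) return 'MEDIUM';
  return null; // triggers outside the matrix are not rated by this sketch
}
```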

Components:
- scanners/lib/workflow-yaml-state.mjs — line-based YAML state
  machine. Tracks indentation, parent-context stack, and
  `run: |` / `run: >` block-scalar entry/exit. Zero deps.
- scanners/workflow-scanner.mjs — discoverWorkflows() probes
  .github/workflows/ and .forgejo/workflows/ directly (file-discovery
  has no glob include). 23-field blacklist (GHSL 17 + 6 GlueStack-
  class additions). Platform encoded via file path; no schema
  extension to finding(). Forgejo-specific: workflow_run advisory
  emitted to stderr; recommendation text mentions Forgejo's
  server-level token scoping (job-level permissions: is ignored).
- knowledge/workflow-injection-patterns.md — 23-field blacklist,
  trigger taxonomy, severity matrix, Forgejo divergences, NVD CVE
  corpus.

Tests (47 new):
- tests/lib/workflow-yaml-state.test.mjs (15): trigger forms
  (string / inline-list / block-list / block-mapping), single-line
  run, block-scalar | and > tracking, env/with sink-mismatch,
  multi-line, comment stripping, line-number accuracy.
- tests/scanners/workflow-scanner.test.mjs (14): TP head_ref
  pull_request_target, TP discussion.title gluestack pattern,
  TP comment.body pull_request, TP issue.body block-scalar,
  FP if-context, FP env-block, INFO numeric, Forgejo TP, Forgejo
  workflow_run advisory, envelope shape, WFL prefix.
- 9 fixtures in tests/fixtures/workflows/{.github,.forgejo}/workflows/.

Out of scope (B4 / Batch D):
- Re-interpolation detection (env.VAR after env: from blacklisted source)
- github.actor authorization-bypass category
- WFL prefix in severity.mjs OWASP maps + scan-orchestrator
  registration (B4)
- Composite-action input tracing, GITHUB_ENV poisoning (Batch D)

Test count: 1685 → 1732 (+47). Pre-compact-scan flake unchanged
(passes in isolation).
2026-04-30 15:48:48 +02:00
Kjell Tore Guttormsen
ad86f5031a feat(pre-install-supply-chain): E13 — npm scope-hopping MEDIUM advisory with allowlist
Adds a scope-hopping detector to the npm install gate. When a user
installs `@<scope>/<unscoped>`, the hook now emits a MEDIUM warning
on stderr (exit 0, never blocks) if:
  - `<unscoped>` matches a popular npm package (POPULAR_NPM, ~80
    names from knowledge/top-packages.json), AND
  - `<scope>` is not on NPM_OFFICIAL_SCOPES (built-in 22 entries) or
    on policy.json `supply_chain.allowed_scopes`.

Why: an attacker publishing `@evilcorp/lodash` cannot squat the bare
`lodash` name, but they can register an unrelated scope and rely on
typo or copy-paste to trick installs. NPM_OFFICIAL_SCOPES anchors the
known-good scopes (@types, @reduxjs, @nestjs, …) so legitimate
installs stay silent.
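
A minimal sketch of the rule, assuming tiny stand-in lists. The name
checkScopeHopSketch and both Sets are hypothetical; the real
checkScopeHop, POPULAR_NPM, and NPM_OFFICIAL_SCOPES live in
supply-chain-data.mjs and may differ in shape.

```javascript
const POPULAR_SKETCH = new Set(['lodash', 'express', 'react']); // stand-in for POPULAR_NPM
const OFFICIAL_SCOPES_SKETCH = new Set(['@types', '@reduxjs', '@nestjs']); // stand-in

// Returns an advisory object for a suspicious scoped name, else null.
function checkScopeHopSketch(name, extraAllowedScopes = []) {
  if (typeof name !== 'string') return null; // defensive: non-string input
  const m = name.match(/^(@[^/]+)\/([^/@]+)/);
  if (!m) return null; // bare unscoped names are out of scope for this check
  const [, scope, unscoped] = m;
  if (!POPULAR_SKETCH.has(unscoped)) return null;
  if (OFFICIAL_SCOPES_SKETCH.has(scope) || extraAllowedScopes.includes(scope)) return null;
  return { severity: 'MEDIUM', scope, unscoped };
}
```

Passing a policy-provided scope list as the second argument keeps the
function pure, which matches the "no policy/network dependency" note
in the implementation section.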

Implementation:
- `scanners/lib/supply-chain-data.mjs`: exports POPULAR_NPM,
  NPM_OFFICIAL_SCOPES, and `checkScopeHop(name, extraAllowedScopes)` —
  pure function, no policy/network dependency, fully unit-testable.
- `knowledge/typosquat-allowlist.json`: mirrors NPM_OFFICIAL_SCOPES as
  `npm_official_scopes`. A doc-consistency assertion ensures the two
  lists never drift.
- `hooks/scripts/pre-install-supply-chain.mjs`: imports checkScopeHop,
  reads `supply_chain.allowed_scopes` from policy, and pushes a
  warning before existing compromised/audit checks.

Tests:
- 9 new cases in tests/hooks/pre-install-supply-chain.test.mjs:
  TP @evilcorp/lodash, TP @attacker/express, allowlist @types,
  allowlist @reduxjs, allowlist @modelcontextprotocol, FP unscoped
  name not in top-100, bare unscoped name, policy override, defensive
  non-string input, NPM_OFFICIAL_SCOPES <-> typosquat-allowlist.json
  consistency.
2026-04-30 15:38:28 +02:00
Kjell Tore Guttormsen
0f4b0c5f2c feat(git-clone): E12 — .gitattributes filter-driver post-clone advisory
Adds scanGitAttributes(repoDir) — pure function that parses
.gitattributes after a sandboxed clone and returns the
{filter,diff,merge} driver entries that would run on checkout. The
clone CLI prints each entry as a "MEDIUM" stderr advisory followed by
a recommendation to verify the smudge/clean command before moving the
clone outside the sandbox.

Why: filter drivers execute arbitrary shell during checkout (smudge
runs on read, clean on write). Even with the existing sandboxed clone,
downstream consumers that re-checkout files outside the sandbox can be
exploited. Surfacing the directive list lets the caller decide whether
to proceed.

Out-of-scope: in-line content of the smudge command is not analysed —
the advisory is for human review, not automatic blocking.
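
For illustration, a sketch of the parsing step that works on file text
rather than a repo directory. The name scanGitAttributesText and the
returned shape are assumptions, and this version only skips full-line
comments; it does not reproduce the inline-comment stripping covered by
the tests below.

```javascript
// Extract filter/diff/merge driver attributes from .gitattributes text.
function scanGitAttributesText(text) {
  const entries = [];
  text.split(/\r?\n/).forEach((raw, idx) => {
    const line = raw.trim();
    if (line === '' || line.startsWith('#')) return;
    const [pattern, ...attrs] = line.split(/\s+/);
    for (const attr of attrs) {
      const m = attr.match(/^(filter|diff|merge)=(.+)$/);
      if (m) entries.push({ line: idx + 1, pattern, kind: m[1], driver: m[2] });
    }
  });
  return entries;
}
```

Each returned entry names a driver whose configured command should be
reviewed by a human before the clone leaves the sandbox.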

Tests:
- tests/lib/git-clone-gitattributes.test.mjs (8 cases): LFS-style,
  custom driver, missing/empty/comment-only files, line-number
  tracking, inline-comment stripping, unreadable path graceful return.
2026-04-30 15:29:13 +02:00
Kjell Tore Guttormsen
950e4e4bce feat(injection): E3 — rot13 layer for comment-block injection
Adds rot13 to the variantSet built in scanForInjection(), so
imperative phrases hidden as rot13 inside code comments still hit
the existing CRITICAL/HIGH/MEDIUM pattern arrays.

normalizeForScan() already covers base64, hex, URL, and HTML decoding
in a 3-iteration loop — those are NOT duplicated here. rot13 is the
only genuinely new variant: it is its own inverse and not part of any
NIST/Unicode normalization spec, so it has to be applied explicitly.

Threshold: only inputs >40 chars enter the rot13 pass, to suppress
false positives on accidental letter-shifts in tokens, ids, and short
identifiers. Variants are deduplicated against the existing set so
matchers do not run twice.
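
A small sketch of the variant step, with hypothetical function names;
the real variantSet in scanForInjection() holds more decoded forms than
the two shown here.

```javascript
// rot13 is its own inverse: applying it twice returns the input.
function rot13(text) {
  return text.replace(/[a-zA-Z]/g, (c) => {
    const base = c <= 'Z' ? 65 : 97;
    return String.fromCharCode(((c.charCodeAt(0) - base + 13) % 26) + base);
  });
}

// Add the rot13 variant only for inputs longer than minLength chars;
// the Set drops it again when it equals an existing variant.
function buildVariants(input, minLength = 40) {
  const variants = new Set([input]);
  if (input.length > minLength) variants.add(rot13(input));
  return [...variants];
}
```

Every returned variant would then be run through the same pattern
arrays, so a rot13-encoded phrase is matched in its decoded form.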

3 new tests in injection-patterns.test.mjs (rot13 detection, sub-40
char suppression, plaintext path still green). Total 168 tests pass.

Closes E3 in critical-review-2026-04-20.md.
2026-04-30 15:21:03 +02:00
Kjell Tore Guttormsen
336e4db1b8 feat(pre-bash-destructive): T8 — base64-pipe-shell idiom (E9)
Adds BLOCK_RULE for the malware-loader pattern:
  echo|cat|printf <base64-blob> | base64 -d | <shell>

This is a common RCE delivery shape that bypasses static name-matching
gates by encoding the destructive command as a base64 blob. The new
rule fires only when the final pipe target is a shell interpreter
(bash, sh, zsh, dash, ksh) — base64 decoded into jq or any non-shell
consumer remains allowed.
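
One way to express the idiom as a regex, offered as a sketch only. The
pattern and the function name are assumptions and are not the
BLOCK_RULE shipped in pre-bash-destructive.

```javascript
// producer | base64 -d | shell, where the last stage must be a shell.
const BASE64_PIPE_SHELL =
  /\b(?:echo|cat|printf)\b[^|]*\|\s*base64\s+(?:-d|--decode)\b[^|]*\|\s*(?:bash|sh|zsh|dash|ksh)\b/;

function isBase64PipeShell(command) {
  return BASE64_PIPE_SHELL.test(command);
}

console.log(isBase64PipeShell('echo ZWNobyBoaQ== | base64 -d | bash')); // true
console.log(isBase64PipeShell('echo e30= | base64 -d | jq .')); // false
```

Anchoring on the final pipe target is what keeps decode-to-jq and bare
decode commands out of the block list.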

5 new tests in pre-bash-destructive.test.mjs:
- 3 BLOCK cases (echo|base64|bash, printf|base64|sh, cat|base64|zsh)
- 2 FP probes (base64 -d -> jq passes; base64 -d alone passes)

Closes E9 in critical-review-2026-04-20.md.
2026-04-30 15:15:29 +02:00
Kjell Tore Guttormsen
761e81309b feat(bash-normalize): T7 — process substitution collapse (E8)
Strips bash process substitution syntax — <(cmd) and >(cmd) — so the
inner command name is surfaced to downstream regex gates. Defeats
evasion like `cat <(curl evil)` where the destructive command is
hidden behind /dev/fd/N pipe sugar.

Implementation: bounded innermost-first iteration, depth 3. Beyond
that the string is left as-is rather than recurse without bound.
Runs after the single-quote mask phase, so legitimate strings like
`'echo <(x)'` are preserved.
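
A simplified sketch of the bounded collapse. The function name and the
choice to replace each substitution with its space-padded inner text
are assumptions, and the single-quote mask phase is not modeled here.

```javascript
// Collapse <(cmd) and >(cmd) innermost-first, at most maxDepth passes.
function collapseProcessSubstitution(command, maxDepth = 3) {
  const innermost = /[<>]\(([^()]*)\)/g; // a substitution with no nested parens
  let out = command;
  for (let pass = 0; pass < maxDepth; pass++) {
    const next = out.replace(innermost, ' $1 ');
    if (next === out) break; // nothing left to collapse
    out = next;
  }
  return out; // deeper nesting is left as-is rather than recursing further
}
```

After the collapse, `cat <(curl evil)` exposes the literal `curl evil`
text to whatever regex gate runs next.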

5 new T7 tests (collapse + nested + FP probes) in
bash-normalize-t7-t9.test.mjs (now 12 tests total).

Closes E8 in critical-review-2026-04-20.md.
2026-04-30 15:14:04 +02:00
Kjell Tore Guttormsen
037b9644f3 feat(bash-normalize): T9 — one-level variable substitution (E10)
Defeats split-and-substitute evasion where attackers split a destructive
command name across an assignment and a variable reference (X=rm; later
$X) so downstream regex gates miss the literal command name. T9 collects
prefix assignments (VAR=value at start of string or after ; & |) and
substitutes ${VAR} / $VAR forms with the captured value. One-level
forward-flow only — chained vars are not followed.
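
A rough sketch of one-level substitution, under the assumption that a
pair of regexes is enough to show the idea. The function name and both
patterns are illustrative and are not taken from bash-normalize.

```javascript
// Collect VAR=value prefix assignments, then substitute $VAR / ${VAR} once.
function substituteOneLevel(command) {
  const assignments = new Map();
  // Assignment at start of string or after ; & | with an unquoted value.
  const assignRe = /(?:^|[;&|])\s*([A-Za-z_][A-Za-z0-9_]*)=([^\s;&|"']+)/g;
  for (const [, name, value] of command.matchAll(assignRe)) assignments.set(name, value);
  const refRe = /\$\{([A-Za-z_][A-Za-z0-9_]*)\}|\$([A-Za-z_][A-Za-z0-9_]*)/g;
  return command.replace(refRe, (whole, braced, bare) => {
    const name = braced ?? bare;
    // Replacement text is not rescanned, so chained variables stay unresolved.
    return assignments.has(name) ? assignments.get(name) : whole;
  });
}

console.log(substituteOneLevel('X=rm; $X -rf /tmp/x')); // X=rm; rm -rf /tmp/x
```

Unknown variables are left untouched by this sketch, and the
substitution applies across the whole string rather than per scope.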

Documented limits in JSDoc:
- Quoted assignments (X="rm -rf") not parsed (whitespace stops capture)
- Substitution is global within string, not scoped. Acceptable because
  T3 strips unknown ${VAR} to '' afterwards.

Single-quoted literals are masked before T9 runs, so legitimate
strings are preserved (FP probe in tests).

7 new tests in bash-normalize-t7-t9.test.mjs.
Closes E10 in critical-review-2026-04-20.md.
2026-04-30 15:12:02 +02:00
Kjell Tore Guttormsen
0a0c1fc412 chore(llm-security): stage ignore patterns for session files
Add .local/ and HANDOFF-FINDINGS.local.md to .gitignore so session
handoff artifacts (NEXT-SESSION-PROMPT.local.md, scratch findings)
do not leak into commits.

Pre-flight for Batch C v7.3.0.
2026-04-30 15:07:35 +02:00
Kjell Tore Guttormsen
3b57dfbf6d chore(release): bump to v7.2.0
Batch B release — closes critical-review B-tier scanner defects
(B3, B5, B6, B7) and the v7.2.0 evasion-arsenal hardening patches
(E1, E4, E5, E7, E15, E16, E17, E18). Tests 1522 → 1665+, attack
simulator 64 → 72 (100 % pass).

Version updates across the 6 sync targets:

  - package.json
  - .claude-plugin/plugin.json
  - CLAUDE.md (header + test count: 1511 → 1665+)
  - README.md (badge + Version History row)
  - scanners/ide-extension-scanner.mjs (VERSION constant)
  - ../../README.md (marketplace root)

CHANGELOG [7.2.0] entry per Keep a Changelog with full Added /
Changed / Documentation / Tests / Notes breakdown.

Refs: Batch B Wave 6 / Step 15
2026-04-29 15:40:15 +02:00
Kjell Tore Guttormsen
8d8d4e7002 feat(red-team): 8 new evasion-arsenal scenarios for v7.2.0 (E1/E4/E5/E7/E16/E17)
Adds attack-simulator coverage for the new defenses landed earlier in
Batch B. All eight scenarios pass against the current hooks (72/72,
zero gaps). E15 (memory-poisoning glob) and E18 (entropy markdown-image
CDN allowlist) are scanner-only and have unit/integration coverage in
their respective scanner test files.

  unicode-evasion (pre-prompt-inject-scan):
    UNI-007  E1  PUA-A range hidden Unicode               → HIGH advisory
    UNI-008  E1  PUA-B range hidden Unicode               → HIGH advisory
    UNI-009  E16 Greek-Latin homoglyph fold               → CRITICAL block

  mcp-output (post-mcp-verify):
    MCP-005  E4  Markdown link-title injection            → markdown-link-title-injection
    MCP-006  E5  SVG <desc> injection                     → svg-element-injection
    MCP-007  E5  SVG <foreignObject> injection            → svg-element-injection
    MCP-008  E7  HTML comment-node injection (no marker)  → html-comment-injection

  session-trifecta (post-session-guard):
    TRI-004  E17 Escalation-after-input (WebFetch → Task) → escalation-after-input advisory

Payload helpers `buildPuaAPayload` / `buildPuaBPayload` shift each
character into Supplementary Private Use Area-A / -B respectively.
The Greek-fold payload uses Greek ι (U+03B9 → i) and ο (U+03BF → o)
so foldHomoglyphs reproduces the canonical "ignore previous
instructions" CRITICAL pattern.

Total: 64 → 72 scenarios.

Refs: Batch B Wave 6 / Step 14 / v7.2.0
2026-04-29 15:35:32 +02:00
Kjell Tore Guttormsen
f0fb7505fb fix(entropy): E18 — rule 18 markdown-image CDN-aware + secret pre-check
The v7.0.0 entropy-scanner rule 18 suppressed every line whose pattern
matched ![…](https?://…) — regardless of the URL host or what the URL
carried. A markdown image URL pointing at a non-CDN host (or carrying a
secret-shaped token in its query string) would therefore mask a real
high-entropy credential.

Refactor:

  * MARKDOWN_IMAGE now captures the full URL (was a host-only prefix
    matcher), so rule 18 can inspect host and query.
  * MARKDOWN_IMAGE_CDN_HOSTS allowlist constant covers cdn./images./
    media./assets./static./*.cdn./*.amazonaws.com/{s3,cloudfront}/
    *.cloudflare./*.fastly./*.akamaized./raw.githubusercontent.com/
    *.imgix.net/*.cloudinary.com/.
  * MARKDOWN_IMAGE_QUERY_SECRET catches secret-shaped query keys
    (token, key, secret, password, api_key, access_token, auth) plus
    well-known provider prefixes (AKIA, Bearer, sk_live_, ghp_, ghs_,
    ghu_, gho_, ghr_, npm_).
  * Rule 18 now suppresses iff (host matches CDN allowlist) AND
    (query has no secret-shaped token). Anything else falls through
    to entropy classification.
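
A rough sketch of the new suppression decision, assuming abbreviated stand-ins for the three constants (the real allowlist and secret pattern are longer than shown here):

```javascript
// Hypothetical, abbreviated versions of the constants named above.
const MARKDOWN_IMAGE = /!\[[^\]]*\]\((https?:\/\/[^\s)]+)\)/;
const CDN_HOST =
  /^(?:cdn|images|media|assets|static)\.|\.cdn\.|^raw\.githubusercontent\.com$|\.(?:imgix\.net|cloudinary\.com)$/;
const QUERY_SECRET =
  /[?&](?:token|key|secret|password|api_key|access_token|auth)=|AKIA|Bearer|sk_live_|gh[psuor]_|npm_/;

// Returns true only when the line may be skipped by the entropy scanner.
function shouldSuppressMarkdownImage(line) {
  const m = MARKDOWN_IMAGE.exec(line);
  if (!m) return false;
  let url;
  try {
    url = new URL(m[1]);
  } catch {
    return false; // unparsable URL: fall through to entropy classification
  }
  return CDN_HOST.test(url.hostname) && !QUERY_SECRET.test(url.search);
}

console.log(shouldSuppressMarkdownImage('![logo](https://cdn.example.com/logo.png)')); // true
```

Both conditions must hold, so an allowlisted host with a secret-shaped query, or a clean query on an unknown host, is still classified.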

+4 tests in tests/scanners/entropy-context.test.mjs (29 → 33).
Existing rule 18 fixture (cdn.example.com, no secret query) still
suppresses, so no regression on the legitimate path.

Refs: Batch B Wave 5 / Step 13 / v7.2.0
critical-review-2026-04-20.md §E18
2026-04-29 15:18:37 +02:00
Kjell Tore Guttormsen
04f1593df3 refactor(entropy): B5 — two-stage context-classified suppression pipeline
The v7.0.0 entropy-scanner ran rules 11-13 (GLSL/CSS-in-JS/inline-markup
line-proximity suppressions) for every line regardless of file type. A
polyglot `.ts` file with an embedded fragment-shader template literal
could therefore mask a real high-entropy credential when the credential
literal happened to share a line with a GLSL keyword. Critical-review
B5 documented the false-negative class.

Refactor:

  * New `classifyFileContext(absPath, lines)` returns
    `'shader-dominant' | 'markup-dominant' | 'code-dominant' | 'mixed'`,
    keyed off file extension with a content-density fallback for
    code-extension files (≥50% of sampled non-blank lines matching
    GLSL/inline-markup → downgrade to `mixed`).

  * `isFalsePositive(str, line, absPath, context)` gates rules 11-13
    on `context !== 'code-dominant'`. Rules 1-10 and 14-19 still run
    unconditionally, so URL/path/test-fixture/ffmpeg/UA/SQL/error-
    template suppression behaves identically.

  * `scanFileContent` computes `fileContext` once per file and threads
    it through every per-string suppression check.

Conservative defaults to keep the regression surface minimal:

  * Files with `<5` sampled non-blank lines fall back to `mixed`
    (preserves the existing rule-11/12/13 behaviour for the single-
    line .js fixtures used by entropy-context.test.mjs).
  * Unknown extensions fall back to `mixed`.
  * Code-extension files densely populated with shader/markup
    content fall back to `mixed`.

Net effect: a `.ts` file with an embedded GLSL block but mostly TS
code on the surrounding lines now surfaces credentials that the
v7.0.0 line-proximity heuristic suppressed. Pure shader/markup
files are unaffected (extension skip / mixed default).
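
The classification rules above could look roughly like this. The extension sets, the sample size of 200, and the crude keyword pattern are all assumptions made for illustration:

```javascript
// Hypothetical sketch; extension lists and the density pattern are assumptions.
const SHADER_EXT = new Set(['.glsl', '.frag', '.vert']);
const MARKUP_EXT = new Set(['.html', '.svg']);
const CODE_EXT = new Set(['.js', '.mjs', '.ts', '.tsx']);
// Deliberately crude: a few GLSL keywords or something that looks like a tag.
const SHADER_OR_MARKUP =
  /\b(?:gl_FragColor|gl_Position|uniform|varying|vec[234]|mat[234])\b|<\/?[a-z][\w-]*[\s>]/;

function classifyFileContext(absPath, lines) {
  const dot = absPath.lastIndexOf('.');
  const ext = dot === -1 ? '' : absPath.slice(dot).toLowerCase();
  if (SHADER_EXT.has(ext)) return 'shader-dominant';
  if (MARKUP_EXT.has(ext)) return 'markup-dominant';
  if (!CODE_EXT.has(ext)) return 'mixed'; // unknown extension

  const sample = lines.filter((l) => l.trim() !== '').slice(0, 200);
  if (sample.length < 5) return 'mixed'; // too little evidence
  const hits = sample.filter((l) => SHADER_OR_MARKUP.test(l)).length;
  return hits / sample.length >= 0.5 ? 'mixed' : 'code-dominant';
}
```

Only a `'code-dominant'` result disables rules 11-13, so every uncertain case keeps the older, more permissive suppression behaviour.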

New fixture: tests/fixtures/entropy/polyglot-ts-with-glsl.ts (with
runtime placeholder so it does not commit a high-entropy literal).

+3 tests in tests/scanners/entropy-context.test.mjs (26 → 29).
Existing entropy.test.mjs and entropy-context.test.mjs all remain
green. Full suite 1658 → 1661.

Refs: Batch B Wave 5 / Step 12 / v7.2.0
critical-review-2026-04-20.md §B5
2026-04-29 15:13:13 +02:00
Kjell Tore Guttormsen
d441abba20 feat(post-mcp-verify): E7 — scan HTML comment nodes for injection
The existing CRITICAL pattern in injection-patterns.mjs only fires when
a comment body contains AGENT/AI/HIDDEN markers. Adversaries can drop
the marker and still hide instructions inside <!-- ... --> for any
agent that reads page source. This generalizes the comment scan: every
comment body is HTML-entity-decoded and run through the full
injection rule set. The existing keyword-restricted pattern still
fires (defense-in-depth).

Emits at the strongest tier with category html-comment-injection.
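
As an illustration, the generalized comment scan might be shaped like this. The scanner is passed in as a parameter because the real `scanForInjection` return shape is not shown here; the `severity` field and the tier ranking are assumptions:

```javascript
// Hypothetical sketch; the finding shape ({ severity }) is an assumption.
const ENTITIES = { '&lt;': '<', '&gt;': '>', '&amp;': '&', '&quot;': '"', '&#39;': "'" };
const TIER_RANK = { medium: 1, high: 2, critical: 3 };

function decodeEntities(s) {
  return s.replace(/&(?:lt|gt|amp|quot|#39);/g, (e) => ENTITIES[e]);
}

// Runs every comment body through the supplied scanner and keeps the strongest tier.
function scanHtmlComments(html, scan) {
  let strongest = null;
  for (const m of html.matchAll(/<!--([\s\S]*?)-->/g)) {
    for (const finding of scan(decodeEntities(m[1]))) {
      if (!strongest || TIER_RANK[finding.severity] > TIER_RANK[strongest.severity]) {
        strongest = finding;
      }
    }
  }
  return strongest
    ? { severity: strongest.severity, category: 'html-comment-injection' }
    : null;
}
```

A toy scanner such as `(t) => /ignore previous instructions/i.test(t) ? [{ severity: 'critical' }] : []` is enough to exercise it.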

+3 tests (65 → 68).

Refs: Batch B Wave 4 / Step 11 / v7.2.0
2026-04-29 15:01:56 +02:00
Kjell Tore Guttormsen
716c8384d9 feat(post-mcp-verify): E5 — scan SVG desc/title/metadata/foreignObject
SVG containers carry text that is invisible in the rendered image but
fully parsed by an agent reading the source. <desc>, <title>,
<metadata>, and <foreignObject> are all valid surfaces for adversarial
injection.

Adds a per-element extractor inside the existing HTML-tag gate, gated
on /<svg[\s>]/i so it only fires for actual SVG content. Inner text is
HTML-entity-decoded then run through scanForInjection. Emits at the
strongest tier with category svg-element-injection.
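
A possible shape for the per-element extractor, shown without the entity decoding and injection scan that follow it. `extractSvgText` and `SVG_TEXT_ELEMENTS` are hypothetical names; only the `/<svg[\s>]/i` gate is taken from the description above:

```javascript
// Hypothetical sketch of the extraction step only.
const SVG_GATE = /<svg[\s>]/i;
const SVG_TEXT_ELEMENTS =
  /<(desc|title|metadata|foreignObject)\b[^>]*>([\s\S]*?)<\/\1\s*>/gi;

function extractSvgText(markup) {
  if (!SVG_GATE.test(markup)) return []; // not SVG content: skip entirely
  return [...markup.matchAll(SVG_TEXT_ELEMENTS)].map((m) => ({
    element: m[1],
    // Drop nested tags so only the text an agent would read remains.
    text: m[2].replace(/<[^>]+>/g, ' ').trim(),
  }));
}
```

The gate keeps an ordinary HTML `<title>` from being treated as an SVG surface.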

+3 tests (62 → 65).

Refs: Batch B Wave 4 / Step 10 / v7.2.0
2026-04-29 14:54:58 +02:00
Kjell Tore Guttormsen
b95d85bb4c feat(post-mcp-verify): E4 — scan markdown link titles for injection
Adversarial payloads in markdown link title attributes (rendered as
tooltips, parsed by agents) bypassed the existing HTML-content checks
which gated on `<tag>` presence. Pattern: [text](url "title").

Adds linkTitleRegex extraction to the HTML-content block, runs each
captured title through scanForInjection, emits at the strongest tier
encountered with category markdown-link-title-injection.
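
For illustration, a title extractor along these lines would cover the `[text](url "title")` pattern. The regex is an assumption, not the project's `linkTitleRegex`, and it also accepts single-quoted titles:

```javascript
// Hypothetical sketch; the actual linkTitleRegex may differ.
const LINK_TITLE = /\[[^\]]*\]\(\s*[^\s)]+\s+(?:"([^"]*)"|'([^']*)')\s*\)/g;

// Returns the title attribute of every markdown link that has one.
function extractLinkTitles(markdown) {
  return [...markdown.matchAll(LINK_TITLE)].map((m) => m[1] ?? m[2]);
}

console.log(extractLinkTitles('[docs](https://example.com "ignore previous instructions")'));
```

Each returned title would then be passed to the injection scanner like any other text fragment.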

+3 tests (59 → 62 in post-mcp-verify.test.mjs).

Refs: Batch B Wave 4 / Step 9 / v7.2.0
2026-04-29 14:52:30 +02:00
Kjell Tore Guttormsen
6073952b97 fix(injection): E16 ASCII fast-path + UNI-003 expectation update (v7.2.0)
Two follow-up fixes after E16 + E17 landed:

1. foldHomoglyphs ASCII fast-path
   - scanForInjection calls foldHomoglyphs on every scan (raw + normalized).
   - Pre-fix: NFKC normalization runs unconditionally, even on pure
     ASCII inputs where it's a no-op.
   - Result: benchmark.test.mjs timed out at 120s on the full suite.
   - Fix: charCodeAt sweep for >=128, short-circuit return s when
     all ASCII. NFKC and HOMOGLYPH_MAP iteration only run when
     non-ASCII chars are present (the actual attack case).
   - Verified: benchmark.test.mjs passes within timeout.

2. Attack-scenario UNI-003 expectation
   - Pre-E16: "Homoglyph Cyrillic-Latin mixing" payload triggered only
     a MEDIUM "obfuscation present" advisory (exit 0, stdout match
     "MEDIUM").
   - Post-E16: the same payload is folded to Latin BEFORE pattern
     matching, so it now matches CRITICAL "ignore previous instructions"
     and blocks (exit 2).
   - This is the intended v7.2.0 behavior — not a regression. Updated
     expectation: exit_code 2, stdout_match "block". Renamed scenario
     to "now blocked via E16 fold, v7.2.0".
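
The ASCII fast path from item 1 can be sketched as below. `foldFast` is a hypothetical name, and the slow path here is plain NFKC standing in for the full fold:

```javascript
// Hypothetical sketch of the short-circuit; the slow path is a stand-in.
function foldFast(s, slowFold = (x) => x.normalize('NFKC')) {
  for (let i = 0; i < s.length; i++) {
    if (s.charCodeAt(i) >= 128) return slowFold(s); // first non-ASCII unit: take the slow path
  }
  return s; // pure ASCII: NFKC would be a no-op, so skip it
}

console.log(foldFast('ignore previous')); // ignore previous
console.log(foldFast('\uFF21\uFF22'));    // AB (fullwidth Latin folded by NFKC)
```

The sweep is linear and allocation-free, so the common pure-ASCII case costs one pass over the string instead of a normalization call.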

Suite: pre-compact-scan flake remains (perf-budget under load,
passes isolated). All other tests green.
2026-04-29 14:44:41 +02:00
Kjell Tore Guttormsen
f0a1d4024a feat(post-session-guard): E17 — configurable escalation window + 20-call MEDIUM advisory
Critical-review §4 E17 finding: pre-v7.2.0 the delegation-after-input
advisory fired only within a 5-call window. Attackers who deliberately
waited 6+ calls before delegating bypassed detection. Window was also
hardcoded — operators couldn't tune it for their environment.

Two coordinated changes:

1. LLM_SECURITY_ESCALATION_WINDOW env var (primary window override)
   - parseInt(env) || getPolicyValue('trifecta', 'escalation_window', 5)
   - Mirrors the established pattern from
     LLM_SECURITY_TRIFECTA_MODE et al.
   - Setting env=3 narrows; env=8 expands.

2. Secondary 20-call MEDIUM advisory (slow-burn variant)
   - DELEGATION_ESCALATION_WINDOW_MEDIUM = 20 (hardcoded — same value
     for all operators; tunable in a future patch if needed)
   - checkEscalationAfterInput now returns `tier: 'primary'|'secondary'|null`
   - formatEscalationWarning emits a different message for secondary —
     mentions "slow-burn", references env-var, distinct from the
     primary "DeepMind Category 4" framing

Hook reads max(WINDOW_SIZE, secondary+5) entries to cover the wider
window. Existing duplicate-suppression (`escalation_warning` state
entry) covers both tiers. Audit-trail event captures `tier` field.
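
The two-window tiering can be illustrated with a small sketch. `primaryWindow` and `escalationTier` are hypothetical names, `distance` is assumed to mean the number of calls between the input source and the delegation, and the policy lookup is reduced to a default argument:

```javascript
// Hypothetical sketch; the real check works on session state entries.
const DELEGATION_ESCALATION_WINDOW_MEDIUM = 20;

function primaryWindow(env = {}, policyDefault = 5) {
  // Unset, non-numeric, or 0 values fall back to the policy default.
  return parseInt(env.LLM_SECURITY_ESCALATION_WINDOW, 10) || policyDefault;
}

function escalationTier(distance, env = {}) {
  if (distance <= primaryWindow(env)) return 'primary';
  if (distance <= DELEGATION_ESCALATION_WINDOW_MEDIUM) return 'secondary';
  return null;
}

console.log(escalationTier(9));                                          // secondary
console.log(escalationTier(4, { LLM_SECURITY_ESCALATION_WINDOW: '3' })); // secondary
console.log(escalationTier(7, { LLM_SECURITY_ESCALATION_WINDOW: '8' })); // primary
```

With the `||` fallback, setting the variable to `0` cannot disable the primary window; it simply restores the default.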

Tests: +5 cases in tests/hooks/post-session-guard.test.mjs:
- secondary window catches 9-call distance (slow-burn)
- secondary boundary at exactly 20 calls
- primary regression guard (1-call distance)
- env=3 narrows primary (4-call distance becomes secondary)
- env=8 expands primary (7-call distance stays primary)

Updated existing test "does NOT trigger when input_source is >5 calls
ago" — now requires >20 calls (secondary window catches 6-20).

Suite: 1644 → 1672 (+28 from new tests + extended scope). All green.

CLAUDE.md hooks table updated to document both windows and the env var.
2026-04-29 14:26:18 +02:00
Kjell Tore Guttormsen
ec4ae268da feat(injection): E16 — homoglyph NFKC fold before every pattern match
Critical-review §4 E16 finding: pre-v7.2.0 homoglyph normalization fired
ONLY for the MEDIUM-advisory "obfuscation present" signal. Pattern
matchers in scanForInjection compared against raw + decoded variants
only — they did NOT compare against a fold-normalized variant. As a
result, "ignоre previous instructions" (Cyrillic о, U+043E) bypassed
the CRITICAL "ignore previous" pattern.

Two coordinated edits:

scanners/lib/string-utils.mjs
- Adds HOMOGLYPH_MAP (frozen) — surgical Cyrillic/Greek → Latin map.
  ~25 entries focused on injection-vocabulary letters
  (a, e, o, c, p, x, y, i, j, s, l, A, E, O, C, P, X, Y, T).
- Adds foldHomoglyphs(s) — pipeline: NFKC → apply HOMOGLYPH_MAP.
  NFKC handles Mathematical Alphanumeric (U+1D400 block), fullwidth
  Latin (U+FF21 block), ligatures, width variants.

Excluded by design from HOMOGLYPH_MAP:
- Latin Extended (æ, ø, å, é, è, ñ, ü, ö, ä, ç, ß, þ, ð) — legitimate
  Norwegian/German/French/Spanish letters. Map them and we false-positive
  on every non-English source file.
- Greek letters not visually overlapping (β, γ, δ, ...)
- Cyrillic letters not visually overlapping (б, г, д, ж, ...)

scanners/lib/injection-patterns.mjs
- scanForInjection now builds a 4-variant set: raw, normalized,
  folded(raw), folded(normalized). Set deduplication skips redundant
  identical variants. Existing dedup-by-label (seenLabels Set) prevents
  double-counts when the same pattern matches in multiple variants.
- foldHomoglyphs added to the imports.
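
A reduced sketch of the fold and the variant set, assuming a seven-entry map in place of the real ~25-entry `HOMOGLYPH_MAP` and a hypothetical `buildVariants` helper:

```javascript
// Hypothetical, abbreviated map; the real HOMOGLYPH_MAP has more entries.
const HOMOGLYPH_MAP = Object.freeze({
  '\u0430': 'a', // Cyrillic a
  '\u0435': 'e', // Cyrillic ie
  '\u043E': 'o', // Cyrillic o
  '\u0440': 'p', // Cyrillic er
  '\u0441': 'c', // Cyrillic es
  '\u03B9': 'i', // Greek iota
  '\u03BF': 'o', // Greek omicron
});

// Pipeline: NFKC first, then the per-character confusable map.
function foldHomoglyphs(s) {
  let out = '';
  for (const ch of s.normalize('NFKC')) out += HOMOGLYPH_MAP[ch] ?? ch;
  return out;
}

// Raw, normalized, and their folded forms; the Set drops identical variants.
function buildVariants(raw, normalized) {
  return [...new Set([raw, normalized, foldHomoglyphs(raw), foldHomoglyphs(normalized)])];
}

console.log(foldHomoglyphs('ign\u043Ere previous instructions')); // ignore previous instructions
```

For pure-ASCII input all four variants collapse to one, so the pattern matchers run once rather than four times.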

Tests: +27 cases in tests/lib/string-utils-homoglyph.test.mjs:
- 6 Cyrillic → Latin (lowercase, uppercase, multiple substitutions,
  Palochka U+04CF)
- 3 Greek → Latin
- 2 NFKC normalization (Math Bold, Fullwidth)
- 8 preserves-non-confusable (Norwegian æøå, German umlauts, French
  accents, Spanish ñ, emoji, CJK, Arabic/Hebrew)
- 3 edge cases (empty, null/undefined, idempotency)
- 5 scanForInjection integration (Cyrillic ignore, Cyrillic Assistant,
  Norwegian non-trigger, benign "ignore" comment, mixed Cyrillic+Greek)

Test-development found: U+1D5DC is "I" not "A" (test pin caught my
codepoint mistake — fixed during dev).

Suite: 1617 → 1644 (+27). All green.
2026-04-29 14:22:05 +02:00
Kjell Tore Guttormsen
6cef80c640 feat(unicode): E1 — extend hidden-Unicode detection to PUA-A and PUA-B
Critical-review §4 E1 finding: pre-v7.2.0 the Unicode-stego detector
(`containsUnicodeTags`) covered only U+E0001-E007F (Tag block). Private
Use Areas — also invisible in most terminals and surviving normalization
— were not detected. Attackers could encode payloads in PUA codepoints
that pass through `scanForInjection` undetected.

Coverage extended to:
- U+E0001-E007F  Unicode Tag block       (existing, DeepMind Category 1)
- U+F0000-FFFFD  Supplementary PUA-A      (NEW — E1)
- U+100000-10FFFD Supplementary PUA-B     (NEW — E1)

Detection-only for PUA: PUA characters have NO standard ASCII mapping,
so `decodeUnicodeTags` leaves them unchanged. Detection alone is
sufficient — `scanForInjection` emits HIGH on any presence, regardless
of decoded content.

Function name `containsUnicodeTags` preserved for back-compat. All
existing call sites (injection-patterns.mjs:259, etc.) work unchanged.
Semantically the function is now "containsHiddenUnicode".
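
The extended coverage amounts to a three-range character class. A minimal sketch, assuming a single regex (`HIDDEN_UNICODE` is a hypothetical name; the function name is the one preserved above):

```javascript
// Sketch of the three ranges listed above, using the `u` flag for astral code points.
const HIDDEN_UNICODE = /[\u{E0001}-\u{E007F}\u{F0000}-\u{FFFFD}\u{100000}-\u{10FFFD}]/u;

function containsUnicodeTags(s) {
  return HIDDEN_UNICODE.test(s);
}

console.log(containsUnicodeTags('a' + String.fromCodePoint(0xF0000) + 'b')); // true
console.log(containsUnicodeTags('hello \u{1F600}'));                         // false
```

The `u` flag matters here: without it the `\u{...}` escapes are not read as code points and the ranges would not cover supplementary-plane characters.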

Tests: +21 cases in tests/lib/string-utils-hidden-unicode.test.mjs:
- 5 Tag-block regression guards
- 4 PUA-A range cases (start, just-inside, end, buried-in-ASCII)
- 3 PUA-B range cases
- 5 boundary cases (gap U+E0080-EFFFF, U+10FFFE noncharacter, emoji,
  CJK, Latin Extended — all must be FALSE)
- 4 decodeUnicodeTags passthrough cases (PUA-A unchanged, PUA-B
  unchanged, Tag block still decodes, mixed Tag+PUA)

Suite: 1596 → 1617 (+21). All green.
2026-04-29 14:18:49 +02:00
Kjell Tore Guttormsen
b0f1a9abfd fix(memory-poisoning): E15 — add .claude/agents/*.md to target glob
Critical-review §4 E15 finding: agent files in .claude/agents/ are loaded
as Claude Code subagent system prompts and are a direct memory-poisoning
surface. Pre-v7.2.0 the scanner covered CLAUDE.md, .claude/rules/*.md,
memory/*.md, REMEMBER.md, .local.md, and .claude-plugin/plugin.json —
but not .claude/agents/*.md.

Single-line addition to MEMORY_FILE_PATTERNS:
  /(?:^|\/)\.claude\/agents\/[^/]+\.md$/

The existing scan loop, scanForInjection integration, and severity-
mapping logic all apply unchanged. STRICT_FILES_PATTERN intentionally
NOT extended — agents may legitimately quote shell commands as examples
(consistent with CLAUDE.md treatment).
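
The added pattern can be checked in isolation. `isAgentFile` is a hypothetical wrapper around the regex quoted above:

```javascript
// The pattern from the commit, wrapped in a hypothetical helper for illustration.
const AGENT_FILE = /(?:^|\/)\.claude\/agents\/[^/]+\.md$/;

function isAgentFile(relPath) {
  return AGENT_FILE.test(relPath);
}

console.log(isAgentFile('.claude/agents/reviewer.md'));     // true
console.log(isAgentFile('repo/.claude/agents/a.md'));       // true
console.log(isAgentFile('.claude/agents/nested/a.md'));     // false (one level only)
```

Because `[^/]+` cannot cross a path separator, agent files in subdirectories of `.claude/agents/` are not matched by this pattern.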

Tests: +3 cases in tests/scanners/memory-poisoning.test.mjs:
- "scans .claude/agents/*.md" (smoke test — at least one finding from
  the new fixture)
- "agent file injection pattern detected"
- "agent file credential path detected"

New fixture: tests/fixtures/memory-scan/poisoned-project/.claude/agents/
poisoned-agent.md — agent with injection, credential ref, permission
expansion, and exfil URL. Triggers all 4 detection categories.

Suite: 1591 → 1594 (+3). All green.
2026-04-29 14:13:01 +02:00
Kjell Tore Guttormsen
5f8f2d3c41 fix(dep): B7 — token-overlap typosquat heuristic alongside Levenshtein
Critical-review §2 B7 finding: pure Levenshtein <=2 misses the most common
modern typosquat pattern — popular-name + token-injection suffix. Examples:
  lodash → lodash-utils    (edit distance 6, not flagged pre-B7)
  react  → react-helper    (edit distance 7, not flagged pre-B7)
  express → express-wrapper (edit distance 8, not flagged pre-B7)

Three coordinated edits:

scanners/lib/string-utils.mjs
- Adds tokenize(name): string[]    splits on -/_, lowercases
- Adds tokenOverlap(a, b): number  intersection.size / min(|a|,|b|)
- Adds TYPOSQUAT_SUSPICIOUS_TOKENS frozen list of common typosquat
  suffixes. Excludes language-extension tokens (js, jsx, ts, tsx) — the
  v7.0.0 allowlist contains `tsx` as a legit package and including the
  same token in the suspicious set creates a contradiction. Caught by
  the new allowlist-intersection-guard test. Also excludes 'pro'
  (legitimate edition marker).

scanners/dep-auditor.mjs + scanners/supply-chain-recheck.mjs
- New checkTyposquatTokenOverlap() helper — fires AFTER Levenshtein 1/2
  branches, only when:
    1. popular package's tokens ⊊ declared name's tokens (declared name is a strict superset)
    2. declared name has at least one suspicious suffix
    3. popular package is in topCutoff window
  All three conditions required — conservative by design. Allowlist
  precedence preserved (existing 22 npm + 13 PyPI entries always pass).
  MEDIUM severity, NOT block. New finding title prefix:
  "Possible typosquatting via token-overlap".

Tests: +21 cases across two new files
- tests/lib/string-utils-tokens.test.mjs (15) — tokenize, tokenOverlap,
  TYPOSQUAT_SUSPICIOUS_TOKENS frozen contract, allowlist-intersection
  guard (caught the tsx conflict on first run)
- tests/scanners/dep-token-overlap.test.mjs (7) — integration via
  in-memory tmpdir fixtures: lodash-utils flagged, react-helper flagged,
  express-wrapper flagged, lodash exact NOT flagged, allowlist tools
  (knip/tsx/nx/rimraf) NOT flagged, react-router-dom (no suspicious
  suffix) NOT flagged, react itself (equal token set, not superset)
  NOT flagged.
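
The helpers and the first two conditions can be sketched as follows. The suspicious-token set is an illustrative three-entry subset inferred from the flagged examples, `tokenOverlap` is assumed to take package names, and the `topCutoff` and allowlist checks are omitted:

```javascript
// Hypothetical sketch; SUSPICIOUS is a small assumed subset of the real list.
const SUSPICIOUS = new Set(['utils', 'helper', 'wrapper']);

function tokenize(name) {
  return name.toLowerCase().split(/[-_]/).filter(Boolean);
}

// Shared tokens divided by the size of the smaller token set.
function tokenOverlap(a, b) {
  const A = new Set(tokenize(a));
  const B = new Set(tokenize(b));
  if (A.size === 0 || B.size === 0) return 0;
  let shared = 0;
  for (const t of A) if (B.has(t)) shared++;
  return shared / Math.min(A.size, B.size);
}

function looksLikeTokenTyposquat(declared, popular) {
  const d = new Set(tokenize(declared));
  const p = new Set(tokenize(popular));
  const strictSuperset = p.size > 0 && d.size > p.size && [...p].every((t) => d.has(t));
  const extras = [...d].filter((t) => !p.has(t));
  return strictSuperset && extras.some((t) => SUSPICIOUS.has(t));
}

console.log(looksLikeTokenTyposquat('lodash-utils', 'lodash'));    // true
console.log(looksLikeTokenTyposquat('react-router-dom', 'react')); // false
```

Requiring both a strict superset and a suspicious extra token is what keeps legitimate scoped names like `react-router-dom` from being flagged.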

Existing dep.test.mjs and supply-chain-recheck.test.mjs unchanged —
all green (149 → 149 regression guard).

Suite: 1570 → 1591 (+21). All green.
2026-04-29 14:10:53 +02:00
Kjell Tore Guttormsen
68b9ea2692 fix(taint-tracer): B6 — recognize destructuring + spread + rest patterns
Critical-review §2 B6 finding: extractAssignedVariable handled
`const X = ...` and `X = ...` but missed every modern JS/TS
destructuring pattern. Sinks downstream of destructured/spread vars
produced false negatives at the propagation step.

Patterns now recognized:
- `const { x } = source`               object destructuring
- `const { x, y } = source`            multi-key
- `const { secret: alias } = source`   renamed (key NOT bound)
- `const { x, ...spread } = source`    object rest
- `const { a, b: { c } } = source`     nested object (key NOT bound)
- `const [a, b] = source`              array destructuring
- `const [first, ...rest] = source`    array rest
- `const [a, [b, c]] = source`         nested array
- `const { user: { id }, ...rest }`    mixed nested

Implementation: regex-based two-pass walker. Pass 1 detects whether
the LHS is a destructuring pattern (`{...}` or `[...]`). If yes, the
new `extractDestructuredNames` helper walks the pattern body via a
balanced-bracket depth counter, recurses into nested patterns, and
distinguishes keys (`key:`) from bindings. If no, the plain-decl
branch matches `\b(?:const|let|var)\s+(\w+)`.

Plain-assignment branch (`X = ...` without keyword) and Python-style
patterns are unchanged.

The function is now exported for direct unit testing — same pattern
as `_resetCacheForTest` in policy-loader. The internal walker
(`extractDestructuredNames`) remains module-private.
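
A simplified sketch of the pattern walker, assuming it receives the bracketed LHS as a string. `splitTopLevel` is a hypothetical helper, and default values on nested patterns or colons inside default expressions are not handled:

```javascript
// Hypothetical sketch; the real walker may differ in structure and coverage.
// Splits on commas at bracket depth 0 only.
function splitTopLevel(body) {
  const parts = [];
  let depth = 0;
  let start = 0;
  for (let i = 0; i < body.length; i++) {
    const ch = body[i];
    if (ch === '{' || ch === '[') depth++;
    else if (ch === '}' || ch === ']') depth--;
    else if (ch === ',' && depth === 0) {
      parts.push(body.slice(start, i));
      start = i + 1;
    }
  }
  parts.push(body.slice(start));
  return parts.map((p) => p.trim()).filter(Boolean);
}

function extractDestructuredNames(pattern) {
  const p = pattern.trim();
  const isObject = p.startsWith('{');
  const names = [];
  for (let part of splitTopLevel(p.slice(1, -1))) {
    if (part.startsWith('...')) part = part.slice(3).trim(); // rest element
    const colon = part.indexOf(':');
    if (isObject && colon !== -1) part = part.slice(colon + 1).trim(); // key is not bound
    if (part.startsWith('{') || part.startsWith('[')) {
      names.push(...extractDestructuredNames(part)); // nested pattern
    } else {
      const name = part.split('=')[0].trim(); // drop a simple default value
      if (/^[A-Za-z_$][\w$]*$/.test(name)) names.push(name);
    }
  }
  return names;
}

console.log(extractDestructuredNames('{ user: { id }, ...rest }')); // [ 'id', 'rest' ]
```

Only binding names are returned, so in `{ secret: alias }` the taint attaches to `alias` and not to the key `secret`.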

Tests: +19 cases in tests/scanners/taint-destructuring.test.mjs:
  - 5 pre-B6 patterns (regression guard: plain decl, plain assign,
    no-match on equality)
  - 12 destructuring patterns covering object/array/rest/nested
  - 2 non-destructuring regressions (return literal, arrow param)

Existing taint-tracer.test.mjs and taint.test.mjs unchanged — both
green (14 → 14, fixture-based integration tests not affected).

Suite: 1551 → 1570 (+19). All green.
2026-04-29 14:05:34 +02:00
Kjell Tore Guttormsen
d3b1157a08 docs(scoring): unify scan/audit/mcp-scanner/posture-assessor to v2 formula
Closes the v7.1.1 out-of-scope item: commands/scan.md:113-114 retained
the v1 formula. Exploration found two more v1 surfaces that v7.1.1
missed: commands/audit.md:46 and agents/mcp-scanner-agent.md:419, plus
agents/posture-assessor-agent.md:376 (caught by the new doc-consistency
test). Four files unified to v2 in one atomic commit.

Three-way → four-way verdict-divergence is now closed:
- scanners/lib/severity.mjs (v2, BLOCK ≥65, WARNING ≥15) — authoritative
- agents/skill-scanner-agent.md (v2 since v7.1.1)
- templates/unified-report.md (v2 since v7.1.1)
- commands/scan.md (v2 — this commit)
- commands/audit.md (v2 — this commit)
- agents/mcp-scanner-agent.md (v2 — this commit)
- agents/posture-assessor-agent.md (v2 — this commit)

New: tests/lib/doc-consistency.test.mjs walks commands/ + agents/ and
asserts NO file contains v1 formula tokens. Pinned regex set:
  - score >= 61, score >= 21, score ≥ 61, score ≥ 21
  - critical * 25, Critical × 25
  - min(100, critical*25 ...)

Plus three v2-cutoff anchors asserting commands/scan.md, commands/audit.md,
and agents/mcp-scanner-agent.md document the v2 BLOCK ≥65 cutoff (or
reference riskScore() helper).
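
The pinned token check could be approximated like this. The two regexes are a condensed assumption covering the listed tokens, not the test file's actual set, and the directory walk is left out:

```javascript
// Hypothetical condensed form of the pinned v1 tokens listed above.
const V1_FORMULA_TOKENS = [
  /score\s*(?:>=|≥)\s*(?:61|21)\b/, // v1 BLOCK / WARNING cutoffs
  /critical\s*[*×]\s*25/i,          // v1 sum-and-cap weight
];

// Returns the source of every v1 pattern found in the given document text.
function findV1Tokens(text) {
  return V1_FORMULA_TOKENS.filter((re) => re.test(text)).map((re) => re.source);
}

console.log(findV1Tokens('BLOCK when score>=65 (see riskScore())').length); // 0
console.log(findV1Tokens('BLOCK if score >= 61').length);                   // 1
```

A consistency test would read each file under commands/ and agents/ and assert that `findV1Tokens` returns an empty array.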

Tests: 1523 → 1551 (+28 from doc-consistency: 25 file walks + 3 anchors).
All green.
2026-04-29 13:58:25 +02:00
Kjell Tore Guttormsen
3cd68dc9fb docs(severity): B3 — document info as scoring-inert (v7.2.0 prep)
Critical-review §2 B3 finding: `riskScore({info: N}) = 0` silently masks
info-volume findings. The behavior was correct (info is scoring-inert by
design) but undocumented. Operators reading a report with N info findings
had no way to know they contribute zero to verdict/band.

Three coordinated edits:
- scanners/lib/severity.mjs JSDoc — explicit "Info severity" subsection
  spelling out: scoring-inert, surfaced in owaspCategorize aggregates,
  treat as observability telemetry not verdict input. @param updated to
  mark info as accepted but ignored.
- CLAUDE.md v7.0.0 risk-score-v2 line — one-sentence anchor pointing to
  severity.mjs JSDoc.
- tests/lib/severity.test.mjs — anchor test alongside the existing
  4-critical=93 anchor: asserts riskScore({info: 50}) === 0,
  riskScore({info: 1000}) === 0, verdict({info: 100}) === 'ALLOW',
  riskBand(riskScore({info: 500})) === 'Low'.

Decision: skip the optional `infoScore()` helper from the brief. No
current consumer would use it; doc-only fix keeps API surface minimal.
Revisit if a consumer emerges.

Tests: 1522 → 1523 (+1 anchor block, 4 assertions). All green.
2026-04-29 13:56:11 +02:00
Kjell Tore Guttormsen
b18cb329ef docs(llm-security): v7.1.1 — narrative coherence patch
Documents the v7.1.1 narrative-coherence patch in CLAUDE.md (mini-block
appended after the v7.0.0 paragraph) and CHANGELOG.md (new [7.1.1]
section per Keep a Changelog convention, placed above [7.1.0]).

Plan: .claude/plans/ultraplan-2026-04-29-report-coherence.md
Brief: .claude/ultraplan-spec-2026-04-29-report-coherence.md

Verification gates passed:
- npm test: 1522/1522 (was 1511; +11 from new narrative test)
- node --test tests/lib/severity.test.mjs: 86/86 (co-monotonicity sweep
  at lines 252-303 unchanged and green)
- node --test tests/scanners/skill-scanner-narrative.test.mjs: 11/11
- Orchestrator against fixture: WARNING / 48 / 1 HIGH (HITL trap caught
  correctly, no whiplash)
- SARIF inline check via toSARIF import: sarif-version 2.1.0, runs: 1
- Zero remaining v1 cutoffs in agent + template

Out of scope but flagged for Batch B (deferred to v7.2.0):
- commands/scan.md:113-114 retains v1 risk formula

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-29 12:57:54 +02:00
Kjell Tore Guttormsen
5cfbc70472 test(llm-security): narrative-coherence contract test (v7.1.1)
11 assertions across 4 describe groups against tests/fixtures/skill-scan/
hyperframes-like/. Tests the deterministic input layer that feeds
skill-scanner-agent — does NOT invoke the LLM (no precedent in 1511 tests).

Coverage:
- content-extractor (5 it): exit 0 on animation markup; exactly 1 HIGH
  HITL trap; >= 2 process.env credential refs; has_injection=true (any
  injection signal flips it); has_critical_injection=false (no CRITICAL
  in fixture).
- entropy scanner (2 it): calibration block present; <= 1 finding (rest
  suppressed via line-context rules).
- co-monotonicity (2 it): {high:1} → WARNING/High; {high:1, info:1} →
  WARNING (info scoring-inert). Inline guard mirrors the sweep at
  tests/lib/severity.test.mjs:252-303 so this file fails fast if the
  invariant drifts.
- agent prompt contract (2 it): static asserts that
  agents/skill-scanner-agent.md contains 'Step 2.5: Context-First
  Severity Assignment', 'summary.narrative_audit.suppressed_findings',
  'score>=65', AND zero remaining 'score >= 61' references; same v2-
  cutoff + narrative-audit contract on templates/unified-report.md.

Part of v7.1.1 narrative-coherence patch.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-29 12:50:27 +02:00
Kjell Tore Guttormsen
3abd7ffeab test(llm-security): hyperframes-like fixture for narrative coherence
Synthetic skill content mimicking the noise profile of frontend
animation projects (HTML5 canvas, framework env-vars, inline SVG data
URIs, CSS keyframes) plus exactly one genuine HITL trap signal.

Used by tests/scanners/skill-scanner-narrative.test.mjs (added in
v7.1.1) to exercise:
- content-extractor: HIGH HITL trap signal + framework env-var
  references (process.env.REACT_APP_*, VITE_PUBLIC_*)
- entropy scanner: inline SVG data URI suppressed via line-context rules

The .llm-security-ignore file uses the SCANNER:glob format
(scanners/scan-orchestrator.mjs:34-40) — ENT:**/*.md suppresses any
entropy-scanner findings when the fixture is run through scan-orchestrator
in the Step 6 smoke test.

Part of v7.1.1 narrative-coherence patch.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-29 12:49:19 +02:00
Kjell Tore Guttormsen
67ffff13a4 fix(llm-security): skill-scanner-agent — context-first severity, v2 alignment, Suppressed Signals section
Five coordinated edits to address scan-report whiplash at the agent
prompt level:

- Step 2.5 (NEW): Context-First Severity Assignment. Every signal has
  exactly one disposition — suppressed (counted only) or reported (full
  finding). The split happens BEFORE severity is assigned. Forbids
  'false positive', 'legitimate framework', 'no action required' in
  finding-body text; reserves them for the Suppressed Signals section.
- Verdict Logic: replaces stale v1 sum-and-cap formula (BLOCK >=61) with
  v2 reference (severity-dominated, BLOCK >=65) matching severity.mjs
  since v7.0.0. Documents that severity counts MUST exclude suppressed
  signals; introduces verdict_rationale field for descriptive context
  when suppressed >= 5 AND reported <= 1 high.
- Output Format: adds Suppressed Signals as required section #4 with
  category-level bullet format. Documents the trailing JSON shape
  including summary.narrative_audit.suppressed_findings.{count,
  by_category} and verdict_rationale fields.
- Comment block before Category 2 suppression rules clarifies that
  'false positive' as taxonomy language is OK; only finding-body
  description fields are forbidden from using the phrase.
- Step 0 (Norwegian generaliseringsgrense) preserved unchanged.

Part of v7.1.1 narrative-coherence patch (plan: .claude/plans/ultraplan-2026-04-29-report-coherence.md).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-29 12:47:58 +02:00
Kjell Tore Guttormsen
899cb5c121 fix(llm-security): template — v1 → v2 risk constants + narrative_audit block
Updates the HTML-comment risk-formula reference at lines 55-66 from the
stale v1 sum-and-cap formula to the v2 severity-dominated tiers that
have been authoritative in scanners/lib/severity.mjs since v7.0.0. Adds
a Narrative Audit block inside the Executive Summary section surfacing
summary.narrative_audit.suppressed_findings.{count,by_category} from
the agent's trailing JSON. The block is transparency only — it does
NOT affect risk_score, riskBand, or verdict.

Part of v7.1.1 narrative-coherence patch (plan: .claude/plans/ultraplan-2026-04-29-report-coherence.md).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-29 12:45:28 +02:00
Kjell Tore Guttormsen
1e555b6833 docs(llm-security): add v7.1.0 row to README version history
The v7.1.0 release commit (621db14) bumped the version badge and added a
CHANGELOG entry, but missed the README Version History table. Adding
the row now so the public-facing version history at
git.fromaitochitta.com/open/ktg-plugin-marketplace reflects v7.1.0.

Row covers: B1 + B2 + B4 fixes, A3 honesty-sweep (7 phrases), B8 CaMeL
tone-down, test count 1487 → 1511, "why" framing tied to critical-review
§F CISO perspective.
2026-04-29 12:03:10 +02:00
Kjell Tore Guttormsen
621db144bd chore(release): bump llm-security to v7.1.0
Closes A4 of v7.1.0 critical-review patch — release artefacts.

- Version bump 7.0.0 → 7.1.0 across active version sources:
  * package.json
  * .claude-plugin/plugin.json
  * CLAUDE.md header
  * README.md badge
  * scanners/ide-extension-scanner.mjs (VERSION constant)
  * marketplace root README plugin entry
- Marketplace root README test count: 1487 → 1511.
- CHANGELOG.md: new [7.1.0] - 2026-04-29 section above [7.0.0],
  documenting B1, B2, B4, B8, honesty-sweep (7 phrases), and
  test-count delta (+24 → 1511 total).
- docs/security-hardening-guide.md: §6 last-updated bump + new
  v7.1.0 calibration note on hook-level fixes (pathguard regex
  hole, distributed-trifecta block-mode bypass).

Historical references to "7.0.0" intentionally preserved in:
- CHANGELOG [7.0.0] entries (history)
- README.md version-history table v5.0.0/v7.0.0 rows (history)
- CLAUDE.md §"v7.0.0 — Severity-dominated risk scoring" (describes
  what changed at v7.0.0 release)
- scanners/ JSDoc comments noting "v7.0.0+" formula provenance
- agents/ + tests/ + knowledge/ provenance comments

Pre-existing untracked/modified tracker noise (.gitignore,
marketplace.json, config-audit/docs, ultraplan-local/docs) is not
part of this commit per the v7.1.0 NEXT-SESSION-PROMPT handoff.

Tests: 1511/1511 green.
2026-04-29 11:57:16 +02:00