Kjell Tore Guttormsen 3b034d9266 feat(llm-security): v7.7.0: HTML report for all 18 skill commands

Every /security <cmd> that produces a report now prints a clickable
file:// link to a self-contained HTML version. Delivered over five
sessions; session 5 wires the 14 remaining skill files and releases
v7.7.0 (version bump + docs).

Session history:
- Session 1 (0dc7ff4): playground catalog list view + builder pane with
  copy button on all 18 reports
- Session 2 (86d6ecd): playground project-surface cleanup
  (stub screen + topbar split)
- Session 3 (fa5fb48): extract 18 inline parsers + 18 inline renderers
  from the playground into the canonical ESM module
  scripts/lib/report-renderers.mjs (the playground keeps a bit-identical
  inline copy because ESM import does not work from file://)
- Session 4 (db80854): new zero-dep CLI scripts/render-report.mjs
  (stdin/file/stdout modes, kebab→camel commandId routing, ~140 KB
  self-contained HTML with 6 inlined DS stylesheets + local .report-table,
  absolute file:// paths for Ghostty cmd-click). 4 skills wired:
  scan, audit, posture, deep-scan.
- Session 5 (this one): 14 remaining skills wired: plugin-audit, mcp-audit,
  mcp-inspect, ide-scan, supply-check, dashboard, pre-deploy, diff,
  watch, registry, clean, harden, threat-model, red-team. Each skill file
  now has an HTML Report step that instructs Claude to write the markdown
  verbatim, run the CLI, and append a clickable file:// link to the response.

Release work:
- Version bump v7.6.1 → v7.7.0 in 6 plugin files + 2 root files
  (package.json, .claude-plugin/plugin.json, README badge, CLAUDE.md
  header + state section, docs/version-history.md, plugin "Recent versions"
  table, root README plugin entry, root CLAUDE.md plugin catalog)
- CHANGELOG [7.7.0] with full history from sessions 1-5
- docs/version-history.md v7.7.0 section

Verified:
- 18/18 commandIds in the CLI yield > 138 KB of self-contained HTML
- 1819/1820 tests green (the pre-compact-scan-perf flake fired under load
  and passes in isolation at 1582 ms; pre-existing, deferred to v7.7.x)
- 18/18 skill files have an HTML Report step
- No source-file hits for 7.6.1 outside the historical changelog, version
  history, and README releases table

No scanner or hook behavior changes; purely additive surface.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

2026-05-18 13:12:21 +02:00

5.3 KiB


name	description	allowed-tools	model
llm-security:red-team	Attack simulation — test hook defenses with crafted payloads	Bash, Read	sonnet

Red Team — Attack Simulation

Run crafted attack payloads against the plugin's own hooks to verify defenses.

What was requested

The user ran /security red-team to test their hook defenses.

Arguments

Parse $ARGUMENTS for:

--category <name> — filter: secrets, destructive, supply-chain, prompt-injection, pathguard, mcp-output, session-trifecta, hybrid, unicode-evasion, bash-evasion, hitl-traps, long-horizon, all
--json — raw JSON output
--adaptive — mutation-based evasion testing (5 mutation rounds per passing scenario)

Default: all categories, fixed mode.
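The flag handling described above can be sketched as a small parser. This is a minimal illustration, not part of the plugin: `parseRedTeamArgs` and the `CATEGORIES` constant are hypothetical names, and rejecting an unknown category with an error is an assumption about the desired behavior.

```javascript
// Hypothetical helper: the skill file itself leaves parsing of $ARGUMENTS to the model.
const CATEGORIES = [
  'secrets', 'destructive', 'supply-chain', 'prompt-injection', 'pathguard',
  'mcp-output', 'session-trifecta', 'hybrid', 'unicode-evasion',
  'bash-evasion', 'hitl-traps', 'long-horizon', 'all',
];

function parseRedTeamArgs(argString) {
  const tokens = argString.trim().split(/\s+/).filter(Boolean);
  // Defaults from the skill: all categories, fixed (non-adaptive) mode.
  const opts = { category: 'all', json: false, adaptive: false };
  for (let i = 0; i < tokens.length; i++) {
    const token = tokens[i];
    if (token === '--category') {
      const name = tokens[++i];
      // Assumption: an unknown or missing category name is treated as an error.
      if (!CATEGORIES.includes(name)) {
        throw new Error(`Unknown category: ${name}`);
      }
      opts.category = name;
    } else if (token === '--json') {
      opts.json = true;
    } else if (token === '--adaptive') {
      opts.adaptive = true;
    }
  }
  return opts;
}
```

For example, `parseRedTeamArgs('--category secrets --adaptive')` yields `{ category: 'secrets', json: false, adaptive: true }`, and an empty argument string yields the defaults.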

Steps

Run the attack simulator:

node scanners/attack-simulator.mjs [--category <name>] [--verbose] [--adaptive]

The simulator runs 64 attack scenarios across 12 categories against the plugin's hooks. Each scenario sends a crafted payload and verifies the hook blocks or detects it.

In adaptive mode (--adaptive), for each scenario that passes (attack blocked), the simulator applies 5 mutation rounds:

Homoglyph substitution (Latin chars replaced with Cyrillic lookalikes)
Encoding wrapping (URL-encoded keywords)
Zero-width character injection (ZW chars inserted between keyword letters)
Case alternation (aLtErNaTiNg case)
Synonym substitution (keyword replacement from synonym table)

Bypasses are reported as findings but not auto-fixed.
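To make the five mutation rounds concrete, here is a rough sketch of what each transformation might look like. The lookup tables and function names are illustrative assumptions; the simulator's actual tables and implementation are not shown in this file.

```javascript
// Illustrative tables only (assumed); the simulator's real tables may differ.
const HOMOGLYPHS = new Map([
  ['a', '\u0430'], ['c', '\u0441'], ['e', '\u0435'], ['o', '\u043E'], ['p', '\u0440'],
]);
const SYNONYMS = new Map([['ignore', 'disregard'], ['delete', 'erase']]);
const ZERO_WIDTH_SPACE = '\u200B';

// 1. Homoglyph substitution: swap Latin letters for Cyrillic lookalikes.
const homoglyph = (text) =>
  [...text].map((ch) => HOMOGLYPHS.get(ch) ?? ch).join('');

// 2. Encoding wrapping: percent-encode every character (ASCII input assumed).
const urlEncode = (keyword) =>
  [...keyword]
    .map((ch) => '%' + ch.charCodeAt(0).toString(16).toUpperCase().padStart(2, '0'))
    .join('');

// 3. Zero-width injection: place a zero-width space between keyword letters.
const injectZeroWidth = (keyword) => [...keyword].join(ZERO_WIDTH_SPACE);

// 4. Case alternation: lower-case even positions, upper-case odd positions.
const alternateCase = (text) =>
  [...text]
    .map((ch, i) => (i % 2 === 0 ? ch.toLowerCase() : ch.toUpperCase()))
    .join('');

// 5. Synonym substitution: replace whole words found in the synonym table.
const substituteSynonyms = (text) =>
  text.replace(/\b\w+\b/g, (word) => SYNONYMS.get(word.toLowerCase()) ?? word);
```

With these definitions, `alternateCase('alternating')` produces `aLtErNaTiNg` and `urlEncode('rm')` produces `%72%6D`.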

Present the results as a narrative report:

For each category, explain:

What was tested (the attack type)
How many attacks were blocked
Whether defenses are adequate

If any scenarios fail, explain the gap and what hook needs attention.

In adaptive mode, also explain:

How many mutations were tested
Which mutations found bypasses
That bypasses are expected for synonym and encoding mutations (deterministic hooks cannot catch all evasions)

Defense Score interpretation:

100% — All hooks functioning correctly. No defense gaps.
90-99% — Minor gaps. Review failed scenarios.
Below 90% — Significant gaps. Hooks may be misconfigured or missing.
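The interpretation bands can be expressed as a short function. This sketch assumes the Defense Score is the percentage of scenarios blocked, which the skill file implies but does not define; `defenseScore` and `interpretScore` are hypothetical names.

```javascript
// Assumption: score = blocked scenarios / total scenarios, as a percentage.
function defenseScore(blocked, total) {
  if (total <= 0) throw new RangeError('total must be positive');
  return (blocked / total) * 100;
}

// Maps a score to the three bands listed in the skill file.
// Fractional scores between 99 and 100 fall into the "minor gaps" band.
function interpretScore(score) {
  if (score === 100) return 'All hooks functioning correctly. No defense gaps.';
  if (score >= 90) return 'Minor gaps. Review failed scenarios.';
  return 'Significant gaps. Hooks may be misconfigured or missing.';
}
```

With the 64 scenarios mentioned above, a single failure gives 63/64 (about 98.4%), which lands in the "minor gaps" band, while seven failures give 57/64 (about 89.1%), which falls below 90%.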

Category	Hook Tested	Scenarios
secrets	pre-edit-secrets.mjs	7 secret types (AWS, GitHub, PEM, DB, Bearer, Azure, Slack)
destructive	pre-bash-destructive.mjs	8 commands (rm -rf, chmod 777, curl|bash, fork bomb, mkfs, dd, eval)
supply-chain	pre-install-supply-chain.mjs	4 managers (npm, pip, cargo, gem)
prompt-injection	pre-prompt-inject-scan.mjs	6 patterns (override, spoofed headers, identity, evasion)
pathguard	pre-write-pathguard.mjs	6 paths (.env, .ssh, .aws, .npmrc, /etc, hooks)
mcp-output	post-mcp-verify.mjs	4 threats (injection, secrets, HTML traps, MCP injection)
session-trifecta	post-session-guard.mjs	3 patterns (classic trifecta, MCP-concentrated, volume)
hybrid	post-mcp-verify.mjs	8 patterns (P2SQL, recursive injection, XSS variants)
unicode-evasion	pre-prompt-inject-scan.mjs	6 patterns (Unicode Tags, ZW chars, homoglyphs, BIDI, HTML entities, multi-lang)
bash-evasion	pre-bash-destructive.mjs	5 patterns (empty quotes, dollar expansion, backslash splitting, supply chain)
hitl-traps	post-mcp-verify.mjs	4 patterns (approval urgency, summary suppression, scope minimization, cognitive load)
long-horizon	post-session-guard.mjs	3 patterns (delegation-after-input, sensitive path, MCP-concentrated trifecta)

Mutation Types (Adaptive Mode)

Mutation	Technique	Expected Bypass Rate
homoglyph	Cyrillic/Latin lookalike substitution	Low (MEDIUM patterns detect)
encoding	URL-encode keywords	High (hooks normalize some, not all)
zero_width	Insert zero-width chars in keywords	Low (normalizer strips these)
case_alternation	aLtErNaTiNg case	Low (regex uses /i flag)
synonym	Replace with semantic equivalents	Medium (novel synonyms evade patterns)
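The expected bypass rates in the table follow from how a deterministic check typically works. The sketch below is a simplified, assumed model of a hook (one hypothetical pattern, one normalization step), not the plugin's actual code: stripping zero-width characters and matching with the `/i` flag defeats those two mutations, while an encoded or reworded payload slips past.

```javascript
// Zero-width characters commonly used for keyword splitting.
const ZERO_WIDTH = /[\u200B-\u200D\u2060\uFEFF]/g;

// Hypothetical detection pattern; real hooks use their own pattern sets.
const INJECTION_PATTERN = /ignore\s+previous\s+instructions/i;

function isDetected(payload) {
  // Normalization step: remove zero-width characters before matching.
  const normalized = payload.replace(ZERO_WIDTH, '');
  // The /i flag makes the match case-insensitive.
  return INJECTION_PATTERN.test(normalized);
}
```

In this model, `isDetected('ig\u200Bnore previous instructions')` and `isDetected('iGnOrE pReViOuS iNsTrUcTiOnS')` both return `true`, whereas `isDetected('%69gnore previous instructions')` and `isDetected('disregard earlier instructions')` return `false` because nothing decodes the percent-escape or recognizes the synonym.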

Important

This tests the plugin's OWN hooks — it does not perform real exploits
No network calls, no file modifications, no LLM invocations
Safe to run repeatedly — all state is cleaned up after each run
Adaptive mode bypasses are expected — they document evasion resistance limits

HTML Report

After producing the markdown red-team narrative report above:

Compute a temp markdown path:

node -p "require('path').join(require('os').tmpdir(), 'sec-red-team-' + Date.now() + '.md')"

Use the Write tool to save the entire markdown report you just produced (per-category narrative + scenario pass/fail + defense score + adaptive-mode bypasses if --adaptive) to that temp path. Verbatim.
Run the renderer:
```
node <plugin-root>/scripts/render-report.mjs red-team --in "<temp-md-path>"
```
The CLI writes reports/red-team-<YYYYMMDD-HHmmss>.html relative to CWD and prints file:///abs/path.html on stdout.
Append to your response (markdown link, no bare URL):

HTML report: [Open in browser](file:///abs/path.html)

If the CLI exits non-zero, mention the error but do not block — the markdown report above is the primary deliverable.
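The output path pattern `reports/red-team-<YYYYMMDD-HHmmss>.html` from the renderer step can be illustrated as follows. This is a sketch under assumptions: `reportFileName` is a hypothetical helper, and the use of local time rather than UTC is a guess, since the skill file does not say which the CLI uses.

```javascript
// Hypothetical helper showing the <commandId>-<YYYYMMDD-HHmmss>.html pattern.
// Assumption: the timestamp is taken in local time.
function reportFileName(commandId, date = new Date()) {
  const pad = (n) => String(n).padStart(2, '0');
  const day = `${date.getFullYear()}${pad(date.getMonth() + 1)}${pad(date.getDate())}`;
  const time = `${pad(date.getHours())}${pad(date.getMinutes())}${pad(date.getSeconds())}`;
  return `reports/${commandId}-${day}-${time}.html`;
}
```

For a run at 13:12:21 local time on 18 May 2026, `reportFileName('red-team', new Date(2026, 4, 18, 13, 12, 21))` gives `reports/red-team-20260518-131221.html`.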
