ktg-plugin-marketplace/plugins/llm-security/commands/red-team.md
Kjell Tore Guttormsen 03b8885b6e chore(llm-security): v7.7.2 — language consistency pass
~/.claude/CLAUDE.md specifies English for code and documentation,
Norwegian for dialog only. Norwegian had crept into surface text
across v7.5-v7.7. Translated to English in eight surfaces.

No scanner, hook, or behavior changes — purely surface text.

- 18 skill commands: the HTML Report-step now reads "HTML report:
  [Open in browser]" instead of "HTML-rapport: [Åpne i nettleser]"
- scripts/lib/report-renderers.mjs: key-stat labels, lede defaults,
  table headers, maturity-ladder descriptions, action-tier labels,
  clean buckets, dry-run/apply copy, and JS comments. Regex
  alternations /^high|^høy/ and /resolution|løsning/i preserved.
- playground/llm-security-playground.html: same renderer changes
  mirrored bit-identical, plus playground-only UI strings (catalog,
  breadcrumb aria-label, theme toggle, builder-modal hint,
  guide-panel "no projects yet", delete confirmation, alert/copy).
  Demo-state fixture content for dft-komplett-demo preserved
  (intentional Norwegian persona).
- agents/skill-scanner-agent.md + agents/mcp-scanner-agent.md:
  Generaliseringsgrense + Parallell Read-strategi sections translated
  to Generalization boundary + Parallel Read strategy.
- README.md: playground architecture prose + Recent versions table
  (v7.5.0 — v7.7.1).
- CLAUDE.md: v7.7.1 highlights translated, new v7.7.2 highlights
  added.
- ../../README.md: llm-security v7.5.0 — v7.7.1 bullets.
- ../../CLAUDE.md: llm-security catalog entry.
- docs/scanner-reference.md: six runnable-examples table cells.
- docs/version-history.md: new v7.7.2 entry. v7.5-v7.7 narrative
  sections left in original language (deferred per operator).
- Version bumped 7.7.1 → 7.7.2 in package.json,
  .claude-plugin/plugin.json, README badge + Recent versions,
  CLAUDE.md header + state, docs/version-history.md, playground
  renderHome hardcoded string, root README + CLAUDE.md llm-security
  entries.

Tests: 1820/1820 green. CLI smoke-test: 18/18 commandIds produce
>138 KB self-contained HTML. Browser-dogfood verified.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-19 06:47:44 +02:00

5.2 KiB

name description allowed-tools model
llm-security:red-team Attack simulation — test hook defenses with crafted payloads Bash, Read sonnet

Red Team — Attack Simulation

Run crafted attack payloads against the plugin's own hooks to verify defenses.

What was requested

The user ran /security red-team to test their hook defenses.

Arguments

Parse $ARGUMENTS for:

  • --category <name> — filter: secrets, destructive, supply-chain, prompt-injection, pathguard, mcp-output, session-trifecta, hybrid, unicode-evasion, bash-evasion, hitl-traps, long-horizon, all
  • --json — raw JSON output
  • --adaptive — mutation-based evasion testing (5 mutation rounds per passing scenario)

Default: all categories, fixed mode.

Steps

  1. Run the attack simulator:
node scanners/attack-simulator.mjs [--category <name>] [--verbose] [--adaptive]

The simulator runs 64 attack scenarios across 12 categories against the plugin's hooks. Each scenario sends a crafted payload and verifies the hook blocks or detects it.

In adaptive mode (--adaptive), for each scenario that passes (attack blocked), the simulator applies 5 mutation rounds:

  1. Homoglyph substitution (Latin chars replaced with Cyrillic lookalikes)
  2. Encoding wrapping (URL-encoded keywords)
  3. Zero-width character injection (ZW chars inserted between keyword letters)
  4. Case alternation (aLtErNaTiNg case)
  5. Synonym substitution (keyword replacement from synonym table)

Bypasses are reported as findings but not auto-fixed.

  1. Present the results as a narrative report:

For each category, explain:

  • What was tested (the attack type)
  • How many attacks were blocked
  • Whether defenses are adequate

If any scenarios fail, explain the gap and what hook needs attention.

In adaptive mode, also explain:

  • How many mutations were tested
  • Which mutations found bypasses
  • That bypasses are expected for synonym and encoding mutations (deterministic hooks cannot catch all evasions)
  1. Defense Score interpretation:
  • 100% — All hooks functioning correctly. No defense gaps.
  • 90-99% — Minor gaps. Review failed scenarios.
  • Below 90% — Significant gaps. Hooks may be misconfigured or missing.

Categories

Category Hook Tested Scenarios
secrets pre-edit-secrets.mjs 7 secret types (AWS, GitHub, PEM, DB, Bearer, Azure, Slack)
destructive pre-bash-destructive.mjs 8 commands (rm -rf, chmod 777, curl|bash, fork bomb, mkfs, dd, eval)
supply-chain pre-install-supply-chain.mjs 4 managers (npm, pip, cargo, gem)
prompt-injection pre-prompt-inject-scan.mjs 6 patterns (override, spoofed headers, identity, evasion)
pathguard pre-write-pathguard.mjs 6 paths (.env, .ssh, .aws, .npmrc, /etc, hooks)
mcp-output post-mcp-verify.mjs 4 threats (injection, secrets, HTML traps, MCP injection)
session-trifecta post-session-guard.mjs 3 patterns (classic trifecta, MCP-concentrated, volume)
hybrid post-mcp-verify.mjs 8 patterns (P2SQL, recursive injection, XSS variants)
unicode-evasion pre-prompt-inject-scan.mjs 6 patterns (Unicode Tags, ZW chars, homoglyphs, BIDI, HTML entities, multi-lang)
bash-evasion pre-bash-destructive.mjs 5 patterns (empty quotes, dollar expansion, backslash splitting, supply chain)
hitl-traps post-mcp-verify.mjs 4 patterns (approval urgency, summary suppression, scope minimization, cognitive load)
long-horizon post-session-guard.mjs 3 patterns (delegation-after-input, sensitive path, MCP-concentrated trifecta)

Mutation Types (Adaptive Mode)

Mutation Technique Expected Bypass Rate
homoglyph Cyrillic/Latin lookalike substitution Low (MEDIUM patterns detect)
encoding URL-encode keywords High (hooks normalize some, not all)
zero_width Insert zero-width chars in keywords Low (normalizer strips these)
case_alternation aLtErNaTiNg case Low (regex uses /i flag)
synonym Replace with semantic equivalents Medium (novel synonyms evade patterns)

Important

  • This tests the plugin's OWN hooks — it does not perform real exploits
  • No network calls, no file modifications, no LLM invocations
  • Safe to run repeatedly — all state is cleaned up after each run
  • Adaptive mode bypasses are expected — they document evasion resistance limits

HTML Report

After producing the markdown red-team narrative report above:

  1. Compute a temp markdown path:

    node -p "require('path').join(require('os').tmpdir(), 'sec-red-team-' + Date.now() + '.md')"
    
  2. Use the Write tool to save the entire markdown report you just produced (per-category narrative + scenario pass/fail + defense score + adaptive-mode bypasses if --adaptive) to that temp path. Verbatim.

  3. Run the renderer:

    node <plugin-root>/scripts/render-report.mjs red-team --in "<temp-md-path>"
    

    The CLI writes reports/red-team-<YYYYMMDD-HHmmss>.html relative to CWD and prints file:///abs/path.html on stdout.

  4. Append to your response (markdown link, no bare URL):

    HTML report: Open in browser

If the CLI exits non-zero, mention the error but do not block — the markdown report above is the primary deliverable.