Kjell Tore Guttormsen f418a8fe08 feat(llm-security-copilot): port llm-security v5.1.0 to GitHub Copilot CLI

Full port of the llm-security plugin for internal use on Windows with
GitHub Copilot CLI. A protocol translation layer (copilot-hook-runner.mjs)
normalizes Copilot camelCase I/O to the Claude Code snake_case format, so
all original hook scripts run unmodified.

- 8 hooks with protocol translation (stdin/stdout/exit code)
- 18 SKILL.md skills (Agent Skills Open Standard)
- 6 .agent.md agent definitions
- 20 scanners + 14 scanner lib modules (unchanged)
- 14 knowledge files (unchanged)
- 39 test files including copilot-port-verify.mjs (17 tests)
- Windows-ready: node:path, os.tmpdir(), process.execPath, no bash

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-04-09 21:56:10 +02:00

1.9 KiB


name	description
security-red-team	Attack simulation — test hook defenses with crafted payloads across 12 categories

Red Team

Simulates attacks against the hook defenses using crafted payloads: 64 scenarios across 12 categories.

Step 1: Parse Arguments

--category <name> — Run only one category
--verbose — Show individual scenario results
--adaptive — Enable mutation-based evasion testing (5 rounds per passing scenario)
--json — Raw JSON output
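
The flags above could be parsed with Node's built-in `util.parseArgs`. This is a minimal sketch, not the simulator's actual parser: the function name `parseRedTeamArgs` and the returned object shape are assumptions made for illustration.

```javascript
import { parseArgs } from 'node:util';

// Hypothetical helper: flag names come from the list above,
// everything else (function name, return shape) is illustrative.
function parseRedTeamArgs(argv) {
  const { values } = parseArgs({
    args: argv,
    options: {
      category: { type: 'string' },                  // --category <name>
      verbose: { type: 'boolean', default: false },  // --verbose
      adaptive: { type: 'boolean', default: false }, // --adaptive
      json: { type: 'boolean', default: false },     // --json
    },
    strict: true, // unknown flags throw instead of being silently ignored
  });
  return { ...values }; // copy into a plain object
}

console.log(parseRedTeamArgs(['--category', 'secrets', '--adaptive']));
```

Strict mode is used here so that a mistyped flag fails loudly rather than silently running the full suite.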

Step 2: Run Simulator

node <plugin-root>/scanners/attack-simulator.mjs [--category <name>] [--verbose] [--adaptive] [--json]
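
In keeping with the Windows-ready approach noted in the commit (node:path, process.execPath, no bash), the command line above could be assembled as an argument array rather than a shell string. The sketch below assumes a hypothetical helper `buildSimulatorCommand`; `pluginRoot` stands in for `<plugin-root>`, whose real value is not given here.

```javascript
import path from 'node:path';

// Hypothetical helper: builds the executable and argv for the simulator.
// `pluginRoot` is a stand-in for <plugin-root>; `opts` mirrors the Step 1 flags.
function buildSimulatorCommand(pluginRoot, opts = {}) {
  // path.join picks the right separator on Windows and POSIX.
  const script = path.join(pluginRoot, 'scanners', 'attack-simulator.mjs');
  const args = [script];
  if (opts.category) args.push('--category', opts.category);
  if (opts.verbose) args.push('--verbose');
  if (opts.adaptive) args.push('--adaptive');
  if (opts.json) args.push('--json');
  // process.execPath is the Node binary currently running, so no PATH lookup is needed.
  return { command: process.execPath, args };
}

const { command, args } = buildSimulatorCommand('plugin-root', { category: 'secrets', json: true });
console.log(command, args.join(' '));
```

The resulting pair can be handed to `execFileSync(command, args)` from `node:child_process`, which runs the process without going through a shell.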

Step 3: Narrative Report

For each category, explain what was tested, how many scenarios passed (were blocked correctly), and what gaps remain.

Categories (12):

Category	Hook Tested	Scenarios
secrets	pre-edit-secrets	Multiple
destructive	pre-bash-destructive	Multiple
supply-chain	pre-install-supply-chain	Multiple
prompt-injection	pre-prompt-inject-scan	Multiple
pathguard	pre-write-pathguard	Multiple
mcp-output	post-mcp-verify	Multiple
session-trifecta	post-session-guard	Multiple
hybrid	Multiple hooks	Multiple
unicode-evasion	pre-prompt-inject-scan	Multiple
bash-evasion	pre-bash-destructive	Multiple
hitl-traps	post-mcp-verify	Multiple
long-horizon	post-session-guard	Multiple
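
The per-category narrative could be driven by a small aggregation over the simulator's results. The sketch below copies the category-to-hook mapping from the table above. The result record shape `{ category, scenario, blocked }` is an assumption, since the simulator's real JSON schema is not shown here, and the sample records are made up.

```javascript
// Category-to-hook mapping copied from the table above.
const HOOK_BY_CATEGORY = {
  'secrets': 'pre-edit-secrets',
  'destructive': 'pre-bash-destructive',
  'supply-chain': 'pre-install-supply-chain',
  'prompt-injection': 'pre-prompt-inject-scan',
  'pathguard': 'pre-write-pathguard',
  'mcp-output': 'post-mcp-verify',
  'session-trifecta': 'post-session-guard',
  'hybrid': 'Multiple hooks',
  'unicode-evasion': 'pre-prompt-inject-scan',
  'bash-evasion': 'pre-bash-destructive',
  'hitl-traps': 'post-mcp-verify',
  'long-horizon': 'post-session-guard',
};

// ASSUMPTION: each result looks like { category, scenario, blocked }.
// The real --json output may use different field names.
function summarizeByCategory(results) {
  const summary = new Map();
  for (const { category, scenario, blocked } of results) {
    if (!summary.has(category)) {
      const hook = HOOK_BY_CATEGORY[category] ?? 'unknown';
      summary.set(category, { hook, total: 0, passed: 0, gaps: [] });
    }
    const entry = summary.get(category);
    entry.total += 1;
    if (blocked) entry.passed += 1; // "passed" means the hook blocked the payload
    else entry.gaps.push(scenario); // unblocked scenarios are the gaps to report
  }
  return summary;
}

// Made-up sample records, for illustration only.
const sample = [
  { category: 'secrets', scenario: 'sample-a', blocked: true },
  { category: 'secrets', scenario: 'sample-b', blocked: false },
  { category: 'pathguard', scenario: 'sample-c', blocked: true },
];
for (const [category, s] of summarizeByCategory(sample)) {
  console.log(`${category} (${s.hook}): ${s.passed}/${s.total} blocked; gaps: ${s.gaps.join(', ') || 'none'}`);
}
```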

Step 4: Defense Score

100%: All scenarios correctly blocked
90-99%: Minor gaps, review failing scenarios
<90%: Significant gaps, prioritize fixes
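
The score bands above can be expressed as a small classifier. This is a sketch with a hypothetical function name, `defenseScore`. The listed bands do not say how to treat a score strictly between 99% and 100%, so the code assumes anything from 90% up to (but not including) a perfect run counts as "minor gaps".

```javascript
// Hypothetical helper: maps blocked/total counts to the bands listed above.
function defenseScore(blocked, total) {
  if (!Number.isInteger(total) || total <= 0) {
    throw new RangeError('total must be a positive integer');
  }
  const percent = (blocked / total) * 100;
  let verdict;
  if (blocked === total) {
    verdict = 'All scenarios correctly blocked';
  } else if (percent >= 90) {
    // ASSUMPTION: 99% < score < 100% is treated as the 90-99% band.
    verdict = 'Minor gaps, review failing scenarios';
  } else {
    verdict = 'Significant gaps, prioritize fixes';
  }
  return { percent, verdict };
}

// With the 64 scenarios mentioned above:
console.log(defenseScore(64, 64)); // percent 100
console.log(defenseScore(60, 64)); // percent 93.75
console.log(defenseScore(57, 64)); // percent 89.0625
```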

Step 5: Adaptive Results (if --adaptive)

Mutation types: homoglyph substitution, encoding variants, zero-width insertion, case alternation, synonym replacement. Expected bypass rate varies by category.
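
To make the mutation types concrete, here is a minimal sketch of four of them applied to a harmless string. It is not the simulator's implementation: the homoglyph table, the function names, and the choice of Base64 as the "encoding variant" are all assumptions. Synonym replacement is omitted because it needs a word list.

```javascript
// Illustrative Latin -> Cyrillic look-alike table (NOT the simulator's table).
const HOMOGLYPHS = { a: '\u0430', e: '\u0435', o: '\u043e', p: '\u0440', c: '\u0441' };

const mutations = {
  // Swap selected Latin letters for visually similar Cyrillic ones.
  homoglyph: (s) => [...s].map((ch) => HOMOGLYPHS[ch] ?? ch).join(''),
  // Insert a zero-width space (U+200B) between every pair of characters.
  zeroWidth: (s) => [...s].join('\u200b'),
  // Alternate lower/upper case by character position.
  caseAlternation: (s) =>
    [...s].map((ch, i) => (i % 2 === 0 ? ch.toLowerCase() : ch.toUpperCase())).join(''),
  // One possible encoding variant: Base64 of the UTF-8 bytes.
  base64: (s) => Buffer.from(s, 'utf8').toString('base64'),
};

const payload = 'synthetic payload'; // harmless placeholder text
for (const [name, mutate] of Object.entries(mutations)) {
  console.log(name, JSON.stringify(mutate(payload)));
}
```

Printing through `JSON.stringify` keeps the zero-width characters visible as escape sequences, which helps when reviewing why a mutated scenario slipped past a hook.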

Safety: No real exploits executed. No network calls. No file modifications. All payloads are synthetic test data.
