ktg-plugin-marketplace/plugins/llm-security/knowledge/mitigation-matrix.md

16 KiB

Mitigation Matrix

Maps OWASP LLM Top 10 threats to Claude Code-specific controls.

Used by posture-assessor-agent to evaluate which controls are in place and which are missing.

How to Read This Matrix

  • Automated: Controls enforced by hooks (no human intervention required)
  • Configured: Controls that require explicit setup in settings.json, CLAUDE.md, or plugin config
  • Advisory: Controls provided by scanning/auditing commands — humans must act on findings
  • External: Controls outside Claude Code's scope (network, IAM, model provider, OS)

Verification checks are concrete, machine-readable conditions the posture assessor can evaluate.


Matrix

LLM01 — Prompt Injection (MITRE ATLAS: AML.T0051)

Attacker injects instructions via external content (files, web pages, tool outputs) that override intended behavior.

Control Type Implementation Verification Check
Deny-first tool permissions Configured settings.json → deny Write/Edit/Bash by default; grant only what is needed settings.json has "deny": ["Write", "Edit", "Bash"] or equivalent
Skill/command vetting Advisory /security scan before installing third-party skills or commands Scan report exists and is clean for installed skills
CLAUDE.md anti-override guardrails Configured CLAUDE.md includes explicit anti-jailbreak instructions and scope boundaries CLAUDE.md contains security or scope-guard section
Input sanitization hook Automated pre-edit-secrets.mjs scans file edits for injection patterns Hook file exists and is registered in hooks.json
MCP output verification Automated post-mcp-verify.mjs checks MCP tool outputs for unexpected instruction content Hook file exists and is registered in hooks.json
Minimal context exposure Configured CLAUDE.md and system prompts avoid embedding sensitive credentials or secrets CLAUDE.md contains no secret patterns (run secrets-patterns check)
Prompt injection input scanning Automated pre-prompt-inject-scan.mjs detects CRITICAL/HIGH/MEDIUM injection patterns in user prompts Hook file exists; MEDIUM advisory enabled
Unicode Tag steganography detection Automated string-utils.mjs decodes U+E0000-E007F tags; injection-patterns.mjs escalates to CRITICAL/HIGH decodeUnicodeTags() in normalization pipeline
Bash evasion normalization Automated bash-normalize.mjs strips parameter expansion before pattern matching normalizeBashExpansion() called by both bash hooks
Rule of Two enforcement Automated post-session-guard.mjs detects trifecta (untrusted input + sensitive data + exfil) LLM_SECURITY_TRIFECTA_MODE env var respected; block mode available
Long-horizon monitoring Automated post-session-guard.mjs 100-call window + behavioral drift detection Long-horizon window active alongside 20-call window
HITL trap detection Automated injection-patterns.mjs HIGH patterns for approval urgency, summary suppression, scope minimization HITL patterns present in HIGH_PATTERNS array
Hybrid attack detection Automated injection-patterns.mjs HYBRID_PATTERNS for P2SQL, recursive injection, XSS Hybrid patterns checked in tool output scanning

LLM02 — Sensitive Information Disclosure (MITRE ATLAS: AML.T0024)

Model reveals sensitive data from training, context, or external sources in its outputs.

Control Type Implementation Verification Check
Secrets pattern detection (edit) Automated pre-edit-secrets.mjs blocks writes containing API keys, passwords, tokens Hook exists; knowledge/secrets-patterns.md is present
Path guard for sensitive files Automated pre-write-pathguard.mjs blocks writes to .env, *.key, credentials.*, .aws/ Hook exists; sensitive path list is up to date
MCP output scanning Automated post-mcp-verify.mjs scans MCP responses for PII or secret patterns Hook registered for PostToolUse/Bash
.gitignore discipline Configured .env, *.key, *.pem, secrets.* in .gitignore Project .gitignore includes standard secret exclusions
No secrets in CLAUDE.md Advisory /security audit checks CLAUDE.md and agents for embedded secrets Audit report shows no secret patterns in markdown files
Env-var pattern enforcement Configured Templates use .env/.template pattern; actual values never committed No .env files tracked in git (git ls-files *.env empty)

LLM03 — Supply Chain Vulnerabilities (MITRE ATLAS: AML.T0010)

Compromised models, plugins, or MCP servers introduce malicious behavior.

Control Type Implementation Verification Check
MCP server audit Advisory /security mcp-audit reviews all MCP configs for source, permissions, network exposure MCP audit report exists and is current
Plugin source verification Advisory /security scan on skill/agent files before activation Skill scanner report clean for all installed plugins
Dependency pinning Configured MCP server dependencies pinned to specific versions in package.json or requirements.txt No unpinned latest or * versions in MCP server deps
Pre-deploy checklist Advisory /security pre-deploy includes supply chain verification step Pre-deploy report completed before production deployment
Minimal MCP permissions Configured MCP servers granted only required scopes; no wildcard access MCP configs do not use * scope grants

LLM04 — Data and Model Poisoning (MITRE ATLAS: AML.T0020)

Malicious training data or fine-tuning corrupts model behavior.

Control Type Implementation Verification Check
Use vetted base models only External Organizational policy: approved model list from provider (Anthropic, Azure OpenAI) Model IDs in config match approved list
No untrusted fine-tuning External Fine-tuning pipelines gated by data review process Fine-tuning dataset provenance documented
Knowledge base integrity Advisory /security audit checks knowledge files for injected malicious content Audit covers knowledge/ directories
Prompt content review Advisory Skill scanner checks agent/command prompts for anomalous instructions skill-scanner-agent run on all agents
Threat model coverage Advisory /security threat-model includes data pipeline as attack surface Threat model document exists and covers data sources

LLM05 — Improper Output Handling (MITRE ATLAS: AML.T0043)

Model output treated as trusted without sanitization, leading to injection in downstream systems.

Control Type Implementation Verification Check
MCP output verification Automated post-mcp-verify.mjs scans tool outputs before they reach downstream consumers Hook registered and active
Destructive command blocking Automated pre-bash-destructive.mjs prevents shell injection from model-generated commands Hook exists; blocklist includes rm -rf, DROP TABLE, curl | sh patterns
No direct shell execution of model output Configured CLAUDE.md explicitly prohibits passing raw model output to eval or shell CLAUDE.md has output-handling guardrail
Output template enforcement Advisory Report templates in templates/ provide structured output that avoids raw passthrough Templates used by scan/audit commands
Code review before execution Advisory /security pre-deploy requires human review of model-generated scripts Pre-deploy checklist includes output review step

LLM06 — Excessive Agency (MITRE ATLAS: AML.T0061)

Model granted too many permissions or capabilities, enabling unintended high-impact actions.

Control Type Implementation Verification Check
Deny-first permissions Configured settings.json starts from deny-all; explicit allow-list per command settings.json does not use broad "allow": ["*"]
Tool allowlist per command Configured Each command's frontmatter declares minimum required tools All commands/*.md have explicit allowed-tools list
Agent tool restriction Configured Agent frontmatter limits tools to Read/Glob/Grep unless justified Agents do not have Write/Bash without documented rationale
Over-permissioning scan Advisory skill-scanner-agent flags commands/agents with excessive tool grants Skill scanner report shows no over-permissioning findings
No autonomous external calls Configured Agents restricted from making unapproved network calls via Bash pre-bash-destructive.mjs blocks curl, wget without approval
Human-in-the-loop for destructive ops Automated Destructive bash commands blocked; require explicit user re-invocation Hook blocks and logs; no auto-bypass mechanism

LLM07 — System Prompt Leakage (MITRE ATLAS: AML.T0024)

System prompt or CLAUDE.md exposed through adversarial extraction, revealing security controls.

Control Type Implementation Verification Check
Security-by-design (not obscurity) Configured Controls enforced by hooks and settings, not just prompt instructions Hooks exist independently of CLAUDE.md instructions
No secrets in system prompt Advisory /security audit checks CLAUDE.md for embedded secrets or keys Audit report clean for CLAUDE.md content
Minimal sensitive detail in prompts Configured CLAUDE.md describes policy intent, not implementation bypass paths CLAUDE.md reviewed for info that aids bypass
Prompt disclosure awareness Advisory Threat model documents that CLAUDE.md may be readable by the model Threat model includes system prompt as attack surface
Defense in depth Configured Multiple independent control layers so prompt leakage does not collapse security Hooks + settings + CLAUDE.md all present (not sole reliance on one layer)

LLM08 — Vector and Embedding Weaknesses (MITRE ATLAS: AML.T0020)

Manipulated embeddings or vector store content used to inject malicious context into RAG pipelines.

Control Type Implementation Verification Check
Knowledge base content review Advisory /security audit scans knowledge/ files for injected instructions Audit includes knowledge base scan
Source attribution in KB Configured Knowledge files include source and date metadata KB files have provenance headers
RAG input sanitization External Vector store / RAG pipeline sanitizes retrieved chunks before injection RAG pipeline has input validation (organizational control)
Embedding access control External Vector stores gated by IAM; not publicly writable Access control documented for vector infrastructure
Retrieval result verification Advisory Agents instructed to verify retrieved content plausibility before use Agent prompts include retrieval skepticism instruction

LLM09 — Misinformation (MITRE ATLAS: AML.T0031)

Model generates plausible but false information, leading to incorrect decisions.

Control Type Implementation Verification Check
Authoritative knowledge base Configured Plugin uses curated knowledge/ files as grounding for security recommendations knowledge/ directory contains up-to-date OWASP and threat pattern files
Source citation in outputs Configured Commands instruct agents to cite knowledge file sources in reports Report templates include source section
Human review gate Advisory All advisory reports require human review before action CLAUDE.md and command docs state reports are advisory, not authoritative
Threat model validation Advisory /security threat-model output reviewed by security professional Threat model review step documented in pre-deploy checklist
Confidence indicators Advisory Agents use hedged language for uncertain findings Agent prompts instruct use of HIGH/MEDIUM/LOW confidence levels
Hallucination risk documentation Configured CLAUDE.md explicitly documents that AI outputs require validation CLAUDE.md contains disclaimer on AI-generated security findings

LLM10 — Unbounded Consumption (MITRE ATLAS: AML.T0029)

Model or agents consume excessive compute, tokens, or API calls, causing denial of service or cost overruns.

Control Type Implementation Verification Check
Scoped scanning targets Configured Commands accept explicit file/directory targets; no default full-repo scan scan.md and audit.md require explicit scope argument
Agent timeout discipline Configured Agents instructed to limit research depth and report within scope Agent prompts include scope and depth constraints
No recursive agent spawning Configured Agents do not spawn additional agents without explicit command Agent frontmatter and prompts prohibit autonomous subagent creation
MCP call limiting Configured MCP-using commands have documented call budgets mcp-audit.md documents expected MCP call count
Cost-aware model selection Configured Expensive operations (threat modeling) use Opus; scanning uses Sonnet Command frontmatter uses model: sonnet for scan/audit, model: opus for threat-model
Session scope guard Configured CLAUDE.md scope-guard prevents unbounded task escalation CLAUDE.md has scope-guard section

Coverage Summary

Category Name Automated Configured Advisory External Total Controls Coverage
LLM01 Prompt Injection 9 3 1 0 13 92%
LLM02 Sensitive Info Disclosure 3 2 1 0 6 83%
LLM03 Supply Chain 0 2 3 0 5 60%
LLM04 Data & Model Poisoning 0 0 3 2 5 40%
LLM05 Improper Output Handling 2 2 1 0 5 80%
LLM06 Excessive Agency 3 3 0 0 6 100%
LLM07 System Prompt Leakage 0 3 2 0 5 60%
LLM08 Vector & Embedding Weaknesses 0 1 2 2 5 40%
LLM09 Misinformation 0 3 3 0 6 50%
LLM10 Unbounded Consumption 0 5 1 0 6 83%

Coverage scoring:

  • 100% = All applicable controls implemented
  • 80-99% = Strong coverage, minor gaps
  • 60-79% = Moderate coverage, notable gaps
  • 40-59% = Partial coverage, significant gaps
  • <40% = Minimal coverage — high risk

Note: LLM04 and LLM08 score lower because their primary controls are external (model provider and infrastructure). For Claude Code projects, these categories require organizational controls beyond what the plugin can enforce.


Posture Assessor Checklist

When posture-assessor-agent evaluates a project, verify the following in order:

Automated Controls (hooks) — Verify All Present

  • hooks/scripts/pre-edit-secrets.mjs exists
  • hooks/scripts/pre-write-pathguard.mjs exists
  • hooks/scripts/pre-bash-destructive.mjs exists
  • hooks/scripts/post-mcp-verify.mjs exists
  • hooks/hooks.json registers all four hooks

Configured Controls — Verify in settings.json and CLAUDE.md

  • settings.json has deny-first permissions (no broad "allow": ["*"])
  • Command frontmatter has explicit allowed-tools lists
  • Agent frontmatter restricts tools to minimum required
  • CLAUDE.md has scope-guard / anti-override section
  • .gitignore excludes .env, *.key, *.pem, credentials.*
  • No secrets embedded in CLAUDE.md, agent prompts, or command files

Advisory Controls — Evidence of Use

  • /security scan report present or run recently
  • /security audit report present or run recently
  • /security mcp-audit report if MCP servers are configured
  • /security threat-model report present for production systems
  • /security pre-deploy checklist completed before deployment

Scoring Guidance

Automated controls present Configured controls present Advisory evidence Score Band
5/5 6/6 3/5 A (90+)
4/5 5/6 2/5 B (75-89)
3/5 4/6 1/5 C (60-74)
2/5 3/6 0/5 D (40-59)
<2/5 <3/6 0/5 F (<40)