# Mitigation Matrix

Maps OWASP LLM Top 10 threats to Claude Code-specific controls.

The `posture-assessor-agent` uses this matrix to evaluate which controls are in place and which are missing.

## How to Read This Matrix

- **Automated:** Controls enforced by hooks (no human intervention required)
- **Configured:** Controls that require explicit setup in settings.json, CLAUDE.md, or plugin config
- **Advisory:** Controls provided by scanning/auditing commands — humans must act on findings
- **External:** Controls outside Claude Code's scope (network, IAM, model provider, OS)

**Verification checks** are concrete, machine-readable conditions the posture assessor can evaluate.
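
As a rough illustration of what a machine-readable check could look like, the sketch below models one control as a plain object whose `check` predicate receives facts already gathered about the project. The record shape, the `facts` fields (`hookFiles`, `registeredHooks`), and the function name `evaluateControls` are hypothetical and are not the plugin's actual schema.

```javascript
// Hypothetical control record; field names are illustrative only.
const CONTROL_TYPES = ["Automated", "Configured", "Advisory", "External"];

const exampleControl = {
  threat: "LLM02",
  name: "Path guard for sensitive files",
  type: "Automated",
  // A check is a pure predicate over facts gathered about the project.
  check: (facts) =>
    facts.hookFiles.includes("pre-write-pathguard.mjs") &&
    facts.registeredHooks.includes("pre-write-pathguard.mjs"),
};

// Evaluate a list of controls and report which are in place or missing.
function evaluateControls(controls, facts) {
  return controls.map((c) => {
    if (!CONTROL_TYPES.includes(c.type)) {
      throw new Error(`Unknown control type: ${c.type}`);
    }
    return {
      threat: c.threat,
      name: c.name,
      type: c.type,
      present: Boolean(c.check(facts)),
    };
  });
}
```

Keeping each check a pure predicate over pre-collected facts would let the assessor gather file and config state once and evaluate every control against the same snapshot.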

---

## Matrix

### LLM01 — Prompt Injection (MITRE ATLAS: AML.T0051)

Attacker injects instructions via external content (files, web pages, tool outputs) that override intended behavior.

| Control | Type | Implementation | Verification Check |
|---------|------|----------------|--------------------|
| Deny-first tool permissions | Configured | `settings.json` → deny Write/Edit/Bash by default; grant only what is needed | `settings.json` has `"deny": ["Write", "Edit", "Bash"]` or equivalent |
| Skill/command vetting | Advisory | `/security scan` before installing third-party skills or commands | Scan report exists and is clean for installed skills |
| CLAUDE.md anti-override guardrails | Configured | CLAUDE.md includes explicit anti-jailbreak instructions and scope boundaries | CLAUDE.md contains security or scope-guard section |
| Input sanitization hook | Automated | `pre-edit-secrets.mjs` scans file edits for injection patterns | Hook file exists and is registered in `hooks.json` |
| MCP output verification | Automated | `post-mcp-verify.mjs` checks MCP tool outputs for unexpected instruction content | Hook file exists and is registered in `hooks.json` |
| Minimal context exposure | Configured | CLAUDE.md and system prompts avoid embedding sensitive credentials or secrets | CLAUDE.md contains no secret patterns (run secrets-patterns check) |
| Prompt injection input scanning | Automated | `pre-prompt-inject-scan.mjs` detects CRITICAL/HIGH/MEDIUM injection patterns in user prompts | Hook file exists; MEDIUM advisory enabled |
| Unicode Tag steganography detection | Automated | `string-utils.mjs` decodes U+E0000-E007F tags; `injection-patterns.mjs` escalates to CRITICAL/HIGH | `decodeUnicodeTags()` in normalization pipeline |
| Bash evasion normalization | Automated | `bash-normalize.mjs` strips parameter expansion before pattern matching | `normalizeBashExpansion()` called by both bash hooks |
| Rule of Two detection (block-mode opt-in) | Automated | `post-session-guard.mjs` detects trifecta (untrusted input + sensitive data + exfil); blocks only when `LLM_SECURITY_TRIFECTA_MODE=block` AND high-confidence trifecta is observed; default `warn` | `LLM_SECURITY_TRIFECTA_MODE` env var respected; block mode opt-in |
| Long-horizon monitoring | Automated | `post-session-guard.mjs` 100-call window + behavioral drift detection | Long-horizon window active alongside 20-call window |
| HITL trap detection | Automated | `injection-patterns.mjs` HIGH patterns for approval urgency, summary suppression, scope minimization | HITL patterns present in HIGH_PATTERNS array |
| Hybrid attack detection | Automated | `injection-patterns.mjs` HYBRID_PATTERNS for P2SQL, recursive injection, XSS | Hybrid patterns checked in tool output scanning |
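
To make the Unicode Tag row more concrete: characters in U+E0000-E007F are invisible in most renderers, and the range U+E0020-E007E mirrors printable ASCII at a fixed offset of 0xE0000. The sketch below separates visible text from any tag-encoded payload. The function name `revealTagCharacters` is hypothetical; the plugin's own `decodeUnicodeTags()` may behave differently.

```javascript
// Hypothetical helper; the plugin's own decodeUnicodeTags() may differ.
const TAG_BLOCK_START = 0xe0000;
const TAG_BLOCK_END = 0xe007f;

function revealTagCharacters(input) {
  let hidden = "";
  let visible = "";
  // for...of iterates by code point, so astral characters stay intact.
  for (const ch of input) {
    const cp = ch.codePointAt(0);
    if (cp >= TAG_BLOCK_START && cp <= TAG_BLOCK_END) {
      const ascii = cp - TAG_BLOCK_START;
      // Only U+E0020..U+E007E mirror printable ASCII; drop the rest.
      if (ascii >= 0x20 && ascii <= 0x7e) hidden += String.fromCharCode(ascii);
    } else {
      visible += ch;
    }
  }
  return { visible, hidden };
}

// Build a sample: visible text followed by a smuggled instruction.
const smuggled = [..."ignore rules"]
  .map((c) => String.fromCodePoint(TAG_BLOCK_START + c.charCodeAt(0)))
  .join("");
const result = revealTagCharacters("Hello" + smuggled);
console.log(result.visible); // Hello
console.log(result.hidden);  // ignore rules
```

A scanner could then run its injection patterns over `hidden` as well as `visible`, treating any non-empty `hidden` string as suspicious on its own.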

---

### LLM02 — Sensitive Information Disclosure (MITRE ATLAS: AML.T0024)

Model reveals sensitive data from training, context, or external sources in its outputs.

| Control | Type | Implementation | Verification Check |
|---------|------|----------------|--------------------|
| Secrets pattern detection (edit) | Automated | `pre-edit-secrets.mjs` blocks writes containing API keys, passwords, tokens | Hook exists; `knowledge/secrets-patterns.md` is present |
| Path guard for sensitive files | Automated | `pre-write-pathguard.mjs` blocks writes to `.env`, `*.key`, `credentials.*`, `.aws/` | Hook exists; sensitive path list is up to date |
| MCP output scanning | Automated | `post-mcp-verify.mjs` scans MCP responses for PII or secret patterns | Hook registered for PostToolUse/Bash |
| `.gitignore` discipline | Configured | `.env`, `*.key`, `*.pem`, `secrets.*` in `.gitignore` | Project `.gitignore` includes standard secret exclusions |
| No secrets in CLAUDE.md | Advisory | `/security audit` checks CLAUDE.md and agents for embedded secrets | Audit report shows no secret patterns in markdown files |
| Env-var pattern enforcement | Configured | Templates use `.env`/`.template` pattern; actual values never committed | No `.env` files tracked in git (`git ls-files '*.env'` output is empty) |
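
The path guard row can be sketched as a small matcher over the four path shapes named in the table (`.env`, `*.key`, `credentials.*`, `.aws/`). This is a minimal sketch under the assumption that these globs translate to the regular expressions below; the name `isSensitivePath` and the pattern list are hypothetical, and the real `pre-write-pathguard.mjs` list may be longer or stricter.

```javascript
// Hypothetical patterns derived from the table row; not the hook's real list.
const SENSITIVE_PATTERNS = [
  /(^|\/)\.env$/,              // .env at any depth
  /\.key$/,                    // *.key
  /(^|\/)credentials\.[^/]+$/, // credentials.*
  /(^|\/)\.aws\//,             // anything under .aws/
];

function isSensitivePath(filePath) {
  // Normalize Windows separators so one pattern set covers both styles.
  const normalized = filePath.replace(/\\/g, "/");
  return SENSITIVE_PATTERNS.some((re) => re.test(normalized));
}
```

Under these patterns, `config/.env` and `home/user/.aws/config` would be blocked, while `.env.template` would pass, which fits the template convention described in the last row of the table.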

---

### LLM03 — Supply Chain Vulnerabilities (MITRE ATLAS: AML.T0010)

Compromised models, plugins, or MCP servers introduce malicious behavior.

| Control | Type | Implementation | Verification Check |
|---------|------|----------------|--------------------|
| MCP server audit | Advisory | `/security mcp-audit` reviews all MCP configs for source, permissions, network exposure | MCP audit report exists and is current |
| Plugin source verification | Advisory | `/security scan` on skill/agent files before activation | Skill scanner report clean for all installed plugins |
| Dependency pinning | Configured | MCP server dependencies pinned to specific versions in `package.json` or `requirements.txt` | No unpinned `latest` or `*` versions in MCP server deps |
| Pre-deploy checklist | Advisory | `/security pre-deploy` includes supply chain verification step | Pre-deploy report completed before production deployment |
| Minimal MCP permissions | Configured | MCP servers granted only required scopes; no wildcard access | MCP configs do not use `*` scope grants |
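
The dependency pinning check could be approximated as follows for a parsed `package.json`. This is only a sketch: the function name `findUnpinnedDependencies` is hypothetical, and it flags just the `latest` and `*` specs named in the table, plus the empty string, which npm treats the same as `*`.

```javascript
// Returns entries whose version spec is `latest`, `*`, or empty.
function findUnpinnedDependencies(packageJson) {
  const sections = ["dependencies", "devDependencies", "optionalDependencies"];
  const unpinned = [];
  for (const section of sections) {
    const deps = packageJson[section] ?? {};
    for (const [name, spec] of Object.entries(deps)) {
      const trimmed = String(spec).trim();
      if (trimmed === "" || trimmed === "*" || trimmed === "latest") {
        unpinned.push(`${section}:${name}@${trimmed || "(empty)"}`);
      }
    }
  }
  return unpinned;
}
```

Range specs such as `^1.2.3` are not flagged here because the verification check only names `latest` and `*`; a stricter policy that requires exact versions would need an additional rule.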

---

### LLM04 — Data and Model Poisoning (MITRE ATLAS: AML.T0020)

Malicious training data or fine-tuning corrupts model behavior.

| Control | Type | Implementation | Verification Check |
|---------|------|----------------|--------------------|
| Use vetted base models only | External | Organizational policy: approved model list from provider (Anthropic, Azure OpenAI) | Model IDs in config match approved list |
| No untrusted fine-tuning | External | Fine-tuning pipelines gated by data review process | Fine-tuning dataset provenance documented |
| Knowledge base integrity | Advisory | `/security audit` checks knowledge files for injected malicious content | Audit covers `knowledge/` directories |
| Prompt content review | Advisory | Skill scanner checks agent/command prompts for anomalous instructions | `skill-scanner-agent` run on all agents |
| Threat model coverage | Advisory | `/security threat-model` includes data pipeline as attack surface | Threat model document exists and covers data sources |

---

### LLM05 — Improper Output Handling (MITRE ATLAS: AML.T0043)

Model output treated as trusted without sanitization, leading to injection in downstream systems.

| Control | Type | Implementation | Verification Check |
|---------|------|----------------|--------------------|
| MCP output verification | Automated | `post-mcp-verify.mjs` scans tool outputs before they reach downstream consumers | Hook registered and active |
| Destructive command blocking | Automated | `pre-bash-destructive.mjs` prevents shell injection from model-generated commands | Hook exists; blocklist includes `rm -rf`, `DROP TABLE`, `curl \| sh` patterns |
| No direct shell execution of model output | Configured | CLAUDE.md explicitly prohibits passing raw model output to `eval` or shell | CLAUDE.md has output-handling guardrail |
| Output template enforcement | Advisory | Report templates in `templates/` provide structured output that avoids raw passthrough | Templates used by scan/audit commands |
| Code review before execution | Advisory | `/security pre-deploy` requires human review of model-generated scripts | Pre-deploy checklist includes output review step |
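
A blocklist like the one described for destructive command blocking might be sketched as below, covering only the three examples named in the table (`rm -rf`, `DROP TABLE`, `curl | sh`). The pattern list and the name `findDestructivePatterns` are hypothetical; the actual `pre-bash-destructive.mjs` blocklist is not reproduced here.

```javascript
// Hypothetical blocklist based on the three examples in the table.
const DESTRUCTIVE_PATTERNS = [
  {
    name: "recursive force delete",
    // rm with a single flag group containing both r and f, e.g. -rf or -fr
    re: /\brm\s+(?:-[a-z]+\s+)*-[a-z]*(?:r[a-z]*f|f[a-z]*r)[a-z]*(?:\s|$)/i,
  },
  { name: "SQL table drop", re: /\bdrop\s+table\b/i },
  {
    name: "pipe download to shell",
    // curl or wget output piped into sh, bash, or zsh
    re: /\b(?:curl|wget)\b[^|;&]*\|\s*(?:sudo\s+)?(?:ba|z)?sh\b/,
  },
];

// Returns the names of all patterns the command matches.
function findDestructivePatterns(command) {
  return DESTRUCTIVE_PATTERNS.filter((p) => p.re.test(command)).map((p) => p.name);
}
```

Plain regex matching like this is easy to evade (for example `rm -r -f` with split flags), so a real hook would likely normalize the command before matching and keep a broader pattern set.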

---

### LLM06 — Excessive Agency (MITRE ATLAS: AML.T0061)

Model granted too many permissions or capabilities, enabling unintended high-impact actions.

| Control | Type | Implementation | Verification Check |
|---------|------|----------------|--------------------|
| Deny-first permissions | Configured | `settings.json` starts from deny-all; explicit allow-list per command | `settings.json` does not use broad `"allow": ["*"]` |
| Tool allowlist per command | Configured | Each command's frontmatter declares minimum required tools | All `commands/*.md` have explicit `allowed-tools` list |
| Agent tool restriction | Configured | Agent frontmatter limits tools to Read/Glob/Grep unless justified | Agents do not have Write/Bash without documented rationale |
| Over-permissioning scan | Advisory | `skill-scanner-agent` flags commands/agents with excessive tool grants | Skill scanner report shows no over-permissioning findings |
| No autonomous external calls | Configured | Agents restricted from making unapproved network calls via Bash | `pre-bash-destructive.mjs` blocks `curl`, `wget` without approval |
| Human-in-the-loop for destructive ops | Automated | Destructive bash commands blocked; require explicit user re-invocation | Hook blocks and logs; no auto-bypass mechanism |
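
The two `settings.json` verification checks (no broad `"allow": ["*"]`, and a deny list covering Write/Edit/Bash) could be evaluated roughly like this. The sketch assumes the lists live under a `permissions` object and falls back to top-level `allow`/`deny` keys; that layout, and the name `assessPermissions`, are assumptions rather than something this document specifies.

```javascript
// Assumes `permissions.allow` / `permissions.deny`, falling back to
// top-level `allow` / `deny`. The layout is an assumption.
function assessPermissions(settings) {
  const perms = settings.permissions ?? settings;
  const allow = Array.isArray(perms.allow) ? perms.allow : [];
  const deny = Array.isArray(perms.deny) ? perms.deny : [];
  const requiredDenies = ["Write", "Edit", "Bash"];

  return {
    // True when the allow list contains a bare wildcard.
    hasBroadAllow: allow.includes("*"),
    // Tools from the deny-first baseline that are not denied.
    missingDenies: requiredDenies.filter((tool) => !deny.includes(tool)),
  };
}
```

Because the matrix accepts an "or equivalent" deny configuration, a non-empty `missingDenies` is better treated as a prompt for review than as an automatic failure.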

---

### LLM07 — System Prompt Leakage (MITRE ATLAS: AML.T0024)

System prompt or CLAUDE.md exposed through adversarial extraction, revealing security controls.

| Control | Type | Implementation | Verification Check |
|---------|------|----------------|--------------------|
| Security-by-design (not obscurity) | Configured | Controls enforced by hooks and settings, not just prompt instructions | Hooks exist independently of CLAUDE.md instructions |
| No secrets in system prompt | Advisory | `/security audit` checks CLAUDE.md for embedded secrets or keys | Audit report clean for CLAUDE.md content |
| Minimal sensitive detail in prompts | Configured | CLAUDE.md describes policy intent, not implementation bypass paths | CLAUDE.md reviewed for info that aids bypass |
| Prompt disclosure awareness | Advisory | Threat model documents that CLAUDE.md may be readable by the model | Threat model includes system prompt as attack surface |
| Defense in depth | Configured | Multiple independent control layers so prompt leakage does not collapse security | Hooks + settings + CLAUDE.md all present (not sole reliance on one layer) |

---

### LLM08 — Vector and Embedding Weaknesses (MITRE ATLAS: AML.T0020)

Manipulated embeddings or vector store content used to inject malicious context into RAG pipelines.

| Control | Type | Implementation | Verification Check |
|---------|------|----------------|--------------------|
| Knowledge base content review | Advisory | `/security audit` scans `knowledge/` files for injected instructions | Audit includes knowledge base scan |
| Source attribution in KB | Configured | Knowledge files include source and date metadata | KB files have provenance headers |
| RAG input sanitization | External | Vector store / RAG pipeline sanitizes retrieved chunks before injection | RAG pipeline has input validation (organizational control) |
| Embedding access control | External | Vector stores gated by IAM; not publicly writable | Access control documented for vector infrastructure |
| Retrieval result verification | Advisory | Agents instructed to verify retrieved content plausibility before use | Agent prompts include retrieval skepticism instruction |

---

### LLM09 — Misinformation (MITRE ATLAS: AML.T0031)

Model generates plausible but false information, leading to incorrect decisions.

| Control | Type | Implementation | Verification Check |
|---------|------|----------------|--------------------|
| Authoritative knowledge base | Configured | Plugin uses curated `knowledge/` files as grounding for security recommendations | `knowledge/` directory contains up-to-date OWASP and threat pattern files |
| Source citation in outputs | Configured | Commands instruct agents to cite knowledge file sources in reports | Report templates include source section |
| Human review gate | Advisory | All advisory reports require human review before action | CLAUDE.md and command docs state reports are advisory, not authoritative |
| Threat model validation | Advisory | `/security threat-model` output reviewed by security professional | Threat model review step documented in pre-deploy checklist |
| Confidence indicators | Advisory | Agents use hedged language for uncertain findings | Agent prompts instruct use of `HIGH/MEDIUM/LOW` confidence levels |
| Hallucination risk documentation | Configured | CLAUDE.md explicitly documents that AI outputs require validation | CLAUDE.md contains disclaimer on AI-generated security findings |

---

### LLM10 — Unbounded Consumption (MITRE ATLAS: AML.T0029)

Model or agents consume excessive compute, tokens, or API calls, causing denial of service or cost overruns.

| Control | Type | Implementation | Verification Check |
|---------|------|----------------|--------------------|
| Scoped scanning targets | Configured | Commands accept explicit file/directory targets; no default full-repo scan | `scan.md` and `audit.md` require explicit scope argument |
| Agent timeout discipline | Configured | Agents instructed to limit research depth and report within scope | Agent prompts include scope and depth constraints |
| No recursive agent spawning | Configured | Agents do not spawn additional agents without explicit command | Agent frontmatter and prompts prohibit autonomous subagent creation |
| MCP call limiting | Configured | MCP-using commands have documented call budgets | `mcp-audit.md` documents expected MCP call count |
| Cost-aware model selection | Configured | Expensive operations (threat modeling) use Opus; scanning uses Sonnet | Command frontmatter uses `model: sonnet` for scan/audit, `model: opus` for threat-model |
| Session scope guard | Configured | CLAUDE.md scope-guard prevents unbounded task escalation | CLAUDE.md has scope-guard section |

---

## Coverage Summary

| Category | Name | Automated | Configured | Advisory | External | Total Controls | Coverage |
|----------|------|-----------|------------|----------|----------|----------------|----------|
| LLM01 | Prompt Injection | 9 | 3 | 1 | 0 | 13 | 92% |
| LLM02 | Sensitive Info Disclosure | 3 | 2 | 1 | 0 | 6 | 83% |
| LLM03 | Supply Chain | 0 | 2 | 3 | 0 | 5 | 60% |
| LLM04 | Data & Model Poisoning | 0 | 0 | 3 | 2 | 5 | 40% |
| LLM05 | Improper Output Handling | 2 | 1 | 2 | 0 | 5 | 80% |
| LLM06 | Excessive Agency | 1 | 4 | 1 | 0 | 6 | 100% |
| LLM07 | System Prompt Leakage | 0 | 3 | 2 | 0 | 5 | 60% |
| LLM08 | Vector & Embedding Weaknesses | 0 | 1 | 2 | 2 | 5 | 40% |
| LLM09 | Misinformation | 0 | 3 | 3 | 0 | 6 | 50% |
| LLM10 | Unbounded Consumption | 0 | 6 | 0 | 0 | 6 | 83% |

**Coverage scoring:**
- 100% = All applicable controls implemented
- 80-99% = Strong coverage, minor gaps
- 60-79% = Moderate coverage, notable gaps
- 40-59% = Partial coverage, significant gaps
- <40% = Minimal coverage — high risk
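
The scoring bands above translate directly into a lookup. The function name `coverageBand` is hypothetical; the thresholds and labels are taken from the list.

```javascript
// Maps a coverage percentage (0-100) to the band label from the list above.
function coverageBand(percent) {
  if (!Number.isFinite(percent) || percent < 0 || percent > 100) {
    throw new RangeError(`Coverage must be between 0 and 100, got ${percent}`);
  }
  if (percent === 100) return "All applicable controls implemented";
  if (percent >= 80) return "Strong coverage, minor gaps";
  if (percent >= 60) return "Moderate coverage, notable gaps";
  if (percent >= 40) return "Partial coverage, significant gaps";
  return "Minimal coverage, high risk";
}

console.log(coverageBand(92)); // Strong coverage, minor gaps
console.log(coverageBand(40)); // Partial coverage, significant gaps
```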

**Note:** LLM04 and LLM08 score lower because their primary controls are external (model provider and infrastructure). For Claude Code projects, these categories require organizational controls beyond what the plugin can enforce.

---

## Posture Assessor Checklist

When evaluating a project, `posture-assessor-agent` verifies the following in order:

### Automated Controls (hooks) — Verify All Present
- [ ] `hooks/scripts/pre-edit-secrets.mjs` exists
- [ ] `hooks/scripts/pre-write-pathguard.mjs` exists
- [ ] `hooks/scripts/pre-bash-destructive.mjs` exists
- [ ] `hooks/scripts/post-mcp-verify.mjs` exists
- [ ] `hooks/hooks.json` registers all four hooks
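
The five checks above could be evaluated along these lines once the directory listing and `hooks.json` have been read. Since this document does not describe the `hooks.json` schema, the sketch makes no assumption about it and simply looks for each script name anywhere in the serialized config. The name `checkHooks` and its return shape are hypothetical.

```javascript
const REQUIRED_HOOKS = [
  "pre-edit-secrets.mjs",
  "pre-write-pathguard.mjs",
  "pre-bash-destructive.mjs",
  "post-mcp-verify.mjs",
];

// `existingFiles`: file names found in hooks/scripts/.
// `hooksConfig`: parsed content of hooks/hooks.json.
function checkHooks(existingFiles, hooksConfig) {
  // Schema-agnostic: a hook counts as registered if its file name
  // appears anywhere in the serialized config.
  const serialized = JSON.stringify(hooksConfig ?? {});
  const missingFiles = REQUIRED_HOOKS.filter((h) => !existingFiles.includes(h));
  const unregistered = REQUIRED_HOOKS.filter((h) => !serialized.includes(h));
  // Four existence checks plus one all-registered check, as in the list above.
  const passed =
    REQUIRED_HOOKS.length - missingFiles.length + (unregistered.length === 0 ? 1 : 0);
  return { missingFiles, unregistered, passed, total: REQUIRED_HOOKS.length + 1 };
}
```

The `passed`/`total` pair lines up with the `n/5` automated-controls column in the scoring guidance.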

### Configured Controls — Verify in settings.json and CLAUDE.md
- [ ] `settings.json` has deny-first permissions (no broad `"allow": ["*"]`)
- [ ] Command frontmatter has explicit `allowed-tools` lists
- [ ] Agent frontmatter restricts tools to minimum required
- [ ] CLAUDE.md has scope-guard / anti-override section
- [ ] `.gitignore` excludes `.env`, `*.key`, `*.pem`, `credentials.*`
- [ ] No secrets embedded in CLAUDE.md, agent prompts, or command files

### Advisory Controls — Evidence of Use
- [ ] `/security scan` report present or run recently
- [ ] `/security audit` report present or run recently
- [ ] `/security mcp-audit` report if MCP servers are configured
- [ ] `/security threat-model` report present for production systems
- [ ] `/security pre-deploy` checklist completed before deployment

### Scoring Guidance

| Automated controls present | Configured controls present | Advisory evidence | Score Band |
|----------------------------|-----------------------------|-------------------|------------|
| 5/5 | 6/6 | 3/5 | A (90+) |
| 4/5 | 5/6 | 2/5 | B (75-89) |
| 3/5 | 4/6 | 1/5 | C (60-74) |
| 2/5 | 3/6 | 0/5 | D (40-59) |
| <2/5 | <3/6 | 0/5 | F (<40) |
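
One possible reading of the scoring table is that each row lists minimum counts, and a project earns the highest band whose three minimums are all met. The sketch below implements that reading only; the interpretation, the `BANDS` constant, and the name `scoreBand` are assumptions, not a documented algorithm.

```javascript
// Minimum counts per band, copied from the table; F is the fallback.
const BANDS = [
  { band: "A", automated: 5, configured: 6, advisory: 3 },
  { band: "B", automated: 4, configured: 5, advisory: 2 },
  { band: "C", automated: 3, configured: 4, advisory: 1 },
  { band: "D", automated: 2, configured: 3, advisory: 0 },
];

// Returns the highest band whose three minimums are all satisfied.
function scoreBand({ automated, configured, advisory }) {
  const match = BANDS.find(
    (b) =>
      automated >= b.automated &&
      configured >= b.configured &&
      advisory >= b.advisory
  );
  return match ? match.band : "F";
}

console.log(scoreBand({ automated: 5, configured: 6, advisory: 3 })); // A
console.log(scoreBand({ automated: 5, configured: 6, advisory: 0 })); // D
```

Under this reading, a project with every hook and setting in place but no advisory evidence lands in band D. If the intended score is a weighted sum instead, the thresholds would need to be replaced with weights.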