Closes A3 of v7.1.0 critical-review patch. Each rewrite preserves the underlying claim where it is accurate but removes hype/overreach language. Historical CHANGELOG/README version-table rows are intentionally left as-is (they document what was claimed at the time of release, not what is true today). Changes (CLAUDE.md, commands/ide-scan.md, knowledge/mitigation-matrix.md, docs/security-hardening-guide.md): - "Trustworthy scoring (BREAKING)" → "Severity-dominated risk scoring (v2 model, BREAKING)". Removes hype framing; describes the actual mechanism. - "Context-aware entropy scanner" → "Rule-based entropy scanner with file-extension skip, 8 line-level suppression rules, and configurable policy". No ML/context inference; just rules. - "1487 tests" → "1511 unit and integration tests; mutation-testing coverage not published". Updated count after A1+A2 (+24) and added qualifier. - "Fully Schrems II compatible" → "Schrems II compatible in default offline mode. Optional OSV.dev enrichment (`supply-chain-recheck --online`) transmits package identifiers to a Google-operated API and is a separate compliance consideration." Acknowledges the OSV.dev opt-in caveat. - "Rule of Two enforcement" → "Rule of Two detection (configurable; default warn; blocks on high-confidence trifectas in opt-in `block` mode; distributed trifectas detected but not blocked by default)". "Enforcement" implied block; default is warn. - "Hardened ZIP extractor" → suffix " — no fuzz-testing results published to date". Caps and class-of-attacks rejected are accurate; absence of formal fuzz coverage now stated. - "defense-in-depth" — preserved as framing, but quantified in security-hardening-guide §4: "three independent detection layers with documented bypass classes". Each layer named, each layer's known bypasses pointed to (critical-review §4 evasion arsenal). Tests: 1511/1511 green (no behavioural change).
232 lines
17 KiB
Markdown
232 lines
17 KiB
Markdown
# Mitigation Matrix
|
|
|
|
Maps OWASP LLM Top 10 threats to Claude Code-specific controls.
|
|
|
|
Used by `posture-assessor-agent` to evaluate which controls are in place and which are missing.
|
|
|
|
## How to Read This Matrix
|
|
|
|
- **Automated:** Controls enforced by hooks (no human intervention required)
|
|
- **Configured:** Controls that require explicit setup in settings.json, CLAUDE.md, or plugin config
|
|
- **Advisory:** Controls provided by scanning/auditing commands — humans must act on findings
|
|
- **External:** Controls outside Claude Code's scope (network, IAM, model provider, OS)
|
|
|
|
**Verification checks** are concrete, machine-readable conditions the posture assessor can evaluate.
|
|
|
|
---
|
|
|
|
## Matrix
|
|
|
|
### LLM01 — Prompt Injection (MITRE ATLAS: AML.T0051)
|
|
|
|
Attacker injects instructions via external content (files, web pages, tool outputs) that override intended behavior.
|
|
|
|
| Control | Type | Implementation | Verification Check |
|
|
|---------|------|----------------|--------------------|
|
|
| Deny-first tool permissions | Configured | `settings.json` → deny Write/Edit/Bash by default; grant only what is needed | `settings.json` has `"deny": ["Write", "Edit", "Bash"]` or equivalent |
|
|
| Skill/command vetting | Advisory | `/security scan` before installing third-party skills or commands | Scan report exists and is clean for installed skills |
|
|
| CLAUDE.md anti-override guardrails | Configured | CLAUDE.md includes explicit anti-jailbreak instructions and scope boundaries | CLAUDE.md contains security or scope-guard section |
|
|
| Input sanitization hook | Automated | `pre-edit-secrets.mjs` scans file edits for injection patterns | Hook file exists and is registered in `hooks.json` |
|
|
| MCP output verification | Automated | `post-mcp-verify.mjs` checks MCP tool outputs for unexpected instruction content | Hook file exists and is registered in `hooks.json` |
|
|
| Minimal context exposure | Configured | CLAUDE.md and system prompts avoid embedding sensitive credentials or secrets | CLAUDE.md contains no secret patterns (run secrets-patterns check) |
|
|
| Prompt injection input scanning | Automated | `pre-prompt-inject-scan.mjs` detects CRITICAL/HIGH/MEDIUM injection patterns in user prompts | Hook file exists; MEDIUM advisory enabled |
|
|
| Unicode Tag steganography detection | Automated | `string-utils.mjs` decodes U+E0000-E007F tags; `injection-patterns.mjs` escalates to CRITICAL/HIGH | `decodeUnicodeTags()` in normalization pipeline |
|
|
| Bash evasion normalization | Automated | `bash-normalize.mjs` strips parameter expansion before pattern matching | `normalizeBashExpansion()` called by both bash hooks |
|
|
| Rule of Two detection (block-mode opt-in) | Automated | `post-session-guard.mjs` detects trifecta (untrusted input + sensitive data + exfil); blocks only when `LLM_SECURITY_TRIFECTA_MODE=block` AND high-confidence trifecta is observed; default `warn` | `LLM_SECURITY_TRIFECTA_MODE` env var respected; block mode opt-in |
|
|
| Long-horizon monitoring | Automated | `post-session-guard.mjs` 100-call window + behavioral drift detection | Long-horizon window active alongside 20-call window |
|
|
| HITL trap detection | Automated | `injection-patterns.mjs` HIGH patterns for approval urgency, summary suppression, scope minimization | HITL patterns present in HIGH_PATTERNS array |
|
|
| Hybrid attack detection | Automated | `injection-patterns.mjs` HYBRID_PATTERNS for P2SQL, recursive injection, XSS | Hybrid patterns checked in tool output scanning |
|
|
|
|
---
|
|
|
|
### LLM02 — Sensitive Information Disclosure (MITRE ATLAS: AML.T0024)
|
|
|
|
Model reveals sensitive data from training, context, or external sources in its outputs.
|
|
|
|
| Control | Type | Implementation | Verification Check |
|
|
|---------|------|----------------|--------------------|
|
|
| Secrets pattern detection (edit) | Automated | `pre-edit-secrets.mjs` blocks writes containing API keys, passwords, tokens | Hook exists; `knowledge/secrets-patterns.md` is present |
|
|
| Path guard for sensitive files | Automated | `pre-write-pathguard.mjs` blocks writes to `.env`, `*.key`, `credentials.*`, `.aws/` | Hook exists; sensitive path list is up to date |
|
|
| MCP output scanning | Automated | `post-mcp-verify.mjs` scans MCP responses for PII or secret patterns | Hook registered for PostToolUse/Bash |
|
|
| `.gitignore` discipline | Configured | `.env`, `*.key`, `*.pem`, `secrets.*` in `.gitignore` | Project `.gitignore` includes standard secret exclusions |
|
|
| No secrets in CLAUDE.md | Advisory | `/security audit` checks CLAUDE.md and agents for embedded secrets | Audit report shows no secret patterns in markdown files |
|
|
| Env-var pattern enforcement | Configured | Templates use `.env`/`.template` pattern; actual values never committed | No `.env` files tracked in git (`git ls-files *.env` empty) |
|
|
|
|
---
|
|
|
|
### LLM03 — Supply Chain Vulnerabilities (MITRE ATLAS: AML.T0010)
|
|
|
|
Compromised models, plugins, or MCP servers introduce malicious behavior.
|
|
|
|
| Control | Type | Implementation | Verification Check |
|
|
|---------|------|----------------|--------------------|
|
|
| MCP server audit | Advisory | `/security mcp-audit` reviews all MCP configs for source, permissions, network exposure | MCP audit report exists and is current |
|
|
| Plugin source verification | Advisory | `/security scan` on skill/agent files before activation | Skill scanner report clean for all installed plugins |
|
|
| Dependency pinning | Configured | MCP server dependencies pinned to specific versions in `package.json` or `requirements.txt` | No unpinned `latest` or `*` versions in MCP server deps |
|
|
| Pre-deploy checklist | Advisory | `/security pre-deploy` includes supply chain verification step | Pre-deploy report completed before production deployment |
|
|
| Minimal MCP permissions | Configured | MCP servers granted only required scopes; no wildcard access | MCP configs do not use `*` scope grants |
|
|
|
|
---
|
|
|
|
### LLM04 — Data and Model Poisoning (MITRE ATLAS: AML.T0020)
|
|
|
|
Malicious training data or fine-tuning corrupts model behavior.
|
|
|
|
| Control | Type | Implementation | Verification Check |
|
|
|---------|------|----------------|--------------------|
|
|
| Use vetted base models only | External | Organizational policy: approved model list from provider (Anthropic, Azure OpenAI) | Model IDs in config match approved list |
|
|
| No untrusted fine-tuning | External | Fine-tuning pipelines gated by data review process | Fine-tuning dataset provenance documented |
|
|
| Knowledge base integrity | Advisory | `/security audit` checks knowledge files for injected malicious content | Audit covers `knowledge/` directories |
|
|
| Prompt content review | Advisory | Skill scanner checks agent/command prompts for anomalous instructions | `skill-scanner-agent` run on all agents |
|
|
| Threat model coverage | Advisory | `/security threat-model` includes data pipeline as attack surface | Threat model document exists and covers data sources |
|
|
|
|
---
|
|
|
|
### LLM05 — Improper Output Handling (MITRE ATLAS: AML.T0043)
|
|
|
|
Model output treated as trusted without sanitization, leading to injection in downstream systems.
|
|
|
|
| Control | Type | Implementation | Verification Check |
|
|
|---------|------|----------------|--------------------|
|
|
| MCP output verification | Automated | `post-mcp-verify.mjs` scans tool outputs before they reach downstream consumers | Hook registered and active |
|
|
| Destructive command blocking | Automated | `pre-bash-destructive.mjs` prevents shell injection from model-generated commands | Hook exists; blocklist includes `rm -rf`, `DROP TABLE`, `curl \| sh` patterns |
|
|
| No direct shell execution of model output | Configured | CLAUDE.md explicitly prohibits passing raw model output to `eval` or shell | CLAUDE.md has output-handling guardrail |
|
|
| Output template enforcement | Advisory | Report templates in `templates/` provide structured output that avoids raw passthrough | Templates used by scan/audit commands |
|
|
| Code review before execution | Advisory | `/security pre-deploy` requires human review of model-generated scripts | Pre-deploy checklist includes output review step |
|
|
|
|
---
|
|
|
|
### LLM06 — Excessive Agency (MITRE ATLAS: AML.T0061)
|
|
|
|
Model granted too many permissions or capabilities, enabling unintended high-impact actions.
|
|
|
|
| Control | Type | Implementation | Verification Check |
|
|
|---------|------|----------------|--------------------|
|
|
| Deny-first permissions | Configured | `settings.json` starts from deny-all; explicit allow-list per command | `settings.json` does not use broad `"allow": ["*"]` |
|
|
| Tool allowlist per command | Configured | Each command's frontmatter declares minimum required tools | All `commands/*.md` have explicit `allowed-tools` list |
|
|
| Agent tool restriction | Configured | Agent frontmatter limits tools to Read/Glob/Grep unless justified | Agents do not have Write/Bash without documented rationale |
|
|
| Over-permissioning scan | Advisory | `skill-scanner-agent` flags commands/agents with excessive tool grants | Skill scanner report shows no over-permissioning findings |
|
|
| No autonomous external calls | Configured | Agents restricted from making unapproved network calls via Bash | `pre-bash-destructive.mjs` blocks `curl`, `wget` without approval |
|
|
| Human-in-the-loop for destructive ops | Automated | Destructive bash commands blocked; require explicit user re-invocation | Hook blocks and logs; no auto-bypass mechanism |
|
|
|
|
---
|
|
|
|
### LLM07 — System Prompt Leakage (MITRE ATLAS: AML.T0024)
|
|
|
|
System prompt or CLAUDE.md exposed through adversarial extraction, revealing security controls.
|
|
|
|
| Control | Type | Implementation | Verification Check |
|
|
|---------|------|----------------|--------------------|
|
|
| Security-by-design (not obscurity) | Configured | Controls enforced by hooks and settings, not just prompt instructions | Hooks exist independently of CLAUDE.md instructions |
|
|
| No secrets in system prompt | Advisory | `/security audit` checks CLAUDE.md for embedded secrets or keys | Audit report clean for CLAUDE.md content |
|
|
| Minimal sensitive detail in prompts | Configured | CLAUDE.md describes policy intent, not implementation bypass paths | CLAUDE.md reviewed for info that aids bypass |
|
|
| Prompt disclosure awareness | Advisory | Threat model documents that CLAUDE.md may be readable by the model | Threat model includes system prompt as attack surface |
|
|
| Defense in depth | Configured | Multiple independent control layers so prompt leakage does not collapse security | Hooks + settings + CLAUDE.md all present (not sole reliance on one layer) |
|
|
|
|
---
|
|
|
|
### LLM08 — Vector and Embedding Weaknesses (MITRE ATLAS: AML.T0020)
|
|
|
|
Manipulated embeddings or vector store content used to inject malicious context into RAG pipelines.
|
|
|
|
| Control | Type | Implementation | Verification Check |
|
|
|---------|------|----------------|--------------------|
|
|
| Knowledge base content review | Advisory | `/security audit` scans `knowledge/` files for injected instructions | Audit includes knowledge base scan |
|
|
| Source attribution in KB | Configured | Knowledge files include source and date metadata | KB files have provenance headers |
|
|
| RAG input sanitization | External | Vector store / RAG pipeline sanitizes retrieved chunks before injection | RAG pipeline has input validation (organizational control) |
|
|
| Embedding access control | External | Vector stores gated by IAM; not publicly writable | Access control documented for vector infrastructure |
|
|
| Retrieval result verification | Advisory | Agents instructed to verify retrieved content plausibility before use | Agent prompts include retrieval skepticism instruction |
|
|
|
|
---
|
|
|
|
### LLM09 — Misinformation (MITRE ATLAS: AML.T0031)
|
|
|
|
Model generates plausible but false information, leading to incorrect decisions.
|
|
|
|
| Control | Type | Implementation | Verification Check |
|
|
|---------|------|----------------|--------------------|
|
|
| Authoritative knowledge base | Configured | Plugin uses curated `knowledge/` files as grounding for security recommendations | `knowledge/` directory contains up-to-date OWASP and threat pattern files |
|
|
| Source citation in outputs | Configured | Commands instruct agents to cite knowledge file sources in reports | Report templates include source section |
|
|
| Human review gate | Advisory | All advisory reports require human review before action | CLAUDE.md and command docs state reports are advisory, not authoritative |
|
|
| Threat model validation | Advisory | `/security threat-model` output reviewed by security professional | Threat model review step documented in pre-deploy checklist |
|
|
| Confidence indicators | Advisory | Agents use hedged language for uncertain findings | Agent prompts instruct use of `HIGH/MEDIUM/LOW` confidence levels |
|
|
| Hallucination risk documentation | Configured | CLAUDE.md explicitly documents that AI outputs require validation | CLAUDE.md contains disclaimer on AI-generated security findings |
|
|
|
|
---
|
|
|
|
### LLM10 — Unbounded Consumption (MITRE ATLAS: AML.T0029)
|
|
|
|
Model or agents consume excessive compute, tokens, or API calls, causing denial of service or cost overruns.
|
|
|
|
| Control | Type | Implementation | Verification Check |
|
|
|---------|------|----------------|--------------------|
|
|
| Scoped scanning targets | Configured | Commands accept explicit file/directory targets; no default full-repo scan | `scan.md` and `audit.md` require explicit scope argument |
|
|
| Agent timeout discipline | Configured | Agents instructed to limit research depth and report within scope | Agent prompts include scope and depth constraints |
|
|
| No recursive agent spawning | Configured | Agents do not spawn additional agents without explicit command | Agent frontmatter and prompts prohibit autonomous subagent creation |
|
|
| MCP call limiting | Configured | MCP-using commands have documented call budgets | `mcp-audit.md` documents expected MCP call count |
|
|
| Cost-aware model selection | Configured | Expensive operations (threat modeling) use Opus; scanning uses Sonnet | Command frontmatter uses `model: sonnet` for scan/audit, `model: opus` for threat-model |
|
|
| Session scope guard | Configured | CLAUDE.md scope-guard prevents unbounded task escalation | CLAUDE.md has scope-guard section |
|
|
|
|
---
|
|
|
|
## Coverage Summary
|
|
|
|
| Category | Name | Automated | Configured | Advisory | External | Total Controls | Coverage |
|
|
|----------|------|-----------|------------|----------|----------|----------------|----------|
|
|
| LLM01 | Prompt Injection | 9 | 3 | 1 | 0 | 13 | 92% |
|
|
| LLM02 | Sensitive Info Disclosure | 3 | 2 | 1 | 0 | 6 | 83% |
|
|
| LLM03 | Supply Chain | 0 | 2 | 3 | 0 | 5 | 60% |
|
|
| LLM04 | Data & Model Poisoning | 0 | 0 | 3 | 2 | 5 | 40% |
|
|
| LLM05 | Improper Output Handling | 2 | 2 | 1 | 0 | 5 | 80% |
|
|
| LLM06 | Excessive Agency | 3 | 3 | 0 | 0 | 6 | 100% |
|
|
| LLM07 | System Prompt Leakage | 0 | 3 | 2 | 0 | 5 | 60% |
|
|
| LLM08 | Vector & Embedding Weaknesses | 0 | 1 | 2 | 2 | 5 | 40% |
|
|
| LLM09 | Misinformation | 0 | 3 | 3 | 0 | 6 | 50% |
|
|
| LLM10 | Unbounded Consumption | 0 | 5 | 1 | 0 | 6 | 83% |
|
|
|
|
**Coverage scoring:**
|
|
- 100% = All applicable controls implemented
|
|
- 80-99% = Strong coverage, minor gaps
|
|
- 60-79% = Moderate coverage, notable gaps
|
|
- 40-59% = Partial coverage, significant gaps
|
|
- <40% = Minimal coverage — high risk
|
|
|
|
**Note:** LLM04 and LLM08 score lower because their primary controls are external (model provider and infrastructure). For Claude Code projects, these categories require organizational controls beyond what the plugin can enforce.
|
|
|
|
---
|
|
|
|
## Posture Assessor Checklist
|
|
|
|
When `posture-assessor-agent` evaluates a project, verify the following in order:
|
|
|
|
### Automated Controls (hooks) — Verify All Present
|
|
- [ ] `hooks/scripts/pre-edit-secrets.mjs` exists
|
|
- [ ] `hooks/scripts/pre-write-pathguard.mjs` exists
|
|
- [ ] `hooks/scripts/pre-bash-destructive.mjs` exists
|
|
- [ ] `hooks/scripts/post-mcp-verify.mjs` exists
|
|
- [ ] `hooks/hooks.json` registers all four hooks
|
|
|
|
### Configured Controls — Verify in settings.json and CLAUDE.md
|
|
- [ ] `settings.json` has deny-first permissions (no broad `"allow": ["*"]`)
|
|
- [ ] Command frontmatter has explicit `allowed-tools` lists
|
|
- [ ] Agent frontmatter restricts tools to minimum required
|
|
- [ ] CLAUDE.md has scope-guard / anti-override section
|
|
- [ ] `.gitignore` excludes `.env`, `*.key`, `*.pem`, `credentials.*`
|
|
- [ ] No secrets embedded in CLAUDE.md, agent prompts, or command files
|
|
|
|
### Advisory Controls — Evidence of Use
|
|
- [ ] `/security scan` report present or run recently
|
|
- [ ] `/security audit` report present or run recently
|
|
- [ ] `/security mcp-audit` report if MCP servers are configured
|
|
- [ ] `/security threat-model` report present for production systems
|
|
- [ ] `/security pre-deploy` checklist completed before deployment
|
|
|
|
### Scoring Guidance
|
|
|
|
| Automated controls present | Configured controls present | Advisory evidence | Score Band |
|
|
|----------------------------|-----------------------------|-------------------|------------|
|
|
| 5/5 | 6/6 | 3/5 | A (90+) |
|
|
| 4/5 | 5/6 | 2/5 | B (75-89) |
|
|
| 3/5 | 4/6 | 1/5 | C (60-74) |
|
|
| 2/5 | 3/6 | 0/5 | D (40-59) |
|
|
| <2/5 | <3/6 | 0/5 | F (<40) |
|