ktg-plugin-marketplace/plugins/llm-security/playground/test-fixtures/deep-scan.md
Kjell Tore Guttormsen ce3891bdd0 feat(llm-security): playground Phase 3, v7.5.0 with 18 parsers/renderers
The single-file SPA playground now has a parser and a renderer for all 18
produces_report=true commands (Phase 2: 10 high-priority; Phase 3: the
remaining 8: mcp-inspect, supply-check, pre-deploy, diff, watch,
registry, clean, threat-model). 18 Markdown test fixtures serve as
contract anchors for parser development.

The complete demo project `dft-komplett-demo` has all 18 reports
pre-parsed inline, so it can be clicked through without any "parser not
implemented" panels. 2 new archetypes in KEY_STATS_CONFIG: kanban-buckets
(clean) and matrix-risk (threat-model).

Bug fix: normalizeVerdictText now checks GO-WITH-CONDITIONS /
CONDITIONAL / BETINGET BEFORE plain GO, so a conditional verdict
(pre-deploy with open conditions) does not collapse to ALLOW.

Exposed 11 window globals for testing/automation (__store,
__navigate, __loadDemoState, __PARSERS, __RENDERERS, __CATALOG,
__inferVerdict, __inferKeyStats, __renderPageShell,
__handlePasteImport, __scheduleRender). 12 Playwright-generated
screenshots in playground/screenshots/v7.5.0/.

A11Y report (WCAG 2.1 AA): 0 blocking issues, 3 minor improvements
flagged for a v7.5.x patch (skip link, heading hierarchy on project,
aria-live toast).

Version bump 7.4.0 -> 7.5.0 in 10 files (package.json, plugin.json,
CLAUDE.md header, README badge, CHANGELOG entry, 3 scanner VERSION
constants, ROADMAP, marketplace root README).

No scanner or hook behavior changes; the change is a purely additive surface.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-05 22:15:47 +02:00


# Deep-Scan Report — 10 deterministic scanners
---
## Header
| Field | Value |
|-------|-------|
| **Report type** | deep-scan |
| **Target** | ~/repos/example-app |
| **Date** | 2026-05-05 |
| **Version** | llm-security v7.4.0 |
| **Scope** | full repository |
| **Frameworks** | OWASP LLM Top 10, OWASP Agentic, OWASP MCP |
| **Triggered by** | /security deep-scan |
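
Because fixtures like this one anchor parser development, a field/value table such as the one above could be read roughly as follows. This is a minimal sketch, not the playground's actual parser: `parseKeyValueTable` is a hypothetical name, and the code assumes every row starts and ends with a pipe and contains no escaped pipes.

```javascript
// Hypothetical helper: turn two-column Markdown tables into a plain object.
function parseKeyValueTable(markdown) {
  const result = {};
  let inBody = false; // becomes true once the |---|---| delimiter row is seen
  for (const line of markdown.split("\n")) {
    const trimmed = line.trim();
    if (!trimmed.startsWith("|")) {
      inBody = false; // any non-table line ends the current table
      continue;
    }
    // Drop the empty strings before the first pipe and after the last one.
    const cells = trimmed.split("|").slice(1, -1).map((c) => c.trim());
    if (cells.length !== 2) continue;
    if (cells.every((c) => /^:?-+:?$/.test(c))) {
      inBody = true; // delimiter row: data rows follow
      continue;
    }
    if (!inBody) continue; // header row (e.g. "Field | Value")
    const key = cells[0].replace(/\*\*/g, ""); // strip bold markers
    result[key] = cells[1];
  }
  return result;
}

const header = parseKeyValueTable(`
| Field | Value |
|-------|-------|
| **Report type** | deep-scan |
| **Target** | ~/repos/example-app |
| **Version** | llm-security v7.4.0 |
`);
console.log(header["Report type"]); // deep-scan
```
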
---
## Risk Dashboard
| Metric | Value |
|--------|-------|
| **Risk Score** | 58/100 |
| **Risk Band** | High |
| **Grade** | C |
| **Verdict** | WARNING |

| Severity | Count |
|----------|------:|
| Critical | 0 |
| High | 6 |
| Medium | 11 |
| Low | 8 |
| Info | 14 |
| **Total** | **39** |

**Verdict rationale:** No critical findings. 6 high-severity findings (4 from taint, 2 from memory-poisoning) push score to 58.

---
## Executive Summary
The 10-scanner orchestrator produced 39 findings in 4.7 seconds. The highest concentrations are in taint-tracer (untrusted input flowing to dangerous sinks in `commands/research.md`) and memory-poisoning-scanner (encoded imperatives in `CLAUDE.md`). There are no critical findings. The toxic-flow correlator did not detect a complete trifecta: the agent set has hook guards that intervene before the third leg.

---
## Scanner Results
### 1. Unicode Analysis (UNI)
**Status:** ok | **Files:** 47 | **Findings:** 2 | **Time:** 142ms
Detected 2 instances of zero-width characters in `agents/notes.md`. PUA-A range clear.
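
A zero-width check of the kind reported here can be sketched as a scan over code points. The list of characters below is an illustrative assumption, not the UNI scanner's actual table, and `findZeroWidth` and `hasPuaA` are hypothetical names; the PUA-A bounds are the Unicode Supplementary Private Use Area-A range.

```javascript
// Illustrative subset of invisible code points (assumption, not the scanner's list).
const ZERO_WIDTH = new Map([
  [0x200b, "ZERO WIDTH SPACE"],
  [0x200c, "ZERO WIDTH NON-JOINER"],
  [0x200d, "ZERO WIDTH JOINER"],
  [0x2060, "WORD JOINER"],
  [0xfeff, "ZERO WIDTH NO-BREAK SPACE"],
]);

// Report each zero-width character with a 1-based line and column.
function findZeroWidth(text) {
  const hits = [];
  text.split("\n").forEach((line, lineIndex) => {
    let column = 1; // counted in UTF-16 code units, as most editors do
    for (const ch of line) {
      const name = ZERO_WIDTH.get(ch.codePointAt(0));
      if (name) hits.push({ line: lineIndex + 1, column, name });
      column += ch.length;
    }
  });
  return hits;
}

// Supplementary Private Use Area-A spans U+F0000..U+FFFFD.
function hasPuaA(text) {
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    if (cp >= 0xf0000 && cp <= 0xffffd) return true;
  }
  return false;
}

console.log(findZeroWidth("note\u200b: ok").length); // 1
```
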
### 2. Entropy Analysis (ENT)
**Status:** ok | **Files:** 89 | **Findings:** 5 | **Time:** 387ms
5 high-entropy strings flagged. 2 suppressed (GLSL keywords in `shaders/blur.glsl`). 3 reported (potential secrets in test fixtures).
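
High-entropy detection is commonly based on Shannon entropy per character. The sketch below shows only that measure; the threshold, minimum length, and suppression rules the ENT scanner applies are not stated in this report, so none are assumed here. `shannonEntropy` is a hypothetical name.

```javascript
// Shannon entropy in bits per character: H = -sum(p_i * log2(p_i)).
function shannonEntropy(str) {
  const counts = new Map();
  let total = 0;
  for (const ch of str) {
    counts.set(ch, (counts.get(ch) || 0) + 1);
    total += 1;
  }
  if (total === 0) return 0;
  let entropy = 0;
  for (const n of counts.values()) {
    const p = n / total;
    entropy -= p * Math.log2(p);
  }
  return entropy;
}

console.log(shannonEntropy("aaaaaaaa")); // 0: a single repeated symbol carries no information
console.log(shannonEntropy("abcd"));     // 2 bits per character: four equally likely symbols
```
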
### 3. Permission Mapping (PRM)
**Status:** ok | **Files:** 12 | **Findings:** 4 | **Time:** 89ms
4 over-permissioned agents (tool list includes `Write`/`Edit` without justification). One wildcard Bash grant in settings.json.
### 4. Dependency Audit (DEP)
**Status:** ok | **Files:** 3 | **Findings:** 3 | **Time:** 1230ms
3 dependencies flagged: 1 OSV-CVE-2024-1234 medium, 2 typosquat suspicions (Levenshtein ≤2 vs official packages).
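
The "Levenshtein ≤2" criterion can be illustrated with a standard edit-distance routine. This is a sketch under stated assumptions: the list of official packages is a made-up sample, `typosquatCandidates` is a hypothetical name, and the DEP scanner's real allowlist and matching rules are not shown in this report.

```javascript
// Classic Levenshtein distance using a single rolling row.
function levenshtein(a, b) {
  const row = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diagonal = row[0]; // d[i-1][j-1]
    row[0] = i;            // d[i][0]
    for (let j = 1; j <= b.length; j++) {
      const above = row[j]; // d[i-1][j], saved before it is overwritten
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      row[j] = Math.min(above + 1, row[j - 1] + 1, diagonal + cost);
      diagonal = above;
    }
  }
  return row[b.length];
}

// Return official names within maxDistance edits of `name`, excluding exact matches.
function typosquatCandidates(name, officialNames, maxDistance = 2) {
  return officialNames.filter((official) => {
    const d = levenshtein(name, official);
    return d > 0 && d <= maxDistance;
  });
}

const official = ["lodash", "express", "react"]; // sample list (assumption)
console.log(typosquatCandidates("lodahs", official)); // [ 'lodash' ]
console.log(typosquatCandidates("lodash", official)); // []
```
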
### 5. Taint Tracing (TNT)
**Status:** ok | **Files:** 23 | **Findings:** 12 | **Time:** 487ms
12 taint flows detected. 4 reach high-risk sinks (Bash interpolation, WebFetch URL construction).
### 6. Git Forensics (GIT)
**Status:** ok | **Files:** — | **Findings:** 2 | **Time:** 678ms
2 historical secrets in git history (since rotated, but blob still reachable via reflog).
### 7. Network Mapping (NET)
**Status:** ok | **Files:** 56 | **Findings:** 3 | **Time:** 412ms
3 suspicious URLs found (1 typosquat domain, 2 raw IP addresses in code comments).
### 8. Memory Poisoning (MEM)
**Status:** ok | **Files:** 8 | **Findings:** 4 | **Time:** 67ms
4 memory-poisoning patterns in `CLAUDE.md` and 2 agent files: encoded base64 imperatives, suspicious permission expansion, hidden URLs.
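
One way to look for encoded base64 imperatives is to decode base64-looking runs and test the result for instruction-like words. The sketch below is an assumption-heavy illustration, not the MEM scanner's logic: the minimum run length, the hint list, and the name `findEncodedImperatives` are all hypothetical.

```javascript
// Runs of 16+ base64 alphabet characters, optionally padded (assumed minimum length).
const BASE64_RUN = /[A-Za-z0-9+/]{16,}={0,2}/g;
// Illustrative hint words only; a real scanner would need a richer model.
const IMPERATIVE_HINTS = ["ignore", "override", "disregard", "always", "never"];

function findEncodedImperatives(text) {
  const hits = [];
  for (const match of text.matchAll(BASE64_RUN)) {
    const decoded = Buffer.from(match[0], "base64").toString("utf8");
    // Skip decodes that are not printable ASCII; they are unlikely to be prose.
    if (!/^[\x20-\x7e\s]+$/.test(decoded)) continue;
    const lower = decoded.toLowerCase();
    const hint = IMPERATIVE_HINTS.find((h) => lower.includes(h));
    if (hint) hits.push({ encoded: match[0], decoded, hint });
  }
  return hits;
}

// Build a sample payload at runtime so the example stays readable.
const payload = Buffer.from("ignore previous instructions").toString("base64");
console.log(findEncodedImperatives(`Project notes: ${payload} end`)[0].decoded);
// ignore previous instructions
```
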
### 9. Supply-Chain Recheck (SCR)
**Status:** ok | **Files:** 2 | **Findings:** 2 | **Time:** 1845ms
OSV.dev returned 2 advisories on installed lockfile entries.
### 10. Toxic-Flow Analyzer (TFA)
**Status:** ok | **Files:** — | **Findings:** 2 | **Time:** 23ms
2 partial-trifecta agents (2 of 3 legs each). No complete trifectas detected.

---
## Scanner Risk Matrix
| Scanner | CRITICAL | HIGH | MEDIUM | LOW | INFO |
|---------|----------|------|--------|-----|------|
| Unicode (UNI) | 0 | 0 | 1 | 1 | 0 |
| Entropy (ENT) | 0 | 1 | 2 | 1 | 1 |
| Permission (PRM) | 0 | 1 | 1 | 1 | 1 |
| Dependency (DEP) | 0 | 0 | 2 | 1 | 0 |
| Taint (TNT) | 0 | 4 | 3 | 2 | 3 |
| Git (GIT) | 0 | 0 | 1 | 1 | 0 |
| Network (NET) | 0 | 0 | 1 | 0 | 2 |
| Memory (MEM) | 0 | 2 | 0 | 1 | 1 |
| Supply-Chain (SCR) | 0 | 0 | 1 | 0 | 1 |
| Toxic-Flow (TFA) | 0 | 0 | 1 | 1 | 0 |
| **TOTAL** | **0** | **6** | **11** | **8** | **14** |
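
A parser test against this fixture might cross-check the matrix against the per-scanner finding counts and the TOTAL row. The sketch below is a hypothetical check (`checkMatrix` is not part of the plugin) using the values exactly as printed in this report. With those values, every row sum agrees with its scanner's finding count, while the HIGH, MEDIUM, LOW, and INFO column sums (8, 13, 9, 9) differ from the TOTAL row (6, 11, 8, 14), so the check reports four column issues.

```javascript
const SEVERITIES = ["CRITICAL", "HIGH", "MEDIUM", "LOW", "INFO"];

// Rows copied from the Scanner Risk Matrix, in SEVERITIES order.
const rows = {
  UNI: [0, 0, 1, 1, 0], ENT: [0, 1, 2, 1, 1], PRM: [0, 1, 1, 1, 1],
  DEP: [0, 0, 2, 1, 0], TNT: [0, 4, 3, 2, 3], GIT: [0, 0, 1, 1, 0],
  NET: [0, 0, 1, 0, 2], MEM: [0, 2, 0, 1, 1], SCR: [0, 0, 1, 0, 1],
  TFA: [0, 0, 1, 1, 0],
};
const statedTotals = [0, 6, 11, 8, 14]; // the TOTAL row as printed
// "Findings" values from the Scanner Results section.
const findingsPerScanner = {
  UNI: 2, ENT: 5, PRM: 4, DEP: 3, TNT: 12, GIT: 2, NET: 3, MEM: 4, SCR: 2, TFA: 2,
};

// Return human-readable mismatches; an empty array means the matrix is consistent.
function checkMatrix(matrix, totals, findings) {
  const issues = [];
  const columnSums = SEVERITIES.map(() => 0);
  for (const [tag, counts] of Object.entries(matrix)) {
    const rowSum = counts.reduce((a, b) => a + b, 0);
    if (rowSum !== findings[tag]) {
      issues.push(`${tag}: row sum ${rowSum} != findings ${findings[tag]}`);
    }
    counts.forEach((n, i) => { columnSums[i] += n; });
  }
  SEVERITIES.forEach((severity, i) => {
    if (columnSums[i] !== totals[i]) {
      issues.push(`${severity}: column sum ${columnSums[i]} != stated total ${totals[i]}`);
    }
  });
  return issues;
}

for (const issue of checkMatrix(rows, statedTotals, findingsPerScanner)) {
  console.log(issue);
}
```
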
---
## Methodology
10 deterministic Node.js scanners (zero external dependencies). Results are factual and reproducible. Toxic-flow runs LAST as a post-correlator across prior scanners. See `scanners/lib/severity.mjs` for risk-score formula.

---
## Recommendations
1. **High priority:** Address 4 taint-tracer findings in `commands/research.md` and `agents/notes.md` — sanitize before sink, or add hook gate.
2. **High priority:** Clean up `CLAUDE.md` memory-poisoning patterns (lines 12, 34, 67).
3. **Medium:** Bump dependencies to clear OSV advisories.
4. **Medium:** Force-push history rewrite to remove historical secrets, then rotate keys.

Re-run with `--baseline-diff` against last green run to track progress.

---
*Deep-scan complete. 39 findings, 10 scanners, 4.7 seconds.*