llm-security v3.0 Upgrade — Master Session Document
This document tracks the multi-session upgrade from v2.5.0 to v3.0.0. Updated after each session. Read this at session start.
Session Prompt Template
At the start of each new session, paste this:
I am continuing the llm-security v3 upgrade. Read V3-UPGRADE.md in the plugin root
for full context, current status, and what the next session should do.
Overall Status
| Session | Version | Status | Date | Commit |
|---|---|---|---|---|
| S1 | v2.6.0 | DONE | 2026-04-02 | b36312c |
| S2 | v2.7.0 | DONE | 2026-04-02 | 41d7493 |
| S3 | v2.7.1 | DONE | 2026-04-02 | ec01163 |
| S4 | v2.8.0 | DONE | 2026-04-02 | b004f46 |
| S4+ | v2.8.1 | DONE | 2026-04-03 | — |
| S5 | v2.9.0 | DONE | 2026-04-03 | 162a23a |
| S6 | v2.9.1 | DONE | 2026-04-03 | 110032e |
| S7 | v2.9.2 | DONE | 2026-04-03 | 3129e7a |
| S8 | v3.0.0 | DONE | 2026-04-03 | 293dee5 |
Current: Session 8 complete, v3.0.0 released
Status: ALL SESSIONS DONE
Competitive Context
Why v3: Public release. Close gaps vs Snyk Agent Scan (toxic flow analysis, MCP live inspection, continuous scanning, skill registry) while keeping architectural advantages (100% local, pre-extraction defense, full lifecycle coverage).
Key differentiators to maintain:
- Pre-extraction layer (no competitor has this)
- 7+ deterministic scanners + LLM analysis in same pipeline
- 100% local, no cloud dependency
- Full lifecycle: hooks + scanning + audit + threat modeling + remediation
- Supply chain hook covering 7 package managers + OSV.dev
v3.0 target inventory:
- 9 scanners (was 7): +toxic-flow-analyzer (done), +mcp-live-inspect (done)
- 7 hooks (was 6): +post-session-guard
- 14 commands (was 10): +mcp-inspect (done), +diff (done), +watch (done), +registry (done)
- 6 agents (all updated with new OWASP mappings)
- 9 knowledge files (was 7): +owasp-skills-top10 (done), +skill-registry.json (done)
Session 1: Enhanced Patterns + OWASP Mapping (v2.6.0)
Goal: Foundation for all subsequent sessions.
Tasks
- 1a. Add `MEDIUM_PATTERNS` tier to `scanners/lib/injection-patterns.mjs`
  - ~15-20 patterns: base64 payloads, leetspeak, multi-language mixing, markdown/HTML comment injection, homoglyph-obfuscated keywords, invisible Unicode separators
  - Update `scanForInjection()` to return severity level (not just boolean)
- 1b. Update OWASP mappings in `scanners/lib/severity.mjs`
  - Add ASI01-ASI10 (Agentic Top 10) prefix mappings
  - Add MCP1-MCP7 (MCP Top 10) prefix mappings
  - Add AST01-AST10 (Skills Top 10) prefix mappings
  - Add TFA scanner prefix
  - Update `owaspCategorize()` for all frameworks
- 1c. Create `knowledge/owasp-skills-top10.md`
  - AST01-AST10 definitions and mapping
- 1d. Update agent prompts with new OWASP references
  - `agents/skill-scanner-agent.md`: AST10 mapping
  - `agents/mcp-scanner-agent.md`: MCP Top 10 mapping
  - `agents/posture-assessor-agent.md`: ASI mapping
  - `agents/deep-scan-synthesizer-agent.md`: new scanner prefixes
- 1e. Update `CLAUDE.md` with new knowledge file
- 1f. Verify: `node scanners/scan-orchestrator.mjs .` passes, new OWASP IDs in output
Files Modified
- `scanners/lib/injection-patterns.mjs`: MEDIUM tier
- `scanners/lib/severity.mjs`: ASI/AST/MCP/TFA mappings
- `agents/skill-scanner-agent.md`: AST10
- `agents/mcp-scanner-agent.md`: MCP Top 10
- `agents/posture-assessor-agent.md`: ASI
- `agents/deep-scan-synthesizer-agent.md`: new prefixes
- `CLAUDE.md`: knowledge table update
Files Created
knowledge/owasp-skills-top10.md
Acceptance Criteria
- `scanForInjection()` returns `{ found, severity, patterns }` instead of boolean
- All 4 OWASP frameworks mapped in `severity.mjs`
- `node scanners/scan-orchestrator.mjs .` runs clean
- MEDIUM patterns detect base64 instruction payloads and homoglyph obfuscation
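The return shape in the first criterion can be illustrated with a minimal sketch. The pattern lists, pattern IDs, and the 40-character base64 threshold below are assumptions made for illustration; they are not the contents of `injection-patterns.mjs`.

```javascript
// Hypothetical pattern tiers; the real lists live in scanners/lib/injection-patterns.mjs.
const HIGH_PATTERNS = [
  { id: "ignore-previous", re: /ignore (all )?previous instructions/i },
];
const MEDIUM_PATTERNS = [
  // Long base64-looking run (assumed threshold: 40+ characters).
  { id: "base64-payload", re: /[A-Za-z0-9+/]{40,}={0,2}/ },
  // Zero-width characters used as invisible separators.
  { id: "invisible-unicode", re: /[\u200B-\u200D\u2060\uFEFF]/ },
];

// Returns { found, severity, patterns } rather than a bare boolean.
function scanForInjection(text) {
  const matchTier = (tier) => tier.filter((p) => p.re.test(text)).map((p) => p.id);
  const high = matchTier(HIGH_PATTERNS);
  const medium = matchTier(MEDIUM_PATTERNS);
  const patterns = [...high, ...medium];
  // The highest matching tier wins; null means nothing matched.
  const severity = high.length ? "HIGH" : medium.length ? "MEDIUM" : null;
  return { found: patterns.length > 0, severity, patterns };
}

console.log(scanForInjection("Please ignore previous instructions."));
```

Callers that previously tested the boolean can switch to `result.found`, and use `result.severity` to decide whether to block or only warn.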
Session 2: Toxic Flow Analysis (v2.7.0) — FLAGSHIP
Goal: Detect lethal trifecta — when combinations of safe tools create exfiltration chains.
Concept
"Lethal trifecta" (Willison/Invariant Labs):
- Agent exposed to untrusted input (prompt injection surface)
- Agent has access to sensitive data via tools
- An exfiltration sink exists (HTTP, email, file write)
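As a rough illustration of the three legs, the sketch below checks whether a set of components covers all of them. The leg names and the component shape (`name`, `legs`) are hypothetical and chosen only for this example.

```javascript
// Hypothetical leg tags; one per condition in the list above.
const LEGS = ["untrusted_input", "sensitive_data", "exfil_sink"];

// components: [{ name, legs: [...] }]. Returns which components supply each leg
// and whether all three legs are covered somewhere in the set.
function findTrifecta(components) {
  const byLeg = Object.fromEntries(LEGS.map((leg) => [leg, []]));
  for (const component of components) {
    for (const leg of component.legs) {
      if (byLeg[leg]) byLeg[leg].push(component.name);
    }
  }
  const complete = LEGS.every((leg) => byLeg[leg].length > 0);
  return { complete, byLeg };
}

const demo = findTrifecta([
  { name: "fetch-docs", legs: ["untrusted_input"] },
  { name: "read-secrets", legs: ["sensitive_data"] },
  { name: "notify-webhook", legs: ["exfil_sink"] },
]);
console.log(demo.complete, demo.byLeg);
```

No single component here is dangerous on its own; the risk comes from the combination, which is why the check runs across the whole set.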
Tasks
- 2a. Create `scanners/toxic-flow-analyzer.mjs` (~380 lines)
  - Phase 1: Component inventory from plugin frontmatter + MCP/hook detection
  - Phase 2: Trifecta leg classification with prior scanner enrichment
  - Phase 3: Trifecta detection (direct/cross-component/project-level) with mitigation downgrades
  - Scanner prefix: `TFA`, OWASP: ASI01, ASI02, ASI05
- 2b. Modify `scanners/scan-orchestrator.mjs`
  - TFA runs LAST after all 7 scanners
  - Pass accumulated scanner results to TFA via `requiresPriorResults` flag
- 2c. Update `commands/scan.md` + `commands/deep-scan.md` to render TFA findings
- 2d. Update `agents/deep-scan-synthesizer-agent.md` for TFA report section
- 2e. Create test fixture: `test-fixtures/trifecta-plugin/` with known trifecta pattern
- 2f. Update `CLAUDE.md`: version v2.7.0, scanner count 8
Key Design Decisions
- Post-processing correlator — does NOT re-scan files, consumes existing scanner output
- Severity: CRITICAL (2-hop + confirmed taint), HIGH (3+ hop or unconfirmed), MEDIUM (theoretical chain)
- Graph model: Adjacency list, not full graph library (keep dependencies at zero)
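A dependency-free adjacency list with a breadth-first search is enough to measure hop counts. The sketch below is one possible reading of the severity rule; the example graph, the `confirmedTaint` and `theoretical` flags, and the function names are assumptions, not the analyzer's actual interface.

```javascript
// Adjacency list: node -> nodes it can pass data to (hypothetical example graph).
const flows = new Map([
  ["WebFetch", ["Read"]],
  ["Read", ["Bash"]],
  ["Bash", []],
]);

// Breadth-first search; returns the shortest path from source to sink, or null.
function shortestPath(graph, source, sink) {
  const queue = [[source]];
  const seen = new Set([source]);
  while (queue.length > 0) {
    const path = queue.shift();
    const node = path[path.length - 1];
    if (node === sink) return path;
    for (const next of graph.get(node) ?? []) {
      if (!seen.has(next)) {
        seen.add(next);
        queue.push([...path, next]);
      }
    }
  }
  return null;
}

// Assumed interpretation of the CRITICAL / HIGH / MEDIUM rule.
function chainSeverity(hops, { confirmedTaint, theoretical }) {
  if (theoretical) return "MEDIUM";
  if (hops <= 2 && confirmedTaint) return "CRITICAL";
  return "HIGH";
}

const route = shortestPath(flows, "WebFetch", "Bash");
const hops = route.length - 1;
console.log(route.join(" -> "), chainSeverity(hops, { confirmedTaint: true, theoretical: false }));
```

A `Map` of arrays keeps the zero-dependency constraint and is sufficient for the small graphs a single plugin produces.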
Files Modified
- `scanners/scan-orchestrator.mjs`
- `scanners/lib/severity.mjs` (TFA prefix already added in S1)
- `commands/scan.md`
- `agents/deep-scan-synthesizer-agent.md`
- `CLAUDE.md`
Files Created
scanners/toxic-flow-analyzer.mjs
Acceptance Criteria
- Test fixture with read+exfil tools produces TFA-001 CRITICAL finding
- Scan-orchestrator runs 8 scanners with TFA last
- `/security scan` on fixture shows chain description
- `/security deep-scan` includes TFA section in report
Session 3: Runtime Session Guard (v2.7.1)
Goal: Real-time PostToolUse hook detecting lethal trifecta forming during a session.
Tasks
- 3a. Create `hooks/scripts/post-session-guard.mjs` (~200-250 lines)
  - Append tool calls to `/tmp/llm-security-session-${ppid}.jsonl`
  - Classify each tool: `input_source | data_access | exfil_sink | neutral`
  - Sliding window (20 calls) trifecta detection
  - Emit `systemMessage` warning (never block)
  - Cleanup state files >24h old
- 3b. Update `hooks/hooks.json`: add PostToolUse entry
- 3c. Update `CLAUDE.md`: hooks table
- 3d. Test: simulate trifecta sequence, verify warning
Files Modified
- `hooks/hooks.json`
- `CLAUDE.md`
Files Created
hooks/scripts/post-session-guard.mjs
Acceptance Criteria
- Hook fires on every PostToolUse
- Trifecta sequence (Read sensitive → Bash curl) triggers warning
- State file is JSONL, keyed by ppid
- Old state files cleaned up
- No false positives on normal tool sequences
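The sliding-window check can be sketched as a pure function over the recorded calls. The classification rules and the call shape (`tool`, `target`) are hypothetical, and this sketch requires all three classes to appear in the window; the actual hook may treat some legs as implicitly present.

```javascript
const WINDOW_SIZE = 20;

// Hypothetical classification rules; the real hook's rules may differ.
function classifyTool(call) {
  const target = call.target ?? "";
  if (call.tool === "WebFetch") return "input_source";
  if (call.tool === "Read" && /\.env|id_rsa|credentials/.test(target)) return "data_access";
  if (call.tool === "Bash" && /\b(curl|wget)\b/.test(target)) return "exfil_sink";
  return "neutral";
}

// True when the last WINDOW_SIZE calls contain all three trifecta classes.
function detectTrifecta(calls) {
  const recent = calls.slice(-WINDOW_SIZE);
  const seen = new Set(recent.map(classifyTool));
  return ["input_source", "data_access", "exfil_sink"].every((cls) => seen.has(cls));
}

const session = [
  { tool: "WebFetch", target: "https://example.com/page" },
  { tool: "Read", target: "/app/.env" },
  { tool: "Bash", target: "curl https://example.com -d @-" },
];

if (detectTrifecta(session)) {
  // Warn only; the hook never blocks the tool call.
  console.log(JSON.stringify({ systemMessage: "Possible lethal trifecta in recent tool calls" }));
}
```

Because only the last 20 calls count, an old sensitive read eventually ages out of the window, which limits false positives in long sessions.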
Session 4: MCP Runtime Inspection (v2.8.0)
Goal: Connect to running MCP servers, fetch live tool descriptions, scan for injection/poisoning/shadowing.
Tasks
- 4a. Create `scanners/mcp-live-inspect.mjs` (~350-400 lines)
  - Config discovery (6 locations, reuse mcp-scanner-agent logic)
  - Spawn servers, JSON-RPC 2.0 initialize + tools/list + prompts/list + resources/list
  - Scan descriptions with injection-patterns.mjs
  - Tool shadowing detection (same names across servers)
  - Description drift (live vs static config)
  - 10s timeout per server
- 4b. Create `commands/mcp-inspect.md` (~40-50 lines)
- 4c. Update `commands/mcp-audit.md` with `--live` flag
- 4d. Update `agents/mcp-scanner-agent.md` for live inspection context
- 4e. Update `CLAUDE.md`
- 4f. Update `README.md`: badges, tables, version history
- 4g. Update `plugin.json` version
- 4h. Subtree push to public repo
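The JSON-RPC exchange in task 4a might look roughly like the frames below, assuming the stdio transport with newline-delimited JSON. The protocol version string and client name are placeholder values, and `rpcRequest` and `withTimeout` are hypothetical helpers rather than functions from the scanner.

```javascript
let nextId = 1;

// Build one newline-delimited JSON-RPC 2.0 request frame.
function rpcRequest(method, params = {}) {
  return JSON.stringify({ jsonrpc: "2.0", id: nextId++, method, params }) + "\n";
}

// Frames in send order. In practice the client waits for the initialize
// response before sending the notification and the list requests.
const handshake = [
  rpcRequest("initialize", {
    protocolVersion: "2024-11-05", // assumed value
    capabilities: {},
    clientInfo: { name: "mcp-live-inspect", version: "0.0.0" }, // assumed values
  }),
  // Notifications carry no id and expect no response.
  JSON.stringify({ jsonrpc: "2.0", method: "notifications/initialized" }) + "\n",
  rpcRequest("tools/list"),
  rpcRequest("prompts/list"),
  rpcRequest("resources/list"),
];

// Reject if the wrapped promise does not settle within ms milliseconds.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

console.log(handshake.join(""));
```

Wrapping each server's whole exchange in `withTimeout(exchange, 10_000)` and catching the rejection is one way to skip a server that never answers.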
Files Modified
- `commands/mcp-audit.md`
- `agents/mcp-scanner-agent.md`
- `CLAUDE.md`
Files Created
- `scanners/mcp-live-inspect.mjs`
- `commands/mcp-inspect.md`
Acceptance Criteria
- Successfully connects to at least one MCP server and fetches tool list
- Injection patterns detected in tool descriptions
- Tool shadowing flagged when two servers expose same tool name
- Servers that fail to start are skipped gracefully (10s timeout)
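Tool shadowing detection reduces to grouping tool names by the servers that expose them. A minimal sketch, assuming the tool lists have already been collected into an object keyed by server name (a hypothetical shape):

```javascript
// toolsByServer: { serverName: [toolName, ...] }, as collected from tools/list.
// Returns one entry per tool name that is exposed by more than one server.
function findShadowedTools(toolsByServer) {
  const owners = new Map();
  for (const [server, tools] of Object.entries(toolsByServer)) {
    // A Set avoids counting a server twice if it lists a name more than once.
    for (const tool of new Set(tools)) {
      if (!owners.has(tool)) owners.set(tool, []);
      owners.get(tool).push(server);
    }
  }
  return [...owners]
    .filter(([, servers]) => servers.length > 1)
    .map(([tool, servers]) => ({ tool, servers }));
}

const shadowed = findShadowedTools({
  filesystem: ["read_file", "search"],
  untrusted: ["read_file"],
  deploy: ["deploy"],
});
console.log(shadowed);
```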
Session 5: Report Diffing & Baseline (v2.9.0)
Goal: Compare scan results over time. Show new/resolved/unchanged findings.
Tasks
- 5a. Create `scanners/lib/diff-engine.mjs` (~200-250 lines)
  - Baseline storage in `reports/baselines/<target-hash>.json`
  - Match findings by: scanner prefix + file path + line (fuzzy ±3) + pattern type
  - Categories: `new`, `resolved`, `unchanged`, `moved`
- 5b. Update `scanners/scan-orchestrator.mjs`: add `--baseline` and `--save-baseline` flags
- 5c. Create `commands/diff.md` (~40-50 lines)
- 5d. Update `CLAUDE.md`
- 5e. Update `README.md`: badges, tables, version history
- 5f. Update `plugin.json` version
- 5g. Subtree push to public repo
Files Modified
- `scanners/scan-orchestrator.mjs`
- `CLAUDE.md`
Files Created
- `scanners/lib/diff-engine.mjs`
- `commands/diff.md`
- `reports/baselines/` (directory)
Acceptance Criteria
- `--save-baseline` stores results, `--baseline` loads and diffs
- NEW findings flagged after adding a vulnerability
- RESOLVED findings flagged after removing one
- Fuzzy line matching handles ±3 line drift
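The matching rule can be sketched as follows. The finding shape (`scanner`, `file`, `line`, `pattern`) is hypothetical, and this sketch assumes `moved` means the same finding at a different line within the tolerance.

```javascript
const LINE_TOLERANCE = 3;

// Same scanner prefix, file path, and pattern type.
function sameIdentity(a, b) {
  return a.scanner === b.scanner && a.file === b.file && a.pattern === b.pattern;
}

function diffFindings(baseline, current) {
  const unmatched = [...baseline];
  const result = { new: [], resolved: [], unchanged: [], moved: [] };
  for (const finding of current) {
    // Prefer an exact line match, then fall back to a fuzzy match.
    let i = unmatched.findIndex((b) => sameIdentity(b, finding) && b.line === finding.line);
    if (i === -1) {
      i = unmatched.findIndex(
        (b) => sameIdentity(b, finding) && Math.abs(b.line - finding.line) <= LINE_TOLERANCE
      );
    }
    if (i === -1) {
      result.new.push(finding);
      continue;
    }
    const [previous] = unmatched.splice(i, 1);
    (previous.line === finding.line ? result.unchanged : result.moved).push(finding);
  }
  // Baseline findings that nothing matched are considered fixed.
  result.resolved = unmatched;
  return result;
}

const delta = diffFindings(
  [
    { scanner: "INJ", file: "a.md", line: 10, pattern: "base64" },
    { scanner: "INJ", file: "b.md", line: 5, pattern: "homoglyph" },
  ],
  [
    { scanner: "INJ", file: "a.md", line: 12, pattern: "base64" },
    { scanner: "TFA", file: "c.md", line: 1, pattern: "chain" },
  ]
);
console.log(delta);
```

Removing each matched baseline entry from the candidate pool prevents two current findings from pairing with the same baseline finding.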
Session 6: Continuous/Background Scanning (v2.9.1)
Goal: Automated periodic scanning with delta reporting.
Tasks
- 6a. Create `commands/watch.md` (~50-60 lines)
  - `/security watch [path] [--interval 6h]`
  - Uses /loop as execution engine
  - Runs scan-orchestrator with --baseline --save-baseline
  - Reports delta only
- 6b. Create `scanners/watch-cron.mjs` (~150-200 lines)
  - Standalone Node.js script for cron/launchd
  - Config: `reports/watch/config.json`
  - Output: `reports/watch/latest.json`
- 6c. Update `CLAUDE.md`
- 6d. Update `README.md`: badges, tables, version history
- 6e. Update `plugin.json` version
- 6f. Subtree push to public repo
Files Modified
CLAUDE.md
Files Created
- `commands/watch.md`
- `scanners/watch-cron.mjs`
- `reports/watch/` (directory)
Acceptance Criteria
- `/security watch .` creates baseline and shows "No changes"
- After modification: shows delta with NEW findings
- Cron wrapper runs standalone: `node scanners/watch-cron.mjs`
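An interval value such as `6h` has to be converted to a duration somewhere. The helper below is a hypothetical sketch; the supported unit suffixes (`s`, `m`, `h`, `d`) are an assumption, since the document only shows `6h`.

```javascript
// Milliseconds per supported unit suffix (assumed set of units).
const UNIT_MS = { s: 1000, m: 60_000, h: 3_600_000, d: 86_400_000 };

// Parse strings such as "6h" or "30m" into milliseconds; throws on anything else.
function parseInterval(text) {
  const match = /^(\d+)([smhd])$/.exec(text.trim());
  if (!match) throw new Error(`invalid interval: ${text}`);
  return Number(match[1]) * UNIT_MS[match[2]];
}

console.log(parseInterval("6h"));
```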
Session 7: Skill Signature Registry (v2.9.2)
Goal: Local database of known skill patterns and risk profiles.
Tasks
- 7a. Create `scanners/lib/skill-registry.mjs` (~300-350 lines)
  - Fingerprinting: SHA-256 of normalized SKILL.md content
  - `scanAndRegister(skillPath)` and `checkRegistry(fingerprint)`
  - Registry format: JSON with skill metadata + findings summary
- 7b. Create `knowledge/skill-registry.json` (seed data)
- 7c. Create `commands/registry.md` (~40-50 lines)
  - `/security registry`: stats
  - `/security registry scan <url>`: scan and register
  - `/security registry search <pattern>`: search
- 7d. Integrate with `commands/scan.md`: check registry before full scan
- 7e. Update `CLAUDE.md`
- 7f. Update `README.md`: badges, tables, version history
- 7g. Update `plugin.json` version
- 7h. Subtree push to public repo
Files Modified
- `commands/scan.md`
- `CLAUDE.md`
Files Created
- `scanners/lib/skill-registry.mjs`
- `knowledge/skill-registry.json`
- `commands/registry.md`
Acceptance Criteria
- Scan a skill → fingerprint added to registry
- Re-scan same skill → registry hit, instant result
- `/security registry search` returns matches
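Fingerprinting and the registry lookup can be sketched with Node's built-in `crypto` module. The normalization steps and the in-memory `Map` are assumptions for illustration; the document does not specify how SKILL.md content is normalized, and the real registry is a JSON file.

```javascript
import { createHash } from "node:crypto";

// Assumed normalization: LF line endings, trailing whitespace trimmed,
// leading and trailing blank lines dropped.
function normalizeSkill(content) {
  return content
    .replace(/\r\n?/g, "\n")
    .split("\n")
    .map((line) => line.trimEnd())
    .join("\n")
    .trim();
}

// SHA-256 hex digest of the normalized content.
function fingerprint(content) {
  return createHash("sha256").update(normalizeSkill(content), "utf8").digest("hex");
}

// In-memory stand-in for the JSON registry file.
const registry = new Map();

function register(content, summary) {
  const fp = fingerprint(content);
  registry.set(fp, { fingerprint: fp, ...summary });
  return fp;
}

function checkRegistry(fp) {
  return registry.get(fp) ?? null;
}

const fp = register("# My Skill\r\nDo things.  \n", { risk: "low", findings: 0 });
// The same skill with different line endings and trailing spaces hits the registry.
console.log(checkRegistry(fingerprint("# My Skill\nDo things.\n")));
```

Normalizing before hashing is what makes the re-scan criterion hold when a skill is fetched again with cosmetic whitespace differences.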
Session 8: Polish & Public Release (v3.0.0)
Goal: Quality pass, documentation, public release, announcement prep.
Tasks
- 8a. Full quality pass
  - 544/544 tests pass
  - Scan-orchestrator: 8/8 scanners OK (0 findings with ignore, ~190 suppressed)
  - All 14 commands verified (valid frontmatter)
  - All 8 hooks verified (parse without errors)
  - Scan-orchestrator: ~7.5s on plugin self-scan
- 8b. Documentation
  - README.md: v3 badge, mermaid architecture diagram, TFA in scanner table, updated stats, v3.0.0 version history
  - CHANGELOG.md: full version history v1.0→v3.0 in Keep a Changelog format
  - package.json + plugin.json bumped to v3.0.0
  - .llm-security-ignore updated with TFA suppressions
- 8c. Public repo sync
  - Subtree push to `git.fromaitochitta.com/open/claude-code-llm-security`
- 8d. Announcement prep
  - V3-ANNOUNCEMENT.md with feature comparison matrix (vs Snyk Agent Scan, Lasso Claude Hooks)
  - Key differentiators narrative (6 points)
  - Demo scenario with scan/diff/watch workflow
Acceptance Criteria
- `/security audit` on plugin itself scores A or B
- All commands documented in CLAUDE.md
- All hooks documented in CLAUDE.md
- README has complete v3 feature list
- Public repo updated and accessible
Technical Notes
Reusable Infrastructure (do not duplicate)
- `scanners/lib/injection-patterns.mjs`: all injection pattern matching
- `scanners/lib/output.mjs`: `finding()` and `scannerResult()` builders
- `scanners/lib/severity.mjs`: risk scoring, OWASP mapping
- `scanners/lib/file-discovery.mjs`: `discoverFiles()` and `readTextFile()`
- `scanners/lib/string-utils.mjs`: entropy, Levenshtein, base64 detection
- `scanners/content-extractor.mjs`: pre-extraction for remote repos
Constraints
- All code is Node.js (>=18), no external dependencies beyond Node stdlib
- Hooks are separate processes per invocation (no shared memory)
- Context budget: max 3 knowledge files per agent invocation
- Intel Mac target (no Apple Silicon-specific features)
- Plugin convention: commands ~30-60 lines, agents use registered subagent_type
- CLAUDE.md updated in same commit as the change it documents
- README.md + plugin.json + subtree push are MANDATORY per session — not optional, not deferred to S8. Every version bump must update: plugin.json version, README badges/tables/version history, then subtree push. Session is NOT done until public repo is current.
Scanner Integration Pattern
// In scan-orchestrator.mjs, TFA scanner receives prior results:
const tfaResults = await runTfaScanner(target, files, priorResults);
// All other scanners: (target, files) signature unchanged
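One way the orchestrator could honor a `requiresPriorResults` flag is sketched below. The scanner registry, the `INJ` prefix, and the result shape are hypothetical; only the flag name and the two call signatures come from this document.

```javascript
// Hypothetical scanner registry; only the TFA entry sets requiresPriorResults.
const scanners = [
  {
    prefix: "TFA",
    requiresPriorResults: true,
    run: async (target, files, prior) => ({ prefix: "TFA", findings: [], priorCount: prior.length }),
  },
  { prefix: "INJ", run: async (target, files) => ({ prefix: "INJ", findings: [] }) },
];

async function runAll(target, files) {
  // Scanners that need prior results are moved to the end, whatever their
  // position in the registry.
  const ordered = [
    ...scanners.filter((s) => !s.requiresPriorResults),
    ...scanners.filter((s) => s.requiresPriorResults),
  ];
  const results = [];
  for (const scanner of ordered) {
    const result = scanner.requiresPriorResults
      ? await scanner.run(target, files, [...results]) // pass a copy of the accumulated results
      : await scanner.run(target, files); // signature unchanged
    results.push(result);
  }
  return results;
}

const pending = runAll(".", []);
pending.then((results) => console.log(results.map((r) => r.prefix).join(", ")));
```

Running the scanners sequentially keeps the accumulated list deterministic; scanners without the flag could also run in parallel before the correlator starts.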
Hook State Pattern
// Session guard uses temp file for cross-invocation state:
const stateFile = `/tmp/llm-security-session-${process.ppid}.jsonl`;
// Append on each invocation, read sliding window for analysis
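A fuller sketch of the state-file handling follows, assuming one JSON object per line and modification time as the age signal. The helper names and the `dir` parameter are hypothetical; the demo writes to a fresh temporary directory rather than `/tmp` directly.

```javascript
import {
  appendFileSync, existsSync, mkdtempSync, readFileSync, readdirSync, statSync, unlinkSync,
} from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

const PREFIX = "llm-security-session-";
const MAX_AGE_MS = 24 * 60 * 60 * 1000;

// Keyed by parent process ID so every hook invocation in a session shares one file.
function stateFilePath(dir, ppid = process.ppid) {
  return join(dir, `${PREFIX}${ppid}.jsonl`);
}

// One JSON object per line; appending needs no read-modify-write cycle.
function appendCall(file, call) {
  appendFileSync(file, JSON.stringify(call) + "\n");
}

// Return the last `size` recorded calls, oldest first.
function readWindow(file, size = 20) {
  if (!existsSync(file)) return [];
  return readFileSync(file, "utf8")
    .split("\n")
    .filter(Boolean)
    .slice(-size)
    .map((line) => JSON.parse(line));
}

// Delete state files not modified within the last 24 hours.
function cleanupStale(dir, now = Date.now()) {
  for (const name of readdirSync(dir)) {
    if (!name.startsWith(PREFIX)) continue;
    const file = join(dir, name);
    if (now - statSync(file).mtimeMs > MAX_AGE_MS) unlinkSync(file);
  }
}

// Demo in a throwaway directory.
const dir = mkdtempSync(join(tmpdir(), "guard-demo-"));
const file = stateFilePath(dir);
for (let seq = 0; seq < 25; seq++) appendCall(file, { tool: "Read", seq });
console.log(readWindow(file).length);
```

Since hooks run as separate processes with no shared memory, the file is the only place the window can live between invocations.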