Kjell Tore Guttormsen db80854830 feat(llm-security): playground v7.6.2-dev — render-report CLI + wire 4 skills (scan, audit, posture, deep-scan) [skip-docs]

- New scripts/render-report.mjs CLI: stdin/file/stdout modes, ESM import
  from ./lib/report-renderers.mjs, kebab→camel renderer-name lookup so
  any of the 18 PARSERS works
- Standalone HTML wrap: inlines 6 DS stylesheets (tokens, base, components,
  tier2, tier3, tier3-supplement) + local .report-table CSS. Skips fonts.css
  → system-ui fallback via tokens.css (~137 KB self-contained vs ~1 MB
  with woff2 bundled)
- 4 skill files wired: commands/{scan,audit,posture,deep-scan}.md — new
  step instructs Claude to Write the markdown report to a temp file,
  invoke the CLI, and print a markdown-formatted file:// link
- Absolute file:// paths in stdout for Ghostty cmd-click compatibility
- Default output: reports/<command>-<YYYYMMDD-HHmmss>.html relative to CWD
- Smoke-tested: stdin→stdout, file→file roundtrip, all 4 commands produce
  valid HTML with DS-aligned page-shell (page__title, verdict-pill-lg,
  risk-meter, key-stats, findings__item, recommendation-card)
- Tests 1820/1820 green (same baseline; pre-compact-scan perf-flake from
  NEXT-SESSION-PROMPT did not fire on retry)
- Playground untouched (2 scripts, 0 parse failures), report-renderers.mjs
  untouched (74 exports, 18 PARSERS, 18 RENDERERS)
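
The kebab→camel renderer-name lookup mentioned above could look roughly like the sketch below. `kebabToCamel` and `findRenderer` are hypothetical names, and the `renderers` argument stands in for whatever map report-renderers.mjs actually exports; this is not the code in scripts/render-report.mjs.

```javascript
// Hypothetical helper: convert a kebab-case command name to camelCase,
// e.g. "deep-scan" becomes "deepScan".
function kebabToCamel(name) {
  return name.replace(/-([a-z0-9])/g, (_, ch) => ch.toUpperCase());
}

// Hypothetical lookup: `renderers` is an assumed plain object keyed by
// camelCase renderer names.
function findRenderer(renderers, commandName) {
  const key = kebabToCamel(commandName);
  if (!Object.hasOwn(renderers, key)) {
    throw new Error(`Unknown renderer: ${commandName}`);
  }
  return renderers[key];
}
```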

Session 4 of 5. v7.7.0 release + 9 remaining skill wirings = session 5.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

2026-05-18 12:56:03 +02:00

2.7 KiB


| name | description | allowed-tools | model |
| --- | --- | --- | --- |
| security:audit | Full project security audit with OWASP LLM Top 10 assessment, scoring, and remediation plan | Read, Glob, Grep, Bash, Agent | sonnet |

/security audit

Full security audit — 10 categories, OWASP LLM Top 10 aligned, A-F grade.

Step 1: Run Posture Scanner

Run the deterministic posture scanner first for instant category results:

node <plugin-root>/scanners/posture-scanner.mjs [cwd]

Parse JSON output. Record: grade, risk score, all category statuses, all findings.
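
A minimal sketch of the parse-and-record step, assuming the scanner prints a single JSON object. The field names `grade`, `riskScore`, `categories`, and `findings` are assumptions for illustration; check them against the real scanner output.

```javascript
// Assumed JSON shape; the real posture scanner's field names may differ.
function summarizePosture(jsonText) {
  let data;
  try {
    data = JSON.parse(jsonText);
  } catch (err) {
    throw new Error(`Posture scanner did not emit valid JSON: ${err.message}`);
  }
  return {
    grade: data.grade ?? null,
    riskScore: data.riskScore ?? null,
    // Fall back to empty lists so later merge steps never see undefined.
    categories: Array.isArray(data.categories) ? data.categories : [],
    findings: Array.isArray(data.findings) ? data.findings : [],
  };
}
```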

Step 2: Gather Context

- Read CLAUDE.md for the project name and type.
- Glob for: commands/*.md, agents/*.md, .mcp.json, **/.mcp.json, .claude-plugin/plugin.json
- Determine whether the project has skills/commands and whether it has MCP servers.
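
The two yes/no decisions above can be derived from the glob matches. The sketch below assumes the matches arrive as paths relative to [cwd]; `detectScanTargets` is a hypothetical name, not part of the plugin.

```javascript
// Decide which later steps apply, given paths matched by the globs above.
function detectScanTargets(paths) {
  // Normalize Windows separators so the patterns below stay simple.
  const normalized = paths.map((p) => p.replace(/\\/g, "/"));
  // commands/*.md and agents/*.md: top-level directories only.
  const hasSkills = normalized.some((p) => /^(commands|agents)\/[^/]+\.md$/.test(p));
  // .mcp.json at the root or anywhere below it (**/.mcp.json).
  const hasMcp = normalized.some((p) => /(^|\/)\.mcp\.json$/.test(p));
  return { hasSkills, hasMcp };
}
```

Step 3 would then run only when `hasSkills` is true, and Step 4 only when `hasMcp` is true.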

Step 3: Skill Scan (if commands/agents found)

Spawn subagent_type: "llm-security:skill-scanner-agent", model: "sonnet":

Scan all commands/ and agents/ at [cwd].
Read: <plugin-root>/knowledge/skill-threat-patterns.md
Return findings: file, issue, severity, OWASP ref.

Step 4: MCP Scan (if MCP servers found)

After skill scan, spawn subagent_type: "llm-security:mcp-scanner-agent", model: "sonnet":

Audit MCP configs at [cwd].
Read: <plugin-root>/knowledge/mcp-threat-patterns.md
Return trust table and findings with severity.

Step 5: Generate Report

Merge the posture scanner JSON with the agent findings. Use the posture scanner's grade as the baseline. Recalculate risk_score = riskScore(counts), where counts includes the agent findings (severity-dominated v2 model; see scanners/lib/severity.mjs).
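
One way to build the merged `counts` is to tally severities across the scanner and agent findings. This is only a sketch: `countBySeverity` is a hypothetical helper, the `severity` field and its labels are assumptions, and the shape `riskScore` really expects is defined in scanners/lib/severity.mjs, which is not reproduced here.

```javascript
// Tally findings per severity across any number of finding lists.
// Assumes each finding carries a `severity` string; labels are lowercased.
function countBySeverity(...findingLists) {
  const counts = {};
  for (const finding of findingLists.flat()) {
    const key = String(finding.severity ?? "unknown").toLowerCase();
    counts[key] = (counts[key] ?? 0) + 1;
  }
  return counts;
}
```

The result would be passed on as `riskScore(counts)` once its keys are confirmed to match what severity.mjs expects.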

Output: Risk Dashboard, Executive Summary, 10 Category Sections (use scanner evidence + agent narrative), Summary Table, Action Items (IMMEDIATE → HIGH → MEDIUM).

Close with top 2-3 action items. If grade C or lower: suggest /security threat-model.
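
The ordering and the grade threshold above can be expressed as two small helpers. Both names and the `priority` field are hypothetical, and treating "C or lower" as C, D, E, or F is an assumption about the A-F scale.

```javascript
// Priority order taken from the Output line: IMMEDIATE, then HIGH, then MEDIUM.
const PRIORITY_ORDER = ["IMMEDIATE", "HIGH", "MEDIUM"];

// Return a new array sorted by priority; unknown priorities sort last.
function sortActionItems(items) {
  const rank = (priority) => {
    const i = PRIORITY_ORDER.indexOf(priority);
    return i === -1 ? PRIORITY_ORDER.length : i;
  };
  return [...items].sort((a, b) => rank(a.priority) - rank(b.priority));
}

// True when the grade is C or lower, i.e. /security threat-model should be suggested.
function shouldSuggestThreatModel(grade) {
  const letter = String(grade).trim().toUpperCase().charAt(0);
  return ["C", "D", "E", "F"].includes(letter);
}
```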

Step 6: HTML Report

After producing the markdown audit report above:

1. Compute a temp markdown path:

   node -p "require('path').join(require('os').tmpdir(), 'sec-audit-' + Date.now() + '.md')"

2. Use the Write tool to save the entire markdown report you just produced (Risk Dashboard + Executive Summary + Category Sections + Summary Table + Action Items) to that temp path, verbatim.
3. Run the renderer:
   ```
   node <plugin-root>/scripts/render-report.mjs audit --in "<temp-md-path>"
   ```
   The CLI writes reports/audit-<YYYYMMDD-HHmmss>.html relative to CWD and prints file:///abs/path.html on stdout.
4. Append to your response (markdown link, no bare URL):

   HTML report: [Open in browser](file:///abs/path.html)

If the CLI exits non-zero, mention the error but do not block — the markdown audit above is the primary deliverable.
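
The default output name and the non-blocking error handling can be sketched as below. The helper names, the use of local time for the timestamp, and the exact link wording are assumptions for illustration; the real CLI in scripts/render-report.mjs may differ.

```javascript
// Format a Date as YYYYMMDD-HHmmss (local time is an assumption).
function formatStamp(date) {
  const pad = (n) => String(n).padStart(2, "0");
  const day = `${date.getFullYear()}${pad(date.getMonth() + 1)}${pad(date.getDate())}`;
  const time = `${pad(date.getHours())}${pad(date.getMinutes())}${pad(date.getSeconds())}`;
  return `${day}-${time}`;
}

// Default report path relative to CWD: reports/<command>-<YYYYMMDD-HHmmss>.html
function defaultReportPath(command, date) {
  return `reports/${command}-${formatStamp(date)}.html`;
}

// Build the line appended to the response from a finished CLI run.
// A failed run yields a short note rather than an exception, so the
// markdown audit remains the primary deliverable.
function reportLine({ exitCode, stdout, stderr }) {
  const url = stdout.trim();
  if (exitCode !== 0 || !url.startsWith("file://")) {
    const reason = stderr.trim() || `exit code ${exitCode}`;
    return `HTML report unavailable: ${reason}`;
  }
  return `HTML report: [Open in browser](${url})`;
}
```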
