Full port of llm-security plugin for internal use on Windows with GitHub Copilot CLI. Protocol translation layer (copilot-hook-runner.mjs) normalizes Copilot camelCase I/O to Claude Code snake_case format, so all original hook scripts run unmodified.

- 8 hooks with protocol translation (stdin/stdout/exit code)
- 18 SKILL.md skills (Agent Skills Open Standard)
- 6 .agent.md agent definitions
- 20 scanners + 14 scanner lib modules (unchanged)
- 14 knowledge files (unchanged)
- 39 test files including copilot-port-verify.mjs (17 tests)
- Windows-ready: node:path, os.tmpdir(), process.execPath, no bash

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Security Audit Report
Header
| Field | Value |
|---|---|
| Project | [Name of the project or repository that was audited] |
| Repository | [e.g. github.com/org/repo] |
| Audit date | [ISO 8601 — e.g. 2026-02-19] |
| Auditor | llm-security v[X.X] (automated) |
| Baseline | Claude Code Security Baseline v1.0 + OWASP LLM Top 10 (2025) |
| Scope | [Brief description — e.g. "Full project: source, skills, hooks, MCP configs, Docker, deployment"] |
Executive Summary
Overall Grade: [A / B / C / D / F] ([X]%)
Security Posture [==========] X.0 / 9.0
PASS ||| [n] categories
PARTIAL |||||| [n] categories
FAIL [n] categories
| Severity | Count |
|---|---|
| Critical | [n] |
| High | [n] |
| Medium | [n] |
| Low | [n] |
| Total | [n] |
Summary: [3–5 sentences covering the overall security posture: what the project does well, what the primary risks are, and the most urgent action required.]
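The grade, percentage, and posture bar above can be derived from the nine category statuses. The following is a minimal sketch, not the plugin's actual scoring logic: the point weights, the grade cutoffs, and the exclusion of N/A categories are all assumptions that this template does not specify, and `posture_summary` is a hypothetical helper name.

```python
from collections import Counter

# Assumed weights: the template does not say how a status converts to points.
STATUS_POINTS = {"PASS": 1.0, "PARTIAL": 0.5, "FAIL": 0.0}

# Assumed grade cutoffs in percent; substitute the thresholds your baseline defines.
GRADE_CUTOFFS = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]

BAR_WIDTH = 10  # matches the ten-character bar in the template


def posture_summary(statuses):
    """Return (score, maximum, percent, grade, bar) for a list of category statuses."""
    scored = [s for s in statuses if s != "N/A"]  # assumption: N/A is left out of the total
    maximum = float(len(scored))
    score = sum(STATUS_POINTS[s] for s in scored)
    ratio = score / maximum if maximum else 0.0
    percent = round(100 * ratio)
    grade = next((g for cutoff, g in GRADE_CUTOFFS if percent >= cutoff), "F")
    filled = round(BAR_WIDTH * ratio)
    bar = "[" + "=" * filled + " " * (BAR_WIDTH - filled) + "]"
    return score, maximum, percent, grade, bar


# Hypothetical statuses for the nine categories, in order.
statuses = ["PASS", "PASS", "PARTIAL", "FAIL", "PASS", "PARTIAL", "PASS", "PASS", "FAIL"]
score, maximum, percent, grade, bar = posture_summary(statuses)
counts = Counter(statuses)

print(f"Overall Grade: {grade} ({percent}%)")
print(f"Security Posture {bar} {score:.1f} / {maximum:.1f}")
for status in ("PASS", "PARTIAL", "FAIL"):
    print(f"{status:<8}{'|' * counts[status]} {counts[status]} categories")
```

If the baseline defines its own weights or grade boundaries, only the two constants at the top need to change.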
Category Assessment
Category 1 — Deny-First Configuration
| Status | [PASS / PARTIAL / FAIL / N/A] |
|---|---|
Evidence:
- [Bullet per observation — what was found, with file paths and line references where relevant]
- [If PASS: confirm deny-first posture is correctly configured]
- [If PARTIAL/FAIL: specify exactly what is missing or misconfigured]
Recommendations:
- [Specific, actionable recommendation — omit if PASS]
Category 2 — Secrets Protection
| Status | [PASS / PARTIAL / FAIL / N/A] |
|---|---|
Evidence:
- [Bullet per observation]
Recommendations:
- [Specific, actionable recommendation — omit if PASS]
Category 3 — Path Guarding
| Status | [PASS / PARTIAL / FAIL / N/A] |
|---|---|
Evidence:
- [Bullet per observation]
Recommendations:
- [Specific, actionable recommendation — omit if PASS]
Category 4 — MCP Server Trust
| Status | [PASS / PARTIAL / FAIL / N/A] |
|---|---|
Evidence:
- [Bullet per MCP server found — source, auth status, scope assessment]
- [Include trust verdict per server: Trusted / Suspect / Unknown]
Recommendations:
- [Specific, actionable recommendation — omit if PASS]
Category 5 — Destructive Command Blocking
| Status | [PASS / PARTIAL / FAIL / N/A] |
|---|---|
Evidence:
- [Bullet per observation]
Recommendations:
- [Specific, actionable recommendation — omit if PASS]
Category 6 — Sandbox Configuration
| Status | [PASS / PARTIAL / FAIL / N/A] |
|---|---|
Evidence:
- [Bullet per observation]
Recommendations:
- [Specific, actionable recommendation — omit if PASS]
Category 7 — Human Review Requirements
| Status | [PASS / PARTIAL / FAIL / N/A] |
|---|---|
Evidence:
- [Bullet per observation]
Recommendations:
- [Specific, actionable recommendation — omit if PASS]
Category 8 — Skill and Plugin Sources
| Status | [PASS / PARTIAL / FAIL / N/A] |
|---|---|
Evidence:
- [Bullet per observation — first-party vs third-party, lock file status, marketplace trust]
Recommendations:
- [Specific, actionable recommendation — omit if PASS]
Category 9 — Session Isolation
| Status | [PASS / PARTIAL / FAIL / N/A] |
|---|---|
Evidence:
- [Bullet per observation]
Recommendations:
- [Specific, actionable recommendation — omit if PASS]
Scan Findings
Findings grouped by severity, sorted Critical → High → Medium → Low.
Each finding ID is formatted SCN-[NNN] (e.g. SCN-001).
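The ordering and ID format described above can be sketched as follows. This is an illustration under stated assumptions: the `Finding` class, the `number_findings` helper, and the sample findings are hypothetical, and numbering IDs sequentially in severity order is inferred from the template showing SCN-001 under Critical rather than stated anywhere.

```python
from collections import Counter
from dataclasses import dataclass

SEVERITY_ORDER = ["Critical", "High", "Medium", "Low"]


@dataclass
class Finding:
    title: str
    severity: str  # one of SEVERITY_ORDER
    location: str  # path/to/file:line


def number_findings(findings):
    """Sort findings from Critical to Low and pair each with an SCN-NNN identifier."""
    rank = {name: position for position, name in enumerate(SEVERITY_ORDER)}
    # sorted() is stable, so findings of equal severity keep their input order.
    ordered = sorted(findings, key=lambda finding: rank[finding.severity])
    return [(f"SCN-{n:03d}", finding) for n, finding in enumerate(ordered, start=1)]


# Hypothetical findings, listed in discovery order.
raw = [
    Finding("Verbose error output", "Low", "src/app.py:88"),
    Finding("Hardcoded API token", "Critical", "config/settings.json:12"),
    Finding("Unpinned plugin source", "High", "plugins/lock.json:3"),
    Finding("Wildcard tool permission", "Critical", ".claude/settings.json:7"),
]

numbered = number_findings(raw)
counts = Counter(finding.severity for _, finding in numbered)

for finding_id, finding in numbered:
    print(f"{finding_id} | {finding.severity:<8} | {finding.title} | {finding.location}")
for severity in SEVERITY_ORDER:
    print(f"{severity}: {counts[severity]}")
print(f"Total: {len(numbered)}")
```

The same `counts` mapping can fill the severity table in the Executive Summary, which keeps the two sections consistent.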
Critical Findings ([n])
Omit this section if no Critical findings.
SCN-001 — [Short title]
| Field | Value |
|---|---|
| File | [path/to/file:line] |
| OWASP | [e.g. LLM06:2025 Excessive Agency] |
[Full description paragraph: what was found, why it is a risk, what an attacker could do with it.]
[Exact code or config excerpt that triggered the finding — redact actual secret values]
Remediation: [Concrete, actionable fix. Include example code or config snippet where helpful.]
SCN-002 — [Short title]
| Field | Value |
|---|---|
| File | [path/to/file:line] |
| OWASP | [OWASP reference] |
[Description paragraph.]
[Evidence excerpt]
Remediation: [Fix.]
High Findings ([n])
Omit this section if no High findings.
SCN-[NNN] — [Short title]
| Field | Value |
|---|---|
| File | [path/to/file:line] |
| OWASP | [OWASP reference] |
[Description paragraph.]
[Evidence excerpt]
Remediation: [Fix.]
Medium Findings ([n])
Omit this section if no Medium findings.
SCN-[NNN] — [Short title]
| Field | Value |
|---|---|
| File | [path/to/file:line] |
| OWASP | [OWASP reference] |
[Description paragraph.]
Remediation: [Fix.]
Low Findings ([n])
Omit this section if no Low findings.
SCN-[NNN] — [Short title]
| Field | Value |
|---|---|
| File | [path/to/file:line] |
| OWASP | [OWASP reference] |
[Description paragraph.]
Remediation: [Fix.]
Risk Matrix
                        LIKELIHOOD
              Low         Medium        High
         +------------+------------+------------+
  High   |            |            |            |
         |            |            |            |
IMPACT   +------------+------------+------------+
  Med    |            |            |            |
         |            |            |            |
         +------------+------------+------------+
  Low    |            |            |            |
         |            |            |            |
         +------------+------------+------------+
Place each Cat [N] label in the cell matching its assessed likelihood and impact.
Categories with Critical findings belong in High/High.
Categories with PASS status typically appear in Low/Low.
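The placement rules above can be expressed as a small helper plus a text renderer. This is a sketch only: `suggested_cell` and `render_matrix` are hypothetical names, the cell width is an arbitrary choice, and categories that match neither rule are deliberately left for the auditor to place.

```python
LEVELS = ["Low", "Medium", "High"]
CELL_WIDTH = 12
LABEL_WIDTH = 8


def suggested_cell(status, has_critical):
    """Apply the two placement rules from the template; return None when neither applies."""
    if has_critical:
        return ("High", "High")  # (likelihood, impact)
    if status == "PASS":
        return ("Low", "Low")
    return None  # needs a manual likelihood/impact assessment


def render_matrix(placements):
    """placements maps a label such as 'Cat 1' to a (likelihood, impact) pair."""
    cells = {(impact, likelihood): [] for impact in LEVELS for likelihood in LEVELS}
    for label, (likelihood, impact) in placements.items():
        cells[(impact, likelihood)].append(label)

    border = " " * LABEL_WIDTH + "+" + ("-" * CELL_WIDTH + "+") * len(LEVELS)
    column_width = CELL_WIDTH + 1
    header = " " * (LABEL_WIDTH + 1) + "".join(f"{name:^{column_width}}" for name in LEVELS)
    lines = [header, border]
    for impact in reversed(LEVELS):  # High impact is the top row
        row = f"{impact:<{LABEL_WIDTH}}|"
        for likelihood in LEVELS:
            text = ", ".join(cells[(impact, likelihood)])
            row += f"{text:^{CELL_WIDTH}}|"  # long cell text will overflow the column
        lines.append(row)
        lines.append(border)
    return "\n".join(lines)


# Hypothetical assessment: Cat 2 has a Critical finding, Cat 1 passed.
placements = {
    "Cat 2": suggested_cell("FAIL", has_critical=True),
    "Cat 1": suggested_cell("PASS", has_critical=False),
    "Cat 4": ("Medium", "High"),  # placed manually by the auditor
}
print(render_matrix(placements))
```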
Prioritized Action Plan
Sorted by risk. IMMEDIATE items must be resolved before the next deployment.
| # | Priority | Action | Finding | Effort | Risk if deferred |
|---|---|---|---|---|---|
| 1 | IMMEDIATE | [Specific action] | SCN-[NNN] | [Low / Med / High] | [Risk description] |
| 2 | IMMEDIATE | [Specific action] | SCN-[NNN] | [Low / Med / High] | [Risk description] |
| 3 | HIGH | [Specific action] | SCN-[NNN] | [Low / Med / High] | [Risk description] |
| 4 | HIGH | [Specific action] | Posture | [Low / Med / High] | [Risk description] |
| 5 | MEDIUM | [Specific action] | SCN-[NNN] | [Low / Med / High] | [Risk description] |
| 6 | LOW | [Specific action] | Posture | [Low / Med / High] | [Risk description] |
Positive Findings
The following security controls are in place and working correctly:
- [Control name] — [Brief description of what is working and where it was confirmed]
- [Control name] — [Description]
- [Control name] — [Description]
(Remove any bullet that does not apply. Add as many as warranted by the evidence.)
Methodology
This audit was performed by automated assessment agents:
- posture-assessor-agent: Evaluated 9 security categories against the Claude Code Security Baseline v1.0, collecting file-level evidence and assigning PASS/PARTIAL/FAIL status per category.
- skill-scanner-agent: Scanned all skills, commands, agents, hooks, source code, and configs for 7 threat categories derived from ToxicSkills/ClawHavoc research, OWASP LLM Top 10 (2025), and OWASP Agentic AI Top 10.
[Add or remove agents as applicable. Include mcp-scanner-agent if MCP servers were analyzed.]
All agents operated in read-only mode. No files were modified during this assessment.
Limitations:
- Static analysis only — no runtime behavior observed
- Source code spot-checked, not exhaustively reviewed
- [Add project-specific limitations, e.g. "Extension dependencies not audited for known CVEs"]
- Third-party MCP servers and marketplace content not analyzed beyond declared configs
Report generated [ISO 8601 timestamp] by llm-security v[X.X]
Baseline: Claude Code Security Baseline v1.0
OWASP references: LLM Top 10 2025, Agentic AI Top 10
Next recommended audit: [e.g. Before next major release or within 30 days]