feat: initial open marketplace with llm-security, config-audit, ultraplan-local

2026-04-06 18:47:49 +02:00 · commit f93d6abdae
380 changed files with 65935 additions and 0 deletions
--- a/plugins/llm-security/reports/awesome-copilot-test-skills-deepscan.md
+++ b/plugins/llm-security/reports/awesome-copilot-test-skills-deepscan.md
@@ -0,0 +1,151 @@
+# Deep Security Scan: awesome-copilot Test Skills
+
+**Target:** github.com/github/awesome-copilot (5 test-related skills)
+**Scan date:** 2026-04-05
+**Scanner:** llm-security v4.5.1 — deep-scan (10 deterministic) + skill-scanner-agent (LLM)
+**Requested by:** KTG
+
+---
+
+## Skills Assessed
+
+| # | Skill | Installs/wk | Files | Purpose |
+|---|-------|-------------|-------|---------|
+| 1 | playwright-generate-test | 9.2K | 1 (SKILL.md) | Playwright test generation via MCP |
+| 2 | javascript-typescript-jest | 8.8K | 1 (SKILL.md) | Jest best practices reference |
+| 3 | webapp-testing | 8.3K | 2 (SKILL.md + test-helper.js) | Browser testing toolkit |
+| 4 | java-junit | 8.3K | 1 (SKILL.md) | JUnit 5 best practices reference |
+| 5 | pytest-coverage | 8.0K | 1 (SKILL.md) | pytest coverage workflow |
+
+---
+
+## Overall Verdict: ALLOW (Risk Score 3/100)
+
+All 5 skills are safe to install and use. Zero critical, high, or medium findings. Three low-severity hygiene observations.
+
+---
+
+## Deterministic Deep-Scan Results (10 Scanners)
+
+| Scanner | playwright-generate-test | jest | webapp-testing | java-junit | pytest-coverage |
+|---------|:---:|:---:|:---:|:---:|:---:|
+| Unicode (confusables, BiDi) | OK | OK | OK | OK | OK |
+| Entropy (secrets, tokens) | OK | OK | OK | OK | OK |
+| Permission (chmod, setuid) | skip | skip | skip | skip | skip |
+| Dependency audit | skip | skip | skip | skip | skip |
+| Taint (untrusted input flow) | OK | OK | OK | OK | OK |
+| Git forensics | OK | OK | OK | OK | OK |
+| Network (URLs, endpoints) | OK | OK | OK | OK | OK |
+| Memory poisoning | OK | OK | OK | OK | OK |
+| Supply-chain recheck | skip | skip | skip | skip | skip |
+| Toxic-flow correlator | skip | skip | skip | skip | skip |
+
+**Result:** 0 findings across all 5 skills. The permission, dependency-audit, supply-chain-recheck, and toxic-flow scanners were skipped as not applicable: the skills are Markdown files, plus one `test-helper.js` helper in webapp-testing.
+
+---
+
+## LLM Skill Security Analysis (7 Threat Categories)
+
+| Category | playwright-generate-test | jest | webapp-testing | java-junit | pytest-coverage |
+|----------|:---:|:---:|:---:|:---:|:---:|
+| Prompt Injection | Clean | Clean | Clean | Clean | Clean |
+| Data Exfiltration | Clean | Clean | Clean | Clean | Clean |
+| Privilege Escalation | 1 Low | Clean | 1 Low | Clean | Clean |
+| Scope Creep | Clean | Clean | Clean | Clean | Clean |
+| Hidden Instructions | Clean | Clean | Clean | Clean | Clean |
+| Toolchain Manipulation | Clean | Clean | Clean | Clean | 1 Low |
+| Persistence | Clean | Clean | Clean | Clean | Clean |
+
+### Finding Details
+
+**SCN-001 — Execution scope undeclared** (Low)
+- **Skill:** playwright-generate-test
+- **Issue:** Instructs "Execute the test file and iterate until the test passes" without declaring `allowed-tools` in frontmatter
+- **OWASP:** LLM06:2025 Excessive Agency, AST03 Scope Declaration
+- **Fix:** Add `allowed-tools` frontmatter limiting execution to `npx playwright test`
+
+**SCN-002 — Unbounded Node.js fallback** (Low)
+- **Skill:** webapp-testing
+- **Issue:** Falls back to "local Node.js environment" if MCP unavailable — no scope limitation on what the fallback may execute
+- **OWASP:** LLM06:2025 Excessive Agency, AST04 Capability Expansion
+- **Fix:** Constrain fallback to localhost targets only, require user confirmation for remote
+
+**SCN-003 — Implicit dependency assumption** (Low)
+- **Skill:** pytest-coverage
+- **Issue:** Assumes `pytest-cov` is installed without verification. Agent may silently install it
+- **OWASP:** LLM03:2025 Supply Chain
+- **Fix:** Add prerequisite check before running coverage commands
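The prerequisite check suggested for SCN-003 could look roughly like the sketch below. It is not taken from the skill or from llm-security; the helper names `missing_prerequisites` and `coverage_command` are hypothetical, and it assumes the check runs in the same Python environment as the tests. The idea is to fail loudly when `pytest-cov` is absent instead of letting an agent install it silently.

```python
import importlib.util
import sys


def missing_prerequisites(modules):
    """Return the names in `modules` that cannot be imported in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


def coverage_command(target="."):
    """Build the pytest coverage command, or raise if a prerequisite is absent.

    Raising (rather than installing) leaves the decision to add a
    dependency with the user, not the agent.
    """
    # pytest-cov is imported as `pytest_cov`.
    missing = missing_prerequisites(["pytest", "pytest_cov"])
    if missing:
        raise RuntimeError(
            "Missing prerequisites: " + ", ".join(missing)
            + ". Add them to the project's dependencies before running coverage."
        )
    return [sys.executable, "-m", "pytest", f"--cov={target}", "--cov-report=term-missing"]
```

A caller would pass the returned list to `subprocess.run` only after the check succeeds, so a missing plugin surfaces as an explicit error.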
+
+---
+
+## Risk Classification
+
+```
+Skill                          Score  Verdict  Risk Band
+───────────────────────────────────────────────────────
+javascript-typescript-jest       0    ALLOW    None
+java-junit                       0    ALLOW    None
+playwright-generate-test         4    ALLOW    Low
+webapp-testing                   4    ALLOW    Low
+pytest-coverage                  4    ALLOW    Low
+───────────────────────────────────────────────────────
+AGGREGATE                        3    ALLOW    Low (0-20)
+```
+
+---
+
+## Key Observations
+
+1. **No injection attempts found.** Zero instances of rule override language, identity redefinition, spoofed system headers, or context normalization patterns across all 6 files. This is notably clean — ToxicSkills research found 36.82% of community skills have at least one issue.
+
+2. **No exfiltration infrastructure.** None of the skills access credential paths, environment variables, sensitive filesystem locations, or external network endpoints.
+
+3. **No secrets in any file.** All 6 files pass entropy and secrets-pattern checks.
+
+4. **Two pure-reference skills (jest, junit) are exemplary.** They demonstrate the correct pattern for knowledge-transfer skills: no execution, no tool access, no network references. These cannot be weaponized.
+
+5. **Source legitimacy is consistent.** All from the official `github/awesome-copilot` repository (28.5K stars), maintained by GitHub.
+
+---
+
+## OWASP Coverage Matrix
+
+| Framework | Category | Checked | Findings |
+|-----------|----------|:---:|---|
+| LLM Top 10 | LLM01 Prompt Injection | Yes | None |
+| LLM Top 10 | LLM02 Sensitive Info Disclosure | Yes | None |
+| LLM Top 10 | LLM03 Supply Chain | Yes | SCN-003 (Low) |
+| LLM Top 10 | LLM06 Excessive Agency | Yes | SCN-001, SCN-002 (Low) |
+| Agentic AI | ASI01 Prompt Injection | Yes | None |
+| Agentic AI | ASI02 Exfiltration | Yes | None |
+| Agentic AI | ASI03 Privilege Escalation | Yes | None |
+| Agentic AI | ASI04 Toolchain Manipulation | Yes | None |
+| Agentic AI | ASI10 Persistence | Yes | None |
+| Skills Top 10 | AST03 Scope Declaration | Yes | SCN-001, SCN-002 (Low) |
+| Skills Top 10 | AST04 Capability Expansion | Yes | SCN-002 (Low) |
+
+---
+
+## Recommendations for Test Leads
+
+These 5 skills are safe for test teams to adopt. Some recommendations:
+
+| Priority | Recommendation |
+|----------|----------------|
+| **Use directly** | `javascript-typescript-jest` and `java-junit`: pure reference documents with no risk |
+| **Use with awareness** | `playwright-generate-test` and `webapp-testing`: need execution permissions, but are correctly scoped |
+| **Use with awareness** | `pytest-coverage`: verify that `pytest-cov` is in the project's dependencies before use |
+| **General** | All skills should be combined with the project's own security hooks to catch unexpected behavior |
+
+---
+
+## Methodology
+
+- **Phase 1:** Deterministic deep-scan — 10 Node.js scanners (unicode, entropy, permission, dep-audit, taint, git-forensics, network, memory-poisoning, supply-chain-recheck, toxic-flow)
+- **Phase 2:** LLM-based skill analysis — 7 threat categories (prompt injection, data exfiltration, privilege escalation, scope creep, hidden instructions, toolchain manipulation, persistence)
+- **Frameworks:** OWASP LLM Top 10 (2025), OWASP Agentic AI Top 10 (ASI), OWASP Skills Top 10 (AST)
+- **Models:** scan-orchestrator.mjs (deterministic), skill-scanner-agent (claude-sonnet-4-6)
+
+---
+
+*Generated by llm-security v4.5.1*
--- a/plugins/llm-security/reports/baselines/.gitkeep
+++ b/plugins/llm-security/reports/baselines/.gitkeep
--- a/plugins/llm-security/reports/oh-my-openagent-scan-2026-04-02.docx
+++ b/plugins/llm-security/reports/oh-my-openagent-scan-2026-04-02.docx
--- a/plugins/llm-security/reports/oh-my-openagent-scan-2026-04-02.md
+++ b/plugins/llm-security/reports/oh-my-openagent-scan-2026-04-02.md
@@ -0,0 +1,219 @@
+---
+title: "Security Scan Report — oh-my-openagent"
+subtitle: "Branch: dev | Full scan with deep analysis"
+author: "KI-seksjonen, Statens vegvesen"
+date: "2026-04-02"
+---
+
+# Security Scan Report — oh-my-openagent (branch: dev)
+
+**Target:** `https://github.com/code-yeongyu/oh-my-openagent`\
+**Timestamp:** 2026-04-02T12:29:18Z\
+**Scanners:** LLM skill-scanner + 7 deterministic scanners (unicode, entropy, permission, dep-audit, taint, git-forensics, network)\
+**Files scanned:** 1,646\
+**Tool:** llm-security v2.5.0 for Claude Code
+
+---
+
+## Verdict: BLOCK — Risk Score: 100/100 (Extreme)
+
+| Severity | LLM Scan | Deep Scan | Total |
+|----------|----------|-----------|-------|
+| Critical | 3 | 4 | **7** |
+| High | 2 | 7 | **9** |
+| Medium | 1 | 192 | 193 |
+| Low | 0 | 0 | 0 |
+| Info | 2 | 61 | 63 |
+
+**Do not install this plugin without resolving the Critical findings.** The confirmed `<system>` tag injection in production source code and the agent-manipulation pattern in the installation guide are particularly concerning.
+
+---
+
+## Key Risk Signals
+
+| Signal | Assessment |
+|--------|-----------|
+| Confirmed prompt injection in production source | **Critical** — `<system>` tags in `constants.ts` |
+| Agent manipulation for advertising/self-promotion | **Critical** — must remove |
+| Mutable-URL install chain (rug-pull ready) | **High** — pin all URLs |
+| Telegram + Discord exfiltration channels | **High** — confirm user-controlled |
+| `process.argv` → `spawnSync()` without sanitization | **Critical** — P0 fix |
+| High-entropy Korean README cluster | **Critical** — manual review required |
+
+---
+
+## Critical Findings
+
+### SCN-001 — Spoofed `<system>` tags in production source
+
+- **Category:** Prompt Injection
+- **File:** `src/tools/delegate-task/constants.ts:313,332`
+- **OWASP:** LLM01:2025
+- **Evidence:** Literal `<system>`/`</system>` XML delimiters (ClawHavoc technique) — pre-extraction scanner confirmed and stripped. These are in production string constants used to build agent prompts.
+- **Remediation:** Audit lines 313–332. Remove or HTML-escape (`&lt;system&gt;`) the tags. Add sanitization assertion.
+
+### SCN-002 — `<system>` tags validated in tests (no sanitization guard)
+
+- **Category:** Prompt Injection
+- **File:** `src/tools/delegate-task/tools.test.ts:3089,3175,3188`
+- **OWASP:** LLM01:2025
+- **Evidence:** 3 occurrences in the test file for the delegate-task tool — tests replicate the injection template from `constants.ts` without asserting sanitization. Tests that pass with injected system tags *validate* the attack path.
+- **Remediation:** Add assertions that `<system>` tags are rejected/escaped before reaching any LLM API call.
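The escape-then-assert remediation for SCN-001 and SCN-002 can be sketched as follows. The scanned project is TypeScript; Python is used here only to illustrate the idea, and `escape_system_tags` and `assert_no_system_tags` are hypothetical names, not functions from oh-my-openagent. The pattern covers bare `<system>` and `</system>` tags only and would need extending for tags that carry attributes.

```python
import re

# Matches opening and closing <system> tags, tolerating case and inner whitespace.
SYSTEM_TAG = re.compile(r"<\s*(/?)\s*system\s*>", re.IGNORECASE)


def escape_system_tags(text):
    """Replace <system> and </system> with their HTML-escaped equivalents."""
    return SYSTEM_TAG.sub(lambda m: f"&lt;{m.group(1)}system&gt;", text)


def assert_no_system_tags(prompt):
    """Guard intended to run just before a prompt is handed to an LLM client."""
    if SYSTEM_TAG.search(prompt):
        raise ValueError("unescaped <system> tag in prompt")
    return prompt
```

A test in this style asserts that raw tags are rejected and that escaped text passes, which is the opposite of a test that succeeds while the tags are still present.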
+
+### SCN-003 — `override instructions` phrase in documentation
+
+- **Category:** Prompt Injection (context-normalization)
+- **File:** `docs/reference/configuration.md:737`
+- **OWASP:** LLM01:2025, LLM03:2025
+- **Evidence:** `[INJECTION-PATTERN-STRIPPED: override: override instructions]` embedded mid-sentence. This codebase supports `file://` URIs in `prompt`/`prompt_append` fields — doc files can be loaded directly into agent system prompts, making this a live attack surface.
+- **Remediation:** Git-blame line 737, identify the commit, and determine if authorized. Rewrite the sentence using passive voice to eliminate the imperative framing.
+
+### DS-TNT-001 — `process.argv` flows directly to `spawnSync()`
+
+- **Category:** Command Injection (Taint)
+- **File:** `bin/oh-my-opencode.js:125`
+- **OWASP:** LLM01:2025
+- **Evidence:** Source `process.argv` → sink `spawnSync()` with zero sanitization, at the application entry-point.
+- **Remediation:** Parse args with `yargs`/`commander`, allowlist valid subcommands before forwarding.
+
+### DS-ENT-017/019 — Abnormally high-entropy Korean text cluster
+
+- **Category:** Obfuscated content / possible embedded payload
+- **File:** `README.ko.md:65,71`
+- **OWASP:** LLM01:2025
+- **Evidence:** H=5.80 (len=174) and H=5.55 (len=128): two critical-entropy Korean strings close together, on lines 65 and 71. Natural prose entropy is typically 3.5–4.5.
+- **Remediation:** Inspect lines 59–80 as a unit. Confirm no embedded instructions. Remove if provenance unclear.
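For readers unfamiliar with the H values above, here is a minimal sketch of Shannon entropy over a string. It assumes entropy is measured per character; the report does not say whether the scanner counts characters or bytes, and this is not the scanner's own code.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Shannon entropy of `text` in bits per character (0.0 for an empty string)."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    # Sum p * log2(1/p) over the observed characters.
    return sum((n / total) * math.log2(total / n) for n in counts.values())
```

Under this per-character definition, a string drawing on many distinct characters scores higher than one reusing a few. Scripts with large character inventories, such as Hangul syllable blocks, may therefore exceed a baseline calibrated on Latin prose, which is one reason the manual review recommended above is worth doing before drawing conclusions.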
+
+### DS-TNT-002 — `sys.argv` flows directly to `open(w)` in test file
+
+- **Category:** Arbitrary File Write (Taint)
+- **File:** `src/shared/archive-entry-validator.test.ts:102`
+- **OWASP:** LLM01:2025
+- **Evidence:** Source `sys.argv` → sink `open(w)` with zero sanitization.
+- **Remediation:** Even in test helpers, avoid constructing file write paths from raw argv. Use `path.resolve` with a fixed base directory.
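The fixed-base-directory remediation can be sketched in Python, the language that the `sys.argv` and `open` names in the finding suggest. The helper `resolve_under` is hypothetical and not part of the scanned repository; it assumes Python 3.9 or later for `Path.is_relative_to`.

```python
from pathlib import Path


def resolve_under(base_dir, user_path):
    """Resolve `user_path` inside `base_dir`, rejecting anything that escapes it."""
    base = Path(base_dir).resolve()
    # Joining with an absolute `user_path` discards `base`, so the check below
    # also rejects absolute paths that point outside the base directory.
    candidate = (base / user_path).resolve()
    if not candidate.is_relative_to(base):  # Python 3.9+
        raise ValueError(f"path escapes base directory: {user_path!r}")
    return candidate
```

A helper script would call `resolve_under(fixed_dir, sys.argv[1])` and open only the returned path for writing, so `../` segments or absolute paths in argv raise instead of writing elsewhere.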
+
+---
+
+## High Findings
+
+### SCN-004 — "Free advertising" + unauthorized repo-star via `gh api`
+
+- **Category:** Covert Agent Manipulation / Excessive Agency
+- **File:** `docs/guide/installation.md:396,448`
+- **OWASP:** LLM06:2025, LLM01:2025
+- **Evidence:** Installation guide instructs the agent to (1) fetch a remote README and advertise a company to the user, and (2) execute `gh api --method PUT /user/starred/...` to star the repository — without user consent.
+- **Remediation:** Remove both sections. Implement star-request as an explicit user-consent UI, not an agent-executed API call.
+
+### SCN-005 — All READMEs reference mutable `dev` branch raw URLs
+
+- **Category:** Supply Chain / Rug-pull vector
+- **File:** `README.md`, `README.ja.md`, `README.ko.md`, `README.ru.md`, `README.zh-cn.md`, `docs/guide/installation.md`
+- **OWASP:** LLM03:2025, LLM01:2025
+- **Evidence:** `curl -s https://raw.githubusercontent.com/.../refs/heads/dev/docs/guide/installation.md` — points to a mutable branch, not a pinned commit/tag.
+- **Remediation:** Replace all `refs/heads/dev` references with pinned commit SHAs or versioned tags.
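A check for mutable raw-URL references, suitable for CI, might look like the sketch below. It is an illustration rather than part of llm-security: the function name `unpinned_raw_urls` is hypothetical, and it treats a URL as pinned only when the ref is a full 40-character commit SHA or sits under `refs/tags/`. Tags can still be moved by a maintainer, so a SHA is the stricter choice.

```python
import re

# Shape assumed: raw.githubusercontent.com/<owner>/<repo>/<ref>/<path>
RAW_URL = re.compile(r"https://raw\.githubusercontent\.com/[^/\s]+/[^/\s]+/(\S+)")
FULL_SHA = re.compile(r"[0-9a-f]{40}")


def unpinned_raw_urls(text):
    """Return raw GitHub URLs in `text` whose ref is neither a full SHA nor a tag."""
    flagged = []
    for match in RAW_URL.finditer(text):
        rest = match.group(1)
        first_segment = rest.split("/", 1)[0]
        pinned = bool(FULL_SHA.fullmatch(first_segment)) or rest.startswith("refs/tags/")
        if not pinned:
            flagged.append(match.group(0))
    return flagged
```

Running this over each README and the installation guide, and failing the build when the list is non-empty, would keep `refs/heads/dev` style references from returning. The `\S+` match is deliberately loose and may include trailing punctuation from surrounding Markdown.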
+
+### DS-NET-054 — Telegram Bot API in production code
+
+- **Category:** Suspicious Exfiltration Domain
+- **File:** `src/openclaw/reply-listener.ts:413,484`
+- **OWASP:** LLM02:2025
+- **Evidence:** `https://api.telegram.org/bot$` — bot token interpolated at runtime. Telegram Bot API is a well-documented exfiltration channel used in credential-stealing malware.
+- **Remediation:** Confirm this is an opt-in notification feature fully controlled by the user (not enabled by default). Add documentation stating what data is sent to Telegram and under what conditions.
+
+### DS-NET-053 — Discord webhook in production code
+
+- **Category:** Suspicious Exfiltration Domain
+- **File:** `src/openclaw/reply-listener.ts:310`
+- **OWASP:** LLM02:2025
+- **Evidence:** `discord.com/api/webhooks` — webhook URL in production code means the application can send data to Discord.
+- **Remediation:** Ensure URL is user-configured, never hardcoded. Document what data is sent and when.
+
+### DS-ENT-152 — Hardcoded browser User-Agent in redirect-guard hook
+
+- **Category:** Obfuscated string / Deceptive network behavior
+- **File:** `src/hooks/webfetch-redirect-guard/redirect-resolution.ts:34`
+- **OWASP:** LLM03:2025
+- **Evidence:** H=5.11, `Mozilla/...7.36` — spoofs browser identity during redirect resolution.
+- **Remediation:** Source UA from configurable env var; document justification.
+
+### DS-ENT-155 — Elevated-entropy conditional instruction in pre-tool hook
+
+- **Category:** Obfuscated instructions / possible embedded directive
+- **File:** `src/plugin/tool-execute-before.ts:44`
+- **OWASP:** LLM03:2025
+- **Evidence:** H=5.11, len=107, starts `If the w...se>.` — conditional-instruction pattern in a pre-tool-execution hook.
+- **Remediation:** Read lines 40–50 to confirm it is a legitimate log/display string, not a behavioral directive.
+
+### DS-NET-001 — Discord invite link across 15+ files
+
+- **Category:** Suspicious Exfiltration Domain
+- **File:** `.github/ISSUE_TEMPLATE/config.yml:4` and 14 other locations
+- **OWASP:** LLM02:2025
+- **Evidence:** `https://discord.gg/PUwSMR9XNk` — DNS resolved. Discord invite links are a known exfiltration vector via webhook.
+- **Remediation:** Verify the invite still points to a controlled server and has not been hijacked. Remove for enterprise deployments.
+
+---
+
+## Medium Findings (summary)
+
+193 medium findings detected, dominated by entropy scanner hits on template literals and log format strings throughout the TypeScript source (expected for string-interpolation-heavy codebases). The cross-instruction scanner flagged 26 files containing both `process.env` access and network calls in the same file — after review, all are attributable to normal Node.js application patterns (`process.env` for config + HTTP for core functionality).
+
+---
+
+## Info Findings (summary)
+
+63 info findings: 61 are network domain inventory entries from the NET scanner. 2 are from the LLM skill scan: a dynamic `npm install ${packageCandidates[0]}` pattern in `bin/oh-my-opencode.js:118` and diagnostic `sudo apt`/`sudo yum` strings in `src/tools/look-at/image-converter.ts:96-97`.
+
+---
+
+## OWASP Categorization
+
+| OWASP Category | Findings | Max Severity |
+|----------------|----------|-------------|
+| LLM01 — Prompt Injection | 11 | Critical |
+| LLM02 — Sensitive Information Disclosure | 6 | High |
+| LLM03 — Supply Chain | 249 | High |
+| LLM06 — Excessive Agency | 1 | High |
+
+---
+
+## Prioritized Remediation Plan
+
+| Priority | Finding | Action | Effort |
+|----------|---------|--------|--------|
+| P0 | SCN-001 | Remove/escape `<system>` tags in `constants.ts:313-332` | Low |
+| P0 | DS-TNT-001 | Sanitize `process.argv` before `spawnSync()` in `bin/oh-my-opencode.js:125` | Low |
+| P0 | DS-NET-054 | Audit Telegram bot integration — confirm user-controlled | Medium |
+| P0 | SCN-003 | Git-blame `configuration.md:737` — verify `override instructions` provenance | Low |
+| P1 | SCN-004 | Remove "Free advertising" and "Ask for a Star" agent-executed actions | Low |
+| P1 | SCN-005 | Pin all raw GitHub URL references to commit SHAs or tags | Low |
+| P1 | DS-NET-053 | Confirm Discord webhook is user-controlled, never hardcoded | Low |
+| P1 | DS-ENT-017/019 | Inspect `README.ko.md:59-80` for embedded instructions | Low |
+| P2 | SCN-002 | Add sanitization assertions in `tools.test.ts` | Medium |
+| P2 | DS-ENT-155 | Verify no embedded directive in `tool-execute-before.ts:44` | Low |
+| P2 | DS-ENT-152 | Source the redirect-guard hook's User-Agent from a configurable env var and document the justification | Low |
+
+---
+
+## Methodology
+
+This scan used `llm-security v2.5.0` for Claude Code, combining:
+
+1. **Pre-extraction layer** (`content-extractor.mjs`) — Scans all files before LLM analysis. Strips confirmed injection patterns and replaces them with `[INJECTION-PATTERN-STRIPPED]` markers. This prevents prompt injection from the scanned repository from affecting the scanning agent itself.
+
+2. **LLM skill scanner** — Analyzes the evidence package for 7 threat categories: prompt injection, data exfiltration, privilege escalation, scope creep, hidden instructions, toolchain manipulation, and persistence mechanisms.
+
+3. **7 deterministic Node.js scanners:**
+   - **Unicode** — Detects homoglyph attacks, bidirectional override characters
+   - **Entropy** — Shannon entropy analysis for obfuscated content, embedded secrets
+   - **Permission** — File permission anomalies
+   - **Dependency audit** — Known vulnerabilities in dependencies
+   - **Taint** — Source-to-sink data flow analysis (argv→exec, env→http, etc.)
+   - **Git forensics** — Suspicious commit patterns, force-pushes
+   - **Network** — External endpoint inventory, suspicious domain detection
+
+All findings are mapped to OWASP LLM Top 10 (2025) and OWASP Agentic AI Top 10 categories.
+
+---
+
+*Report generated by llm-security v2.5.0 — Security scanning, auditing, and threat modeling for Claude Code projects.*
--- a/plugins/llm-security/reports/skill-registry.json
+++ b/plugins/llm-security/reports/skill-registry.json
@@ -0,0 +1,45 @@
+{
+  "version": "1",
+  "updated": "2026-04-05T13:40:30.791Z",
+  "entry_count": 1,
+  "entries": {
+    "e4e9fe45a840febc9e95a70cc4fe64e143f65856be5546177f48c08715c2e466": {
+      "name": "klinkis",
+      "source": "/Users/ktg/repos/klinkis",
+      "fingerprint": "e4e9fe45a840febc9e95a70cc4fe64e143f65856be5546177f48c08715c2e466",
+      "first_seen": "2026-04-05T13:40:30.791Z",
+      "last_scanned": "2026-04-05T13:40:30.791Z",
+      "scan_count": 1,
+      "verdict": "ALLOW",
+      "risk_score": 1,
+      "counts": {
+        "critical": 0,
+        "high": 0,
+        "medium": 0,
+        "low": 1,
+        "info": 1
+      },
+      "files_scanned": 28,
+      "files_in_fingerprint": [
+        ".claude/settings.local.json",
+        "CLAUDE.md",
+        "docs/spec.md",
+        "eslint.config.js",
+        "package-lock.json",
+        "package.json",
+        "postcss.config.mjs",
+        "README.md",
+        "src/modules/TrackGenerator.ts",
+        "src/modules/types.ts",
+        "src/shared/marbleState.ts",
+        "src/stores/gameStore.ts",
+        "tsconfig.app.json",
+        "tsconfig.json",
+        "tsconfig.node.json",
+        "vite.config.ts"
+      ],
+      "tags": [],
+      "source_type": "scanned"
+    }
+  }
+}
--- a/plugins/llm-security/reports/watch/.gitkeep
+++ b/plugins/llm-security/reports/watch/.gitkeep