feat: initial open marketplace with llm-security, config-audit, ultraplan-local

Kjell Tore Guttormsen 2026-04-06 18:47:49 +02:00
commit f93d6abdae
380 changed files with 65935 additions and 0 deletions

@ -0,0 +1,151 @@
# Deep Security Scan: awesome-copilot Test Skills
**Target:** github.com/github/awesome-copilot (5 test-related skills)
**Scan date:** 2026-04-05
**Scanner:** llm-security v4.5.1 — deep-scan (10 deterministic) + skill-scanner-agent (LLM)
**Requested by:** KTG
---
## Skills Assessed
| # | Skill | Installs/wk | Files | Purpose |
|---|-------|-------------|-------|---------|
| 1 | playwright-generate-test | 9.2K | 1 (SKILL.md) | Playwright test generation via MCP |
| 2 | javascript-typescript-jest | 8.8K | 1 (SKILL.md) | Jest best practices reference |
| 3 | webapp-testing | 8.3K | 2 (SKILL.md + test-helper.js) | Browser testing toolkit |
| 4 | java-junit | 8.3K | 1 (SKILL.md) | JUnit 5 best practices reference |
| 5 | pytest-coverage | 8.0K | 1 (SKILL.md) | pytest coverage workflow |
---
## Overall Verdict: ALLOW (Risk Score 3/100)
All 5 skills are safe to install and use. Zero critical, high, or medium findings. Three low-severity hygiene observations.
---
## Deterministic Deep-Scan Results (10 Scanners)
| Scanner | playwright-generate-test | jest | webapp-testing | java-junit | pytest-coverage |
|---------|:---:|:---:|:---:|:---:|:---:|
| Unicode (confusables, BiDi) | OK | OK | OK | OK | OK |
| Entropy (secrets, tokens) | OK | OK | OK | OK | OK |
| Permission (chmod, setuid) | skip | skip | skip | skip | skip |
| Dependency audit | skip | skip | skip | skip | skip |
| Taint (untrusted input flow) | OK | OK | OK | OK | OK |
| Git forensics | OK | OK | OK | OK | OK |
| Network (URLs, endpoints) | OK | OK | OK | OK | OK |
| Memory poisoning | OK | OK | OK | OK | OK |
| Supply-chain recheck | skip | skip | skip | skip | skip |
| Toxic-flow correlator | skip | skip | skip | skip | skip |
**Result:** 0 findings across all 5 skills. Scanners that require lockfiles, dependencies, or permissions were correctly skipped (pure markdown skills).
---
## LLM Skill Security Analysis (7 Threat Categories)
| Category | playwright-generate-test | jest | webapp-testing | java-junit | pytest-coverage |
|----------|:---:|:---:|:---:|:---:|:---:|
| Prompt Injection | Clean | Clean | Clean | Clean | Clean |
| Data Exfiltration | Clean | Clean | Clean | Clean | Clean |
| Privilege Escalation | 1 Low | Clean | 1 Low | Clean | Clean |
| Scope Creep | Clean | Clean | Clean | Clean | Clean |
| Hidden Instructions | Clean | Clean | Clean | Clean | Clean |
| Toolchain Manipulation | Clean | Clean | Clean | Clean | 1 Low |
| Persistence | Clean | Clean | Clean | Clean | Clean |
### Finding Details
**SCN-001 — Execution scope undeclared** (Low)
- **Skill:** playwright-generate-test
- **Issue:** Instructs "Execute the test file and iterate until the test passes" without declaring `allowed-tools` in frontmatter
- **OWASP:** LLM06:2025 Excessive Agency, AST03 Scope Declaration
- **Fix:** Add `allowed-tools` frontmatter limiting execution to `npx playwright test`
**SCN-002 — Unbounded Node.js fallback** (Low)
- **Skill:** webapp-testing
- **Issue:** Falls back to "local Node.js environment" if MCP unavailable — no scope limitation on what the fallback may execute
- **OWASP:** LLM06:2025 Excessive Agency, AST04 Capability Expansion
- **Fix:** Constrain fallback to localhost targets only, require user confirmation for remote
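A minimal sketch of the localhost constraint suggested in the SCN-002 fix. The helper name `isLocalTarget` and the list of accepted hosts are assumptions made for illustration; nothing like it exists in the `webapp-testing` skill as scanned.

```typescript
// Hypothetical helper, not part of the webapp-testing skill.
// Hosts treated as local; adjust to the project's own policy.
const LOCAL_HOSTS: readonly string[] = ["localhost", "127.0.0.1", "[::1]"];

/** Returns true only for http(s) URLs whose host is a loopback name or address. */
function isLocalTarget(rawUrl: string): boolean {
  // Capture the host (bracketed IPv6 or a plain name), allow an optional numeric
  // port, and require the authority to end there. A URL carrying userinfo such as
  // "localhost@evil.example" therefore never matches a local host.
  const match = /^https?:\/\/(\[[^\]]+\]|[^\/:?#]+)(?::\d+)?(?:[\/?#]|$)/i.exec(rawUrl);
  if (match === null) {
    return false;
  }
  const host = match[1];
  return host !== undefined && LOCAL_HOSTS.indexOf(host.toLowerCase()) !== -1;
}
```

A fallback runner could call `isLocalTarget(url)` before executing anything and ask the user for confirmation whenever it returns `false`.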
**SCN-003 — Implicit dependency assumption** (Low)
- **Skill:** pytest-coverage
- **Issue:** Assumes `pytest-cov` is installed without verification. Agent may silently install it
- **OWASP:** LLM03:2025 Supply Chain
- **Fix:** Add prerequisite check before running coverage commands
---
## Risk Classification
```
Skill Score Verdict Risk Band
───────────────────────────────────────────────────────
javascript-typescript-jest 0 ALLOW None
java-junit 0 ALLOW None
playwright-generate-test 4 ALLOW Low
webapp-testing 4 ALLOW Low
pytest-coverage 4 ALLOW Low
───────────────────────────────────────────────────────
AGGREGATE 3 ALLOW Low (0-20)
```
---
## Key Observations
1. **No injection attempts found.** Zero instances of rule override language, identity redefinition, spoofed system headers, or context normalization patterns across all 6 files. This is notably clean — ToxicSkills research found 36.82% of community skills have at least one issue.
2. **No exfiltration infrastructure.** None of the skills access credential paths, environment variables, sensitive filesystem locations, or external network endpoints.
3. **No secrets in any file.** All 6 files pass entropy and secrets-pattern checks.
4. **Two pure-reference skills (jest, junit) are exemplary.** They demonstrate the correct pattern for knowledge-transfer skills: no execution, no tool access, no network references. These cannot be weaponized.
5. **Source legitimacy is consistent.** All from the official `github/awesome-copilot` repository (28.5K stars), maintained by GitHub.
---
## OWASP Coverage Matrix
| Framework | Category | Checked | Findings |
|-----------|----------|:---:|---|
| LLM Top 10 | LLM01 Prompt Injection | Yes | None |
| LLM Top 10 | LLM02 Sensitive Info Disclosure | Yes | None |
| LLM Top 10 | LLM03 Supply Chain | Yes | SCN-003 (Low) |
| LLM Top 10 | LLM06 Excessive Agency | Yes | SCN-001, SCN-002 (Low) |
| Agentic AI | ASI01 Prompt Injection | Yes | None |
| Agentic AI | ASI02 Exfiltration | Yes | None |
| Agentic AI | ASI03 Privilege Escalation | Yes | None |
| Agentic AI | ASI04 Toolchain Manipulation | Yes | None |
| Agentic AI | ASI10 Persistence | Yes | None |
| Skills Top 10 | AST03 Scope Declaration | Yes | SCN-001 (Low) |
| Skills Top 10 | AST04 Capability Expansion | Yes | SCN-002 (Low) |
---
## Recommendations for Test Leads
These 5 skills are safe for test teams to adopt. Some recommendations:
| Priority | Recommendation |
|----------|----------------|
| **Use directly** | `javascript-typescript-jest` and `java-junit` — pure reference documents with no risk |
| **Use with awareness** | `playwright-generate-test` and `webapp-testing` — need execution permissions, but are correctly scoped |
| **Use with awareness** | `pytest-coverage` — verify that `pytest-cov` is in the project's dependencies before use |
| **General** | All skills should be combined with the project's own security hooks to catch unexpected behavior |
---
## Methodology
- **Phase 1:** Deterministic deep-scan — 10 Node.js scanners (unicode, entropy, permission, dep-audit, taint, git-forensics, network, memory-poisoning, supply-chain-recheck, toxic-flow)
- **Phase 2:** LLM-based skill analysis — 7 threat categories (prompt injection, data exfiltration, privilege escalation, scope creep, hidden instructions, toolchain manipulation, persistence)
- **Frameworks:** OWASP LLM Top 10 (2025), OWASP Agentic AI Top 10 (ASI), OWASP Skills Top 10 (AST)
- **Models:** scan-orchestrator.mjs (deterministic), skill-scanner-agent (claude-sonnet-4-6)
---
*Generated by llm-security v4.5.1*

@ -0,0 +1,219 @@
---
title: "Security Scan Report — oh-my-openagent"
subtitle: "Branch: dev | Full scan with deep analysis"
author: "KI-seksjonen, Statens vegvesen"
date: "2026-04-02"
---
# Security Scan Report — oh-my-openagent (branch: dev)
**Target:** `https://github.com/code-yeongyu/oh-my-openagent`\
**Timestamp:** 2026-04-02T12:29:18Z\
**Scanners:** LLM skill-scanner + 7 deterministic scanners (unicode, entropy, permission, dep-audit, taint, git-forensics, network)\
**Files scanned:** 1 646\
**Tool:** llm-security v2.5.0 for Claude Code
---
## Verdict: BLOCK — Risk Score: 100/100 (Extreme)
| Severity | LLM Scan | Deep Scan | Total |
|----------|----------|-----------|-------|
| Critical | 3 | 4 | **7** |
| High | 2 | 7 | **9** |
| Medium | 1 | 192 | 193 |
| Low | 0 | 0 | 0 |
| Info | 2 | 61 | 63 |
**Do not install this plugin without resolving the Critical findings.** The confirmed `<system>` tag injection in production source code and the agent-manipulation pattern in the installation guide are particularly concerning.
---
## Key Risk Signals
| Signal | Assessment |
|--------|-----------|
| Confirmed prompt injection in production source | **Critical** — `<system>` tags in `constants.ts` |
| Agent manipulation for advertising/self-promotion | **Critical** — must remove |
| Mutable-URL install chain (rug-pull ready) | **High** — pin all URLs |
| Telegram + Discord exfiltration channels | **High** — confirm user-controlled |
| `process.argv` → `spawnSync()` without sanitization | **Critical** — P0 fix |
| High-entropy Korean README cluster | **Critical** — manual review required |
---
## Critical Findings
### SCN-001 — Spoofed `<system>` tags in production source
- **Category:** Prompt Injection
- **File:** `src/tools/delegate-task/constants.ts:313,332`
- **OWASP:** LLM01:2025
- **Evidence:** Literal `<system>`/`</system>` XML delimiters (ClawHavoc technique) — pre-extraction scanner confirmed and stripped. These are in production string constants used to build agent prompts.
- **Remediation:** Audit lines 313–332. Remove or HTML-escape (`&lt;system&gt;`) the tags. Add sanitization assertion.
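The escaping and the sanitization assertion named in the remediation could look roughly like the sketch below. Both function names are hypothetical, and the code assumes the tags appear as plain `<system>` and `</system>` without attributes, which is what the evidence describes.

```typescript
/** Escapes literal <system> and </system> delimiters so they reach the model as inert text. */
function escapeSystemTags(text: string): string {
  // $1 carries the optional "/" of a closing tag.
  return text.replace(/<(\/?)system>/gi, "&lt;$1system&gt;");
}

/** Sanitization assertion: throws if any raw system tag survives in prompt text. */
function assertNoSystemTags(text: string): void {
  if (/<\/?system>/i.test(text)) {
    throw new Error("raw <system> tag found in prompt text");
  }
}
```

Calling `assertNoSystemTags` on the final prompt string, just before it is handed to the LLM client, would turn a regression into a hard failure instead of a silent injection path. The same assertion can be reused in tests.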
### SCN-002 — `<system>` tags validated in tests (no sanitization guard)
- **Category:** Prompt Injection
- **File:** `src/tools/delegate-task/tools.test.ts:3089,3175,3188`
- **OWASP:** LLM01:2025
- **Evidence:** 3 occurrences in the test file for the delegate-task tool — tests replicate the injection template from `constants.ts` without asserting sanitization. Tests that pass with injected system tags *validate* the attack path.
- **Remediation:** Add assertions that `<system>` tags are rejected/escaped before reaching any LLM API call.
### SCN-003 — `override instructions` phrase in documentation
- **Category:** Prompt Injection (context-normalization)
- **File:** `docs/reference/configuration.md:737`
- **OWASP:** LLM01:2025, LLM03:2025
- **Evidence:** `[INJECTION-PATTERN-STRIPPED: override: override instructions]` embedded mid-sentence. This codebase supports `file://` URIs in `prompt`/`prompt_append` fields — doc files can be loaded directly into agent system prompts, making this a live attack surface.
- **Remediation:** Git-blame line 737, identify the commit, and determine if authorized. Rewrite the sentence using passive voice to eliminate the imperative framing.
### DS-TNT-001 — `process.argv` flows directly to `spawnSync()`
- **Category:** Command Injection (Taint)
- **File:** `bin/oh-my-opencode.js:125`
- **OWASP:** LLM01:2025
- **Evidence:** Source `process.argv` → sink `spawnSync()` with zero sanitization, at the application entry-point.
- **Remediation:** Parse args with `yargs`/`commander`, allowlist valid subcommands before forwarding.
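As a rough illustration of the allowlist step, the sketch below validates the first argument against a fixed list before anything is forwarded. The subcommand names are placeholders, since the real CLI's commands are not listed in this report, and `parseArgs` is a hypothetical helper rather than code from the scanned repository.

```typescript
// Placeholder subcommands; replace with the CLI's actual command set.
const ALLOWED_SUBCOMMANDS: readonly string[] = ["install", "run", "doctor"];

interface ParsedArgs {
  subcommand: string;
  rest: string[];
}

/** Accepts argv (without the node and script entries) only if it starts with a known subcommand. */
function parseArgs(argv: readonly string[]): ParsedArgs {
  const subcommand = argv[0];
  if (subcommand === undefined || ALLOWED_SUBCOMMANDS.indexOf(subcommand) === -1) {
    throw new Error("unsupported subcommand: " + String(subcommand));
  }
  // The remaining arguments are returned untouched; they still need their own
  // validation before being forwarded to a child process.
  return { subcommand, rest: argv.slice(1) };
}
```

In the entry point this would be called as `parseArgs(process.argv.slice(2))`, with the result passed to `spawnSync` as an argument array and without the `shell` option, so that no shell ever interprets the values.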
### DS-ENT-017/019 — Abnormally high-entropy Korean text cluster
- **Category:** Obfuscated content / possible embedded payload
- **File:** `README.ko.md:65,71`
- **OWASP:** LLM01:2025
- **Evidence:** H=5.80 (len=174) and H=5.55 (len=128) — two contiguous critical-entropy Korean strings adjacent on lines 65–71. Natural prose entropy is typically 3.5–4.5.
- **Remediation:** Inspect lines 59–80 as a unit. Confirm no embedded instructions. Remove if provenance unclear.
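For readers unfamiliar with the H values quoted above, this is a minimal sketch of Shannon entropy in bits per character. It is an assumption that the scanner computes H this way; its actual tokenization, normalization, and thresholds are not described in this report.

```typescript
/**
 * Shannon entropy of a string in bits per character.
 * Characters are counted as UTF-16 code units, which is sufficient for
 * Hangul and other characters in the Basic Multilingual Plane.
 */
function shannonEntropy(text: string): number {
  const counts: Record<string, number> = {};
  const chars = text.split("");
  for (const ch of chars) {
    counts[ch] = (counts[ch] ?? 0) + 1;
  }
  let entropy = 0;
  for (const key of Object.keys(counts)) {
    const p = (counts[key] ?? 0) / chars.length;
    // H = -sum(p * log2(p)); log2 is written via the natural log.
    entropy -= p * (Math.log(p) / Math.LN2);
  }
  return entropy;
}
```

A string made of one repeated character scores 0, and a string of four distinct characters used equally often scores 2. Text written in a large alphabet can score higher than Latin prose without being obfuscated, which is one reason a manual review of the flagged lines is the recommended next step.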
### DS-TNT-002 — `sys.argv` flows directly to `open(w)` in test file
- **Category:** Arbitrary File Write (Taint)
- **File:** `src/shared/archive-entry-validator.test.ts:102`
- **OWASP:** LLM01:2025
- **Evidence:** Source `sys.argv` → sink `open(w)` with zero sanitization.
- **Remediation:** Even in test helpers, avoid constructing file write paths from raw argv. Use `path.resolve` with a fixed base directory.
---
## High Findings
### SCN-004 — "Free advertising" + unauthorized repo-star via `gh api`
- **Category:** Covert Agent Manipulation / Excessive Agency
- **File:** `docs/guide/installation.md:396,448`
- **OWASP:** LLM06:2025, LLM01:2025
- **Evidence:** Installation guide instructs the agent to (1) fetch a remote README and advertise a company to the user, and (2) execute `gh api --method PUT /user/starred/...` to star the repository — without user consent.
- **Remediation:** Remove both sections. Implement star-request as an explicit user-consent UI, not an agent-executed API call.
### SCN-005 — All READMEs reference mutable `dev` branch raw URLs
- **Category:** Supply Chain / Rug-pull vector
- **File:** `README.md`, `README.ja.md`, `README.ko.md`, `README.ru.md`, `README.zh-cn.md`, `docs/guide/installation.md`
- **OWASP:** LLM03:2025, LLM01:2025
- **Evidence:** `curl -s https://raw.githubusercontent.com/.../refs/heads/dev/docs/guide/installation.md` — points to a mutable branch, not a pinned commit/tag.
- **Remediation:** Replace all `refs/heads/dev` references with pinned commit SHAs or versioned tags.
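A check for this class of URL can be sketched as follows. The function name and the returned labels are hypothetical. The code only recognizes the explicit `refs/heads/` and `refs/tags/` forms and full 40-character commit SHAs; a short ref such as a bare branch or tag name cannot be told apart from the URL alone and is reported as `unknown`.

```typescript
type RefKind = "commit" | "tag" | "branch" | "unknown";

/** Classifies the ref that a raw.githubusercontent.com URL points at. */
function classifyRawGithubRef(url: string): RefKind {
  // Expected shape: https://raw.githubusercontent.com/<owner>/<repo>/<ref>/<path>
  const match = /^https:\/\/raw\.githubusercontent\.com\/[^\/]+\/[^\/]+\/(.+)$/.exec(url);
  if (match === null) {
    return "unknown";
  }
  const rest = match[1] ?? "";
  if (rest.indexOf("refs/heads/") === 0) {
    return "branch"; // mutable: content can change after review
  }
  if (rest.indexOf("refs/tags/") === 0) {
    return "tag";
  }
  if (/^[0-9a-f]{40}\//.test(rest)) {
    return "commit"; // pinned to an immutable commit
  }
  return "unknown";
}
```

A documentation lint step could fail on any URL classified as `branch` and require manual review for `unknown`.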
### DS-NET-054 — Telegram Bot API in production code
- **Category:** Suspicious Exfiltration Domain
- **File:** `src/openclaw/reply-listener.ts:413,484`
- **OWASP:** LLM02:2025
- **Evidence:** `https://api.telegram.org/bot$` — bot token interpolated at runtime. Telegram Bot API is a well-documented exfiltration channel used in credential-stealing malware.
- **Remediation:** Confirm this is an opt-in notification feature fully controlled by the user (not enabled by default). Add documentation stating what data is sent to Telegram and under what conditions.
### DS-NET-053 — Discord webhook in production code
- **Category:** Suspicious Exfiltration Domain
- **File:** `src/openclaw/reply-listener.ts:310`
- **OWASP:** LLM02:2025
- **Evidence:** `discord.com/api/webhooks` — webhook URL in production code means the application can send data to Discord.
- **Remediation:** Ensure URL is user-configured, never hardcoded. Document what data is sent and when.
### DS-ENT-152 — Hardcoded browser User-Agent in redirect-guard hook
- **Category:** Obfuscated string / Deceptive network behavior
- **File:** `src/hooks/webfetch-redirect-guard/redirect-resolution.ts:34`
- **OWASP:** LLM03:2025
- **Evidence:** H=5.11, `Mozilla/...7.36` — spoofs browser identity during redirect resolution.
- **Remediation:** Source UA from configurable env var; document justification.
### DS-ENT-155 — Elevated-entropy conditional instruction in pre-tool hook
- **Category:** Obfuscated instructions / possible embedded directive
- **File:** `src/plugin/tool-execute-before.ts:44`
- **OWASP:** LLM03:2025
- **Evidence:** H=5.11, len=107, starts `If the w...se>.` — conditional-instruction pattern in a pre-tool-execution hook.
- **Remediation:** Read lines 40–50 to confirm it is a legitimate log/display string, not a behavioral directive.
### DS-NET-001 — Discord invite link across 15+ files
- **Category:** Suspicious Exfiltration Domain
- **File:** `.github/ISSUE_TEMPLATE/config.yml:4` and 14 other locations
- **OWASP:** LLM02:2025
- **Evidence:** `https://discord.gg/PUwSMR9XNk` — DNS resolved. Discord invite links are a known exfiltration vector via webhook.
- **Remediation:** Verify the invite still points to a controlled server and has not been hijacked. Remove for enterprise deployments.
---
## Medium Findings (summary)
193 medium findings detected, dominated by entropy scanner hits on template literals and log format strings throughout the TypeScript source (expected for string-interpolation-heavy codebases). The cross-instruction scanner flagged 26 files containing both `process.env` access and network calls in the same file — after review, all are attributable to normal Node.js application patterns (`process.env` for config + HTTP for core functionality).
---
## Info Findings (summary)
63 info findings: 61 are network domain inventory entries from the NET scanner. 2 are from the LLM skill scan: a dynamic `npm install ${packageCandidates[0]}` pattern in `bin/oh-my-opencode.js:118` and diagnostic `sudo apt`/`sudo yum` strings in `src/tools/look-at/image-converter.ts:96-97`.
---
## OWASP Categorization
| OWASP Category | Findings | Max Severity |
|----------------|----------|-------------|
| LLM01 — Prompt Injection | 11 | Critical |
| LLM02 — Sensitive Information Disclosure | 6 | High |
| LLM03 — Supply Chain | 249 | High |
| LLM06 — Excessive Agency | 1 | High |
---
## Prioritized Remediation Plan
| Priority | Finding | Action | Effort |
|----------|---------|--------|--------|
| P0 | SCN-001 | Remove/escape `<system>` tags in `constants.ts:313-332` | Low |
| P0 | DS-TNT-001 | Sanitize `process.argv` before `spawnSync()` in `bin/oh-my-opencode.js:125` | Low |
| P0 | DS-NET-054 | Audit Telegram bot integration — confirm user-controlled | Medium |
| P0 | SCN-003 | Git-blame `configuration.md:737` — verify `override instructions` provenance | Low |
| P1 | SCN-004 | Remove "Free advertising" and "Ask for a Star" agent-executed actions | Low |
| P1 | SCN-005 | Pin all raw GitHub URL references to commit SHAs or tags | Low |
| P1 | DS-NET-053 | Confirm Discord webhook is user-controlled, never hardcoded | Low |
| P1 | DS-ENT-017/019 | Inspect `README.ko.md:60-80` for embedded instructions | Low |
| P2 | SCN-002 | Add sanitization assertions in `tools.test.ts` | Medium |
| P2 | DS-ENT-155 | Verify no embedded directive in `tool-execute-before.ts:44` | Low |
| P2 | DS-ENT-152 | Remove hardcoded User-Agent from redirect-guard hook | Low |
---
## Methodology
This scan used `llm-security v2.5.0` for Claude Code, combining:
1. **Pre-extraction layer** (`content-extractor.mjs`) — Scans all files before LLM analysis. Strips confirmed injection patterns and replaces them with `[INJECTION-PATTERN-STRIPPED]` markers. This prevents prompt injection from the scanned repository from affecting the scanning agent itself.
2. **LLM skill scanner** — Analyzes the evidence package for 7 threat categories: prompt injection, data exfiltration, privilege escalation, scope creep, hidden instructions, toolchain manipulation, and persistence mechanisms.
3. **7 deterministic Node.js scanners:**
- **Unicode** — Detects homoglyph attacks, bidirectional override characters
- **Entropy** — Shannon entropy analysis for obfuscated content, embedded secrets
- **Permission** — File permission anomalies
- **Dependency audit** — Known vulnerabilities in dependencies
- **Taint** — Source-to-sink data flow analysis (argv→exec, env→http, etc.)
- **Git forensics** — Suspicious commit patterns, force-pushes
- **Network** — External endpoint inventory, suspicious domain detection
All findings are mapped to OWASP LLM Top 10 (2025) and OWASP Agentic AI Top 10 categories.
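The pre-extraction step in item 1 can be pictured with the sketch below. It is not the code of `content-extractor.mjs`, whose rule set is not shown in this report; the two rules, their labels, and the function name are illustrative assumptions modeled on the marker format quoted in SCN-003.

```typescript
interface StripRule {
  label: string;
  pattern: RegExp;
}

// Illustrative rules only, not the scanner's real rule set.
const STRIP_RULES: StripRule[] = [
  { label: "system-tag", pattern: /<\/?system>/gi },
  { label: "override", pattern: /override (?:all |previous )?instructions/gi },
];

/** Replaces every match with a marker so the scanning agent never reads the raw pattern. */
function stripInjectionPatterns(text: string): { clean: string; hits: number } {
  let hits = 0;
  let clean = text;
  for (const rule of STRIP_RULES) {
    clean = clean.replace(rule.pattern, () => {
      hits++;
      return "[INJECTION-PATTERN-STRIPPED: " + rule.label + "]";
    });
  }
  return { clean, hits };
}
```

The hit count can be reported as evidence while only the cleaned text is passed on to the LLM stage.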
---
*Report generated by llm-security v2.5.0 — Security scanning, auditing, and threat modeling for Claude Code projects.*

@ -0,0 +1,45 @@
{
  "version": "1",
  "updated": "2026-04-05T13:40:30.791Z",
  "entry_count": 1,
  "entries": {
    "e4e9fe45a840febc9e95a70cc4fe64e143f65856be5546177f48c08715c2e466": {
      "name": "klinkis",
      "source": "/Users/ktg/repos/klinkis",
      "fingerprint": "e4e9fe45a840febc9e95a70cc4fe64e143f65856be5546177f48c08715c2e466",
      "first_seen": "2026-04-05T13:40:30.791Z",
      "last_scanned": "2026-04-05T13:40:30.791Z",
      "scan_count": 1,
      "verdict": "ALLOW",
      "risk_score": 1,
      "counts": {
        "critical": 0,
        "high": 0,
        "medium": 0,
        "low": 1,
        "info": 1
      },
      "files_scanned": 28,
      "files_in_fingerprint": [
        ".claude/settings.local.json",
        "CLAUDE.md",
        "docs/spec.md",
        "eslint.config.js",
        "package-lock.json",
        "package.json",
        "postcss.config.mjs",
        "README.md",
        "src/modules/TrackGenerator.ts",
        "src/modules/types.ts",
        "src/shared/marbleState.ts",
        "src/stores/gameStore.ts",
        "tsconfig.app.json",
        "tsconfig.json",
        "tsconfig.node.json",
        "vite.config.ts"
      ],
      "tags": [],
      "source_type": "scanned"
    }
  }
}