Deep Security Scan: awesome-copilot Test Skills

Target: github.com/github/awesome-copilot (5 test-related skills)
Scan date: 2026-04-05
Scanner: llm-security v4.5.1, deep-scan (10 deterministic) + skill-scanner-agent (LLM)
Requested by: KTG

Skills Assessed

#	Skill	Installs/wk	Files	Purpose
1	playwright-generate-test	9.2K	1 (SKILL.md)	Playwright test generation via MCP
2	javascript-typescript-jest	8.8K	1 (SKILL.md)	Jest best practices reference
3	webapp-testing	8.3K	2 (SKILL.md + test-helper.js)	Browser testing toolkit
4	java-junit	8.3K	1 (SKILL.md)	JUnit 5 best practices reference
5	pytest-coverage	8.0K	1 (SKILL.md)	pytest coverage workflow

Overall Verdict: ALLOW (Risk Score 3/100)

All 5 skills are safe to install and use. Zero critical, high, or medium findings. Three low-severity hygiene observations.

Deterministic Deep-Scan Results (10 Scanners)

Scanner	playwright-generate-test	jest	webapp-testing	java-junit	pytest-coverage
Unicode (confusables, BiDi)	OK	OK	OK	OK	OK
Entropy (secrets, tokens)	OK	OK	OK	OK	OK
Permission (chmod, setuid)	skip	skip	skip	skip	skip
Dependency audit	skip	skip	skip	skip	skip
Taint (untrusted input flow)	OK	OK	OK	OK	OK
Git forensics	OK	OK	OK	OK	OK
Network (URLs, endpoints)	OK	OK	OK	OK	OK
Memory poisoning	OK	OK	OK	OK	OK
Supply-chain recheck	skip	skip	skip	skip	skip
Toxic-flow correlator	skip	skip	skip	skip	skip

Result: 0 findings across all 5 skills. Scanners that require lockfiles/dependencies/permissions correctly skipped (pure markdown skills).
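
For readers unfamiliar with the first two scanner rows, the following is a minimal sketch of what a BiDi-control check and an entropy check typically compute. It is not the llm-security implementation: the function names and the character set are assumptions chosen for illustration, and the real scanners may use different rules and thresholds.

```python
import math
import unicodedata
from collections import Counter

# Assumed set of bidirectional control characters that can visually reorder text.
BIDI_CONTROLS = {
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",  # embeddings and overrides
    "\u2066", "\u2067", "\u2068", "\u2069",            # isolates
    "\u200e", "\u200f",                                # left-to-right / right-to-left marks
}


def find_bidi_controls(text: str) -> list[tuple[int, str]]:
    """Return (index, Unicode name) for every BiDi control character in text."""
    return [
        (i, unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if ch in BIDI_CONTROLS
    ]


def shannon_entropy(token: str) -> float:
    """Shannon entropy in bits per character; 0.0 for an empty string."""
    if not token:
        return 0.0
    counts = Counter(token)
    n = len(token)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())
```

A clean markdown file yields an empty list from `find_bidi_controls`, and ordinary prose tokens score well below the entropy of random secrets, which is why a scanner of this kind would report OK for documentation-only skills.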

LLM Skill Security Analysis (7 Threat Categories)

Category	playwright-generate-test	jest	webapp-testing	java-junit	pytest-coverage
Prompt Injection	Clean	Clean	Clean	Clean	Clean
Data Exfiltration	Clean	Clean	Clean	Clean	Clean
Privilege Escalation	1 Low	Clean	1 Low	Clean	Clean
Scope Creep	Clean	Clean	Clean	Clean	Clean
Hidden Instructions	Clean	Clean	Clean	Clean	Clean
Toolchain Manipulation	Clean	Clean	Clean	Clean	1 Low
Persistence	Clean	Clean	Clean	Clean	Clean

Finding Details

SCN-001 — Execution scope undeclared (Low)

Skill: playwright-generate-test
Issue: Instructs "Execute the test file and iterate until the test passes" without declaring allowed-tools in frontmatter
OWASP: LLM06:2025 Excessive Agency, AST03 Scope Declaration
Fix: Add allowed-tools frontmatter limiting execution to npx playwright test
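
A minimal sketch of how a missing declaration could be detected, assuming the skill file opens with a `---`-delimited frontmatter block of simple `key: value` lines. The helper names and the sample `allowed-tools` value are hypothetical; the exact value syntax depends on the agent host and is not specified in this report.

```python
def parse_frontmatter(markdown: str) -> dict[str, str]:
    """Parse top-level `key: value` pairs from a leading `---` frontmatter block."""
    lines = markdown.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        # Only top-level keys are read; nested or commented lines are ignored.
        if ":" in line and not line.startswith((" ", "\t", "#")):
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return {}  # no closing delimiter: treat as no frontmatter


def declares_allowed_tools(markdown: str) -> bool:
    """True when the frontmatter has a non-empty `allowed-tools` entry."""
    return bool(parse_frontmatter(markdown).get("allowed-tools"))


# Hypothetical skill file; the allowed-tools value is illustrative only.
SAMPLE = """---
name: playwright-generate-test
allowed-tools: npx playwright test
---
# Skill body
"""
```

Here `declares_allowed_tools(SAMPLE)` is true, while a skill file without the key, or without frontmatter at all, returns false and would be flagged in the same way as SCN-001.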

SCN-002 — Unbounded Node.js fallback (Low)

Skill: webapp-testing
Issue: Falls back to "local Node.js environment" if MCP unavailable — no scope limitation on what the fallback may execute
OWASP: LLM06:2025 Excessive Agency, AST04 Capability Expansion
Fix: Constrain fallback to localhost targets only, require user confirmation for remote
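
The suggested constraint can be sketched as a guard that runs before the fallback is used. This is an illustration under assumptions, not code from the skill: `is_local_target` and `may_run_fallback` are hypothetical names, and "local" is taken to mean the hostname `localhost` or a loopback IP address.

```python
import ipaddress
from urllib.parse import urlsplit


def is_local_target(url: str) -> bool:
    """True when the URL's host is `localhost` or a loopback IP address."""
    host = urlsplit(url).hostname  # lowercased; None if the URL has no host part
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (for example a public domain name).
        return False


def may_run_fallback(url: str, user_confirmed: bool = False) -> bool:
    """Allow local targets; remote targets need explicit user confirmation."""
    return is_local_target(url) or user_confirmed
```

With this guard, `http://localhost:3000` and `http://127.0.0.1:8080` pass, while `https://example.com` is refused unless the caller passes `user_confirmed=True`. Matching the parsed hostname rather than the raw string avoids accepting look-alikes such as `http://localhost.evil.com`.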

SCN-003 — Implicit dependency assumption (Low)

Skill: pytest-coverage
Issue: Assumes pytest-cov is installed without verification. Agent may silently install it
OWASP: LLM03:2025 Supply Chain
Fix: Add prerequisite check before running coverage commands
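
A prerequisite check of this kind could look like the sketch below. It assumes the plugin is importable as `pytest_cov` (the import name of the `pytest-cov` package); the function names and the returned messages are hypothetical. The point is that the check only reports the missing dependency and never installs anything itself.

```python
import importlib.util


def plugin_available(module_name: str = "pytest_cov") -> bool:
    """True if the top-level module can be found in the current environment."""
    return importlib.util.find_spec(module_name) is not None


def coverage_preflight(module_name: str = "pytest_cov") -> str:
    """Report whether coverage can run, without installing anything."""
    if plugin_available(module_name):
        return "ok"
    return "missing: ask the user before installing pytest-cov"
```

Running `coverage_preflight()` before any `pytest --cov` command gives the agent an explicit stop condition in place of a silent install.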

Risk Classification

Skill                          Score  Verdict  Risk Band
───────────────────────────────────────────────────────
javascript-typescript-jest       0    ALLOW    None
java-junit                       0    ALLOW    None
playwright-generate-test         4    ALLOW    Low
webapp-testing                   4    ALLOW    Low
pytest-coverage                  4    ALLOW    Low
───────────────────────────────────────────────────────
AGGREGATE                        3    ALLOW    Low (0-20)

Key Observations

No injection attempts found. Zero instances of rule override language, identity redefinition, spoofed system headers, or context normalization patterns across all 6 files. This is notably clean — ToxicSkills research found 36.82% of community skills have at least one issue.
No exfiltration infrastructure. None of the skills access credential paths, environment variables, sensitive filesystem locations, or external network endpoints.
No secrets in any file. All 6 files pass entropy and secrets-pattern checks.
Two pure-reference skills (jest, junit) are exemplary. They demonstrate the correct pattern for knowledge-transfer skills: no execution, no tool access, no network references. These cannot be weaponized.
Source legitimacy is consistent. All from the official github/awesome-copilot repository (28.5K stars), maintained by GitHub.

OWASP Coverage Matrix

Framework	Category	Checked	Findings
LLM Top 10	LLM01 Prompt Injection	Yes	None
LLM Top 10	LLM02 Sensitive Info Disclosure	Yes	None
LLM Top 10	LLM03 Supply Chain	Yes	SCN-003 (Low)
LLM Top 10	LLM06 Excessive Agency	Yes	SCN-001, SCN-002 (Low)
Agentic AI	ASI01 Prompt Injection	Yes	None
Agentic AI	ASI02 Exfiltration	Yes	None
Agentic AI	ASI03 Privilege Escalation	Yes	None
Agentic AI	ASI04 Toolchain Manipulation	Yes	None
Agentic AI	ASI10 Persistence	Yes	None
Skills Top 10	AST03 Scope Declaration	Yes	SCN-001, SCN-002 (Low)
Skills Top 10	AST04 Capability Expansion	Yes	SCN-002 (Low)

Recommendations for Test Leads

These 5 skills are safe for test teams to adopt. Some recommendations:

Priority	Recommendation
Use directly	`javascript-typescript-jest` and `java-junit`: pure reference documents with no risk
Use with awareness	`playwright-generate-test` and `webapp-testing`: need execution permissions, but are correctly scoped
Use with awareness	`pytest-coverage`: verify that `pytest-cov` is in the project's dependencies before use
General	All skills should be combined with the project's own security hooks to catch unexpected behavior

Methodology

Phase 1: Deterministic deep-scan — 10 Node.js scanners (unicode, entropy, permission, dep-audit, taint, git-forensics, network, memory-poisoning, supply-chain-recheck, toxic-flow)
Phase 2: LLM-based skill analysis — 7 threat categories (prompt injection, data exfiltration, privilege escalation, scope creep, hidden instructions, toolchain manipulation, persistence)
Frameworks: OWASP LLM Top 10 (2025), OWASP Agentic AI Top 10 (ASI), OWASP Skills Top 10 (AST)
Models: scan-orchestrator.mjs (deterministic), skill-scanner-agent (claude-sonnet-4-6)

Generated by llm-security v4.5.1
