# IDE Extension Threat Patterns
Detection categories used by `scanners/ide-extension-scanner.mjs` (prefix `IDE`).
Based on Koi Security / ExtensionTotal research 2024-2026 and VS Code / JetBrains official documentation.
Research brief: `/Users/ktg/.claude/plans/research-ide-extension-prescan.md`.
## Scope
VS Code + forks (Cursor, Windsurf, VSCodium, code-server, Insiders, Remote-SSH) and
JetBrains/IntelliJ plugins (IntelliJ IDEA, PyCharm, WebStorm, GoLand, Rider, CLion,
PhpStorm, RubyMine, DataGrip, DataSpell, RustRover, Aqua, Gateway, and Android Studio).
JetBrains discovery shipped in v6.6.0.
## 1. Blocklist Match (CRITICAL)
**Signal:** Extension ID (lowercased `publisher.name`) matches an entry in the `blocklist` array of `knowledge/top-vscode-extensions.json`.
**Case:** TigerJack (11 malicious extensions, 17K+ installs). WhiteCobra (24 extensions, ~$500K crypto theft). VS Code Cryptojacking Campaign ("Mark H" impersonator, 1M+ installs). Known-malicious IDs are CRITICAL.
**Format:** `publisher.name@version` or `publisher.name@*` for any version.
**OWASP:** LLM03 (Supply Chain), ASI04.
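A minimal sketch of how the entry format above could be matched. The function name `matchesBlocklist` and the sample IDs are hypothetical and do not come from the scanner source; the sketch assumes the blocklist has already been loaded as an array of strings.

```javascript
// Hypothetical helper (not the scanner's actual code): returns true when an
// installed extension matches a blocklist entry of the form
// "publisher.name@version" or "publisher.name@*".
function matchesBlocklist(extensionId, version, blocklist) {
  const id = extensionId.toLowerCase();
  return blocklist.some((entry) => {
    const at = entry.lastIndexOf("@");
    if (at === -1) return false; // malformed entry: ignored in this sketch
    const entryId = entry.slice(0, at).toLowerCase();
    const entryVersion = entry.slice(at + 1);
    return entryId === id && (entryVersion === "*" || entryVersion === version);
  });
}
```

For example, with the made-up list `["evil.theme@*", "bad.tool@1.2.3"]`, any version of `evil.theme` matches, while `bad.tool` matches only at version `1.2.3`.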
## 2. Theme-with-Code (HIGH)
**Signal:** `package.json` `categories` includes `"Themes"` AND (`main` is truthy OR `activationEvents` non-empty).
**Case:** "A Wolf in Dark Mode", the Material Theme malware: a popular theme that hid malware beneath its color scheme. Pure themes require zero runtime code, so any `main`/`activationEvents` on a theme is a strong red flag.
**OWASP:** LLM06 (Excessive Agency), ASI02.
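The signal can be sketched as a predicate over a parsed `package.json`. The name `isThemeWithCode` is an assumption made for illustration; the real scanner may structure this check differently.

```javascript
// Hypothetical predicate: a manifest that declares the "Themes" category
// but also ships an entry point or activation events.
function isThemeWithCode(manifest) {
  const categories = Array.isArray(manifest.categories) ? manifest.categories : [];
  const events = Array.isArray(manifest.activationEvents) ? manifest.activationEvents : [];
  const isTheme = categories.includes("Themes");
  const hasCode = Boolean(manifest.main) || events.length > 0;
  return isTheme && hasCode;
}
```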
## 3. Sideload Signal (HIGH unsigned, MEDIUM signed)
**Signal:** `extensions.json` entry has `metadata.source === "vsix"` (i.e. installed from file, not Marketplace).
**Rationale:** Marketplace signature verification and malware scanning are bypassed for `.vsix`-file installs. Legitimate use cases exist (private extensions, dev testing), but the malware ratio in observed incidents is high.
**Modifier:** If `.signature.p7s` file present in extension root → downgrade to MEDIUM (possibly Marketplace-downloaded .vsix).
**OWASP:** LLM03.
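As an illustration, the signal and its modifier could be combined as follows. `sideloadSeverity` is a hypothetical name, and whether `.signature.p7s` exists is assumed to have been checked by the caller.

```javascript
// Hypothetical helper: `entry` is one element of the parsed extensions.json,
// `hasSignatureFile` says whether .signature.p7s exists in the extension root.
function sideloadSeverity(entry, hasSignatureFile) {
  const source = entry && entry.metadata ? entry.metadata.source : undefined;
  if (source !== "vsix") return null; // not sideloaded: no finding
  return hasSignatureFile ? "MEDIUM" : "HIGH";
}
```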
## 4. Broad Activation Surface (MEDIUM / LOW)
**Signal:** `package.json` `activationEvents` includes `"*"` (MEDIUM) or `"onStartupFinished"` (LOW).
**Rationale:** "Wants to run always" is a strong capability signal — necessary for a few legitimate tools (shell integrators, system monitors) but unusual for most extensions. Exemption: exact-match against top-100 list.
**Note:** VS Code 1.74+ no longer requires `activationEvents` for declarative `contributes` — absence of events is NOT suspicious.
**OWASP:** LLM06.
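A sketch of the severity mapping, assuming the top-100 list is available as a `Set` of lowercased IDs and that the exemption applies to both severities (the text does not say which). The function name is hypothetical.

```javascript
// Hypothetical helper: maps activationEvents to a severity, skipping
// extensions whose ID is an exact match in the top-100 set.
function activationSeverity(manifest, extensionId, top100Ids) {
  if (top100Ids.has(extensionId.toLowerCase())) return null; // exemption (assumed to cover both levels)
  const events = Array.isArray(manifest.activationEvents) ? manifest.activationEvents : [];
  if (events.includes("*")) return "MEDIUM";
  if (events.includes("onStartupFinished")) return "LOW";
  return null; // absent or narrow events are not suspicious
}
```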
## 5. Typosquat (HIGH / MEDIUM)
**Signal:** Extension ID has Levenshtein distance ≤ 2 from a top-100 extension ID, excluding exact match.
- Distance 1 → HIGH
- Distance 2 AND target is in top-50 → MEDIUM
**Case:** TigerJack aliases `ab-498`, `498`, `498-00` targeting popular AI / utility extensions. Publisher impersonation (e.g. `ms-pythom.pythom` vs `ms-python.python`). AI-assistant typosquats (`claude-code`, `codeium`, `cody`).
**OWASP:** LLM03.
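The distance rule can be sketched with a standard dynamic-programming Levenshtein implementation. The sketch assumes `topIds` is an array of lowercased IDs ordered by popularity (index 0 is the most popular), which is how "top-50" is approximated here; both function names are hypothetical.

```javascript
// Classic two-row Levenshtein edit distance.
function levenshtein(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical helper: distance 1 is HIGH; distance 2 is MEDIUM only when
// the target ranks in the first 50 entries. Exact matches are excluded.
function typosquatSeverity(extensionId, topIds) {
  const id = extensionId.toLowerCase();
  if (topIds.includes(id)) return null;
  let severity = null;
  for (let rank = 0; rank < topIds.length; rank++) {
    const d = levenshtein(id, topIds[rank]);
    if (d === 1) return "HIGH";
    if (d === 2 && rank < 50) severity = "MEDIUM";
  }
  return severity;
}
```

With `ms-python.python` in the list, the impersonator `ms-pythom.pythom` from the case above is two substitutions away and would be reported as MEDIUM under these assumptions.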
## 6. Extension Pack Expansion (MEDIUM)
**Signal:** `package.json` `extensionPack` array contains ≥ 3 bundled extension IDs.
**Rationale:** Extension packs amplify trust chain — installing one extension installs N others, each of which brings its own risk surface.
**OWASP:** LLM03.
## 7. Dangerous Uninstall Hook (HIGH / LOW)
**Signal:** `package.json` `scripts["vscode:uninstall"]` exists AND references one of: `child_process`, `curl`, `wget`, `rm`, `powershell`, `iex`, `Invoke-Expression`, `Start-Process`.
**Rationale:** Uninstall scripts are a persistence hook: an attacker can defer a destructive payload until the user attempts to uninstall. VS Code runs these scripts with the user's privileges.
**OWASP:** LLM06, ASI02.
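A rough sketch of the token check. The token list is taken from the signal above; the boundary rule (a token must not be surrounded by letters, digits, `_`, or `-`) and the case-insensitive match are assumptions added here so that `rm` does not fire inside words such as `format`.

```javascript
const DANGEROUS_TOKENS = [
  "child_process", "curl", "wget", "rm",
  "powershell", "iex", "Invoke-Expression", "Start-Process",
];

// Hypothetical helper: true when scripts["vscode:uninstall"] references
// one of the listed tokens as a standalone word.
function hasDangerousUninstallHook(manifest) {
  const script = manifest.scripts && manifest.scripts["vscode:uninstall"];
  if (typeof script !== "string") return false;
  return DANGEROUS_TOKENS.some((token) => {
    const escaped = token.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    const pattern = new RegExp(`(^|[^A-Za-z0-9_-])${escaped}($|[^A-Za-z0-9_-])`, "i");
    return pattern.test(script);
  });
}
```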
## 8. Data Exfiltration Patterns (delegated)
Detected by reused scanners on extension bundled source:
- **Hardcoded webhooks** (Discord, Pipedream, webhook.site, Burp Collaborator, interactsh) → detected by NET scanner
- **Base64-encoded C2 domains** → detected by ENT scanner
- **Unicode Tag steganography** (GlassWorm pattern) → detected by UNI scanner
- **Env var exfiltration** (`process.env.HOME`, SSH keys, `.aws/credentials`, `.env`) → detected by TNT scanner
- **Clipboard / screen capture misuse** → detected by NET + TNT via API surface
**Cases:** GlassWorm (Unicode steganography + blockchain C2), MaliciousCorgi (AI-assistant data leaks), VS Code Cryptojacking (PowerShell download-and-execute), screen-capture malware ("Bitcoin Black", "Codo AI").
**OWASP:** LLM01 (Prompt Injection), LLM02 (Sensitive Disclosure), LLM03.
## 9. Nested npm Supply Chain (delegated)
Detected by SCR scanner on extension's bundled `package-lock.json` or flat `package.json` dependencies.
**Rationale:** A typical VS Code extension with `main` bundles 50 to 500+ transitive npm deps. The VS Code Marketplace malware scan does NOT inspect nested deps. Compromised npm packages (event-stream, rc, nx, ua-parser-js, lottie-player) flow into extensions automatically at build time.
**OWASP:** LLM03, ASI04.
## 10. Memory Poisoning via README / CHANGELOG (delegated)
Detected by MEM scanner on extension `README.md` and `CHANGELOG.md`.
**Rationale:** The extension README is displayed in VS Code when the user inspects extension details. Prompt-injection payloads in the README can poison co-located LLM assistants (Copilot, Claude Code) if the user asks about the extension.
**OWASP:** LLM01.
## 11. JetBrains Plugin Format (informational)
**Layout:** JetBrains plugins distribute as a ZIP or JAR. Installed plugins on disk
are already extracted by the IDE (directory form). A sideloaded URL download is a
single ZIP with layout `<artifact>/lib/<main>.jar + lib/<dep>.jar`. The authoritative
manifest `META-INF/plugin.xml` lives **inside the main JAR in `lib/`**, not at the
ZIP root. `META-INF/MANIFEST.MF` lives in each individual JAR.
Scanner strategy: walk `lib/*.jar`, open each as a nested ZIP, read `plugin.xml`
from the first JAR that contains one, then parse `MANIFEST.MF` from every JAR for
`Premain-Class` and coordinates (`Implementation-Title`, `Bundle-SymbolicName`).
**Source:** https://plugins.jetbrains.com/docs/intellij/plugin-content.html.
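The `MANIFEST.MF` parsing step can be sketched without any dependencies. The sketch assumes the manifest text has already been read out of the JAR (opening the nested ZIP needs an archive library and is not shown) and reads only the main section, which ends at the first blank line. `parseManifest` is a hypothetical name.

```javascript
// Hypothetical helper: parses the main section of a JAR manifest into a
// plain object. Continuation lines start with a single space and are
// appended to the previous header's value.
function parseManifest(text) {
  const headers = {};
  let lastKey = null;
  for (const line of text.split(/\r\n|\r|\n/)) {
    if (line === "") break; // main section ends at the first blank line
    if (line.startsWith(" ") && lastKey !== null) {
      headers[lastKey] += line.slice(1);
      continue;
    }
    const idx = line.indexOf(": ");
    if (idx === -1) continue; // not a header line: skipped in this sketch
    lastKey = line.slice(0, idx);
    headers[lastKey] = line.slice(idx + 2);
  }
  return headers;
}
```

The returned object can then be inspected for `Premain-Class`, `Implementation-Title`, and `Bundle-SymbolicName`.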
## 12. JetBrains Broad Activation (HIGH / MEDIUM)
**Signals (ranked):**
- **HIGH:** `<application-components>` present (legacy, loads at IDE start, blocks
dynamic reload) OR an `AppLifecycleListener` registered via
`<applicationListener topic="...AppLifecycleListener"/>` with an `appStarted`
handler. Equivalent to "run code at IDE startup."
- **MEDIUM:** `<postStartupActivity>` or `<backgroundPostStartupActivity>` — runs
once shortly after project open. Common in legitimate plugins but still a
capability signal.
- **MEDIUM:** `applicationService` with `preload="true"` — forces early
instantiation at IDE load.
**Case:** CVE-2024-37051 (JetBrains GitHub integration, June 2024) exfiltrated
GitHub access tokens via malicious pull request content — required no user
interaction once opened, abusing startup-time hooks.
**OWASP:** LLM06 (Excessive Agency), ASI02.
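The ranked signals above could be approximated with pattern matching over the raw `plugin.xml` text. This is a deliberately simple sketch: it does not strip XML comments, assumes double-quoted attributes, and cannot confirm that a listener actually implements `appStarted`. A real implementation would more likely use an XML parser. The function name is hypothetical.

```javascript
// Hypothetical helper: returns "HIGH", "MEDIUM", or null for plugin.xml text.
function jetbrainsActivationSeverity(pluginXml) {
  const high = [
    /<application-components[\s>]/,
    /<applicationListener\b[^>]*\btopic="[^"]*AppLifecycleListener"/,
  ];
  const medium = [
    /<(?:postStartupActivity|backgroundPostStartupActivity)\b/,
    /<applicationService\b[^>]*\bpreload="true"/,
  ];
  if (high.some((re) => re.test(pluginXml))) return "HIGH";
  if (medium.some((re) => re.test(pluginXml))) return "MEDIUM";
  return null;
}
```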
## 13. Theme-with-Code (JetBrains) (HIGH)
**Signal:** `plugin.xml` declares `<themeProvider>` AND any of:
`applicationService`, `projectService`, `action`, `applicationListener`,
`projectListener`, `postStartupActivity`, `<application-components>`.
**Rationale:** A pure JetBrains theme (LAF — look-and-feel) needs only a
`themeProvider` + a `.theme.json` resource. Bundling services/actions/listeners on
a theme mirrors the VS Code "A Wolf in Dark Mode" pattern and is a strong red flag.
**OWASP:** LLM06, ASI02.
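A comparable text-level sketch for the JetBrains theme check. The element list is taken from the signal above; matching on raw text rather than a parsed XML tree is a simplification, and the function name is hypothetical.

```javascript
const CODE_ELEMENTS = [
  "applicationService", "projectService", "action", "applicationListener",
  "projectListener", "postStartupActivity", "application-components",
];

// Hypothetical helper: true when plugin.xml declares a themeProvider together
// with any code-bearing element. The trailing character class keeps "<action"
// from matching the "<actions>" container element.
function isJetBrainsThemeWithCode(pluginXml) {
  if (!/<themeProvider[\s/>]/.test(pluginXml)) return false;
  return CODE_ELEMENTS.some((name) => new RegExp(`<${name}[\\s/>]`).test(pluginXml));
}
```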
## 14. Java Agent — Premain-Class (HIGH)
**Signal:** Any JAR in `lib/` has `Premain-Class: <fqcn>` in `META-INF/MANIFEST.MF`.
**Rationale:** `Premain-Class` registers a Java agent, giving bytecode-instrumentation
authority over the IDE JVM (hook every class load, rewrite methods, intercept
reflection). No legitimate third-party IntelliJ plugin needs this. If present
together with `Can-Redefine-Classes: true` or `Can-Retransform-Classes: true`,
severity is CRITICAL.
**Reference:** Log4Shell 2021 retrospective and subsequent JVM attacks highlight
`Premain-Class` as a persistent instrumentation vector.
**OWASP:** LLM06, ASI04.
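The escalation rule can be sketched over an already-parsed manifest, represented here as a plain object of header names to values. The function name is hypothetical, and header names are matched case-sensitively for simplicity.

```javascript
// Hypothetical helper: HIGH for any Premain-Class, CRITICAL when the agent
// may also redefine or retransform classes.
function javaAgentSeverity(headers) {
  if (!headers["Premain-Class"]) return null;
  const isTrue = (v) => typeof v === "string" && v.trim().toLowerCase() === "true";
  const canRewrite =
    isTrue(headers["Can-Redefine-Classes"]) || isTrue(headers["Can-Retransform-Classes"]);
  return canRewrite ? "CRITICAL" : "HIGH";
}
```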
## 15. Native Binary Bundling (MEDIUM / HIGH)
**Signal:** `.dll`, `.so`, `.dylib`, `.exe` file inside any JAR in `lib/` or in
the plugin directory tree.
**Rationale:** Bundled native binaries escape JVM sandboxing and cannot be audited
by JVM-level scanners. Legitimate uses exist (native filesystem watchers, DB
drivers) but are rare — most plugins should be pure JVM bytecode. Severity is
MEDIUM by default, HIGH when combined with Java-agent signal (#14) or broad
activation (#12).
**Case:** OX Security 2025 research on JetBrains Marketplace demonstrated that
signed plugins can still bundle arbitrary native payloads — the verified badge
attests publisher identity, not plugin safety.
**OWASP:** LLM03, ASI04.
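A sketch of the severity combination, assuming the caller supplies a flat list of file paths (from the plugin directory and from inside each JAR) plus the results of checks #12 and #14. The names are hypothetical, and versioned shared objects such as `libfoo.so.1` are not matched by this simple suffix test.

```javascript
const NATIVE_EXTENSIONS = [".dll", ".so", ".dylib", ".exe"];

// Hypothetical helper: MEDIUM when any native binary is bundled, HIGH when a
// Java-agent or broad-activation signal is also present.
function nativeBinarySeverity(paths, { hasJavaAgent = false, hasBroadActivation = false } = {}) {
  const hasNative = paths.some((p) =>
    NATIVE_EXTENSIONS.some((ext) => p.toLowerCase().endsWith(ext))
  );
  if (!hasNative) return null;
  return hasJavaAgent || hasBroadActivation ? "HIGH" : "MEDIUM";
}
```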
## 16. Legacy `<application-components>` (MEDIUM advisory)
**Signal:** `plugin.xml` uses the deprecated `<application-components>`,
`<project-components>`, or `<module-components>` elements instead of modern
`<applicationService>` / `<extensions defaultExtensionNs="com.intellij">`.
**Rationale:** Deprecated since 2020. Plugins that use components cannot be
dynamically loaded/unloaded and force a restart on install, bypassing IDE-managed
hot-reload safety. Often found together with other legacy red flags.
**OWASP:** LLM06.
## 17. Shaded/Uncoordinated JARs (MEDIUM)
**Signal:** JAR in `lib/` has no recognisable coordinates (`Implementation-Title`,
`Bundle-SymbolicName`, `Implementation-Version` absent from `MANIFEST.MF`) OR
class files appear under shaded package prefixes (`com.company.shaded.*`,
`plugin.relocated.*`).
**Rationale:** Uncoordinated or shaded JARs cannot be mapped to an OSV or Maven
Central entry, so transitive-dependency auditing is impossible. YouTrack
IJPL-212393 confirms JetBrains cannot reliably identify shaded library content
either, so the signature-warning UI sometimes emits no warning at all.
**OWASP:** LLM03, ASI04.
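One way to sketch this check, given parsed manifest headers and the list of entry paths inside a JAR. Treating any package segment named `shaded` or `relocated` as a shaded prefix is an assumption that generalises the two example prefixes above; the function name is hypothetical.

```javascript
const COORDINATE_HEADERS = ["Implementation-Title", "Bundle-SymbolicName", "Implementation-Version"];
const SHADED_SEGMENTS = ["shaded", "relocated"]; // assumed generalisation of the examples

// Hypothetical helper: true when the JAR has no coordinate headers at all,
// or when any class file lives under a shaded/relocated package.
function isUncoordinatedOrShaded(headers, entryPaths) {
  const hasCoordinates = COORDINATE_HEADERS.some((h) => Boolean(headers[h]));
  const hasShadedClasses = entryPaths.some(
    (p) =>
      p.endsWith(".class") &&
      p.split("/").slice(0, -1).some((segment) => SHADED_SEGMENTS.includes(segment))
  );
  return !hasCoordinates || hasShadedClasses;
}
```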
## Known Limitations
- No runtime bytecode analysis — JARs are inspected as ZIPs and via MANIFEST.MF
only. Method-level instrumentation detection is out of scope.
- No VSIX extraction (pass extracted directory instead)
- No Marketplace API lookups without `--online` flag (publisher age, download count, verified status unavailable offline)
- Profile-specific extension filtering not implemented (all installed extensions are scanned)
- `.obsolete` file parsing not implemented (extensions marked obsolete are still scanned — harmless but redundant)
- Real-time IDE hooks are out of scope (separate repo, planned)
## References
- Koi Security blog — https://koi.security/blog (GlassWorm, WhiteCobra, TigerJack, Material Theme, Cryptojacking, MaliciousCorgi, Screen-capture, Marketplace Takeover)
- VS Code Extension Runtime Security — https://code.visualstudio.com/docs/configure/extensions/extension-runtime-security
- VS Code Extension Manifest — https://code.visualstudio.com/api/references/extension-manifest
- ExtensionTotal — https://extensiontotal.com (closed-source, compatible reference)
- OSV schema — confirms no `VSCodeMarketplace` ecosystem (verified 2026-04-17)
- JetBrains plugin-content reference — https://plugins.jetbrains.com/docs/intellij/plugin-content.html
- JetBrains plugin-configuration-file — https://plugins.jetbrains.com/docs/intellij/plugin-configuration-file.html
- CVE-2024-37051 — JetBrains GitHub plugin token exfiltration (2024)
- OX Security 2025 — JetBrains verified-badge bypass research
- Log4Shell and JVM instrumentation retrospective (2021-2023)
- YouTrack IJPL-212393 — JetBrains signature-warning inconsistency