# IDE Extension Threat Patterns

Detection categories used by `scanners/ide-extension-scanner.mjs` (prefix `IDE`).
Based on Koi Security / ExtensionTotal research 2024-2026 and VS Code / JetBrains official documentation.

Research brief: `/Users/ktg/.claude/plans/research-ide-extension-prescan.md`.

## Scope

VS Code + forks (Cursor, Windsurf, VSCodium, code-server, Insiders, Remote-SSH) and
JetBrains/IntelliJ plugins (IntelliJ IDEA, PyCharm, WebStorm, GoLand, Rider, CLion,
PhpStorm, RubyMine, DataGrip, DataSpell, RustRover, Aqua, Gateway, and Android Studio).
JetBrains discovery shipped in v6.6.0.

## 1. Blocklist Match (CRITICAL)

**Signal:** Extension ID (lowercased `publisher.name`) matches an entry in the `blocklist` array of `knowledge/top-vscode-extensions.json`.

**Case:** TigerJack (11 malicious extensions, 17K+ installs). WhiteCobra (24 extensions, ~$500K crypto theft). VS Code Cryptojacking Campaign ("Mark H" impersonator, 1M+ installs). Known-malicious IDs are CRITICAL.

**Format:** `publisher.name@version` or `publisher.name@*` for any version.
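A minimal sketch of how the two entry formats could be matched. The helper name `matchesBlocklist` and the sample IDs are hypothetical; the scanner's actual implementation may differ.

```javascript
// Hypothetical helper: checks an installed extension against blocklist entries
// written as `publisher.name@version` or `publisher.name@*`.
function matchesBlocklist(extensionId, version, blocklist) {
  const id = extensionId.toLowerCase();
  return blocklist.some((entry) => {
    const at = entry.lastIndexOf("@");
    if (at === -1) return false; // malformed entry, ignore it
    const entryId = entry.slice(0, at).toLowerCase();
    const entryVersion = entry.slice(at + 1);
    // `*` matches any version; otherwise the version must be identical.
    return entryId === id && (entryVersion === "*" || entryVersion === version);
  });
}

// Illustrative entries only, not real blocklist content.
const sampleBlocklist = ["evil.pub@*", "bad.ext@1.2.3"];
console.log(matchesBlocklist("Evil.Pub", "9.9.9", sampleBlocklist));
console.log(matchesBlocklist("bad.ext", "1.2.4", sampleBlocklist));
```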

**OWASP:** LLM03 (Supply Chain), ASI04.

## 2. Theme-with-Code (HIGH)

**Signal:** `package.json` `categories` includes `"Themes"` AND (`main` is truthy OR `activationEvents` non-empty).

**Case:** "A Wolf in Dark Mode", the Material Theme malware: a popular theme that hid malicious code behind its color scheme. Pure themes require zero runtime code, so any `main` or `activationEvents` on a theme is a strong red flag.
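The signal can be expressed as a small predicate over the parsed `package.json`. This is a sketch; the function name `isThemeWithCode` is an assumption and not taken from the scanner.

```javascript
// Hypothetical predicate over a parsed VS Code extension manifest.
function isThemeWithCode(manifest) {
  const categories = Array.isArray(manifest.categories) ? manifest.categories : [];
  const events = Array.isArray(manifest.activationEvents) ? manifest.activationEvents : [];
  const isTheme = categories.includes("Themes");
  // A theme that declares an entry point or activation events ships runtime code.
  return isTheme && (Boolean(manifest.main) || events.length > 0);
}
```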

**OWASP:** LLM06 (Excessive Agency), ASI02.

## 3. Sideload Signal (HIGH unsigned, MEDIUM signed)

**Signal:** `extensions.json` entry has `metadata.source === "vsix"` (i.e. installed from file, not Marketplace).

**Rationale:** Installing from a `.vsix` file bypasses Marketplace signature verification and malware scanning. Legitimate use cases exist (private extensions, dev testing), but sideloaded extensions show a high malware ratio in observed incidents.

**Modifier:** If `.signature.p7s` file present in extension root → downgrade to MEDIUM (possibly Marketplace-downloaded .vsix).
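A sketch of the signal plus modifier. It assumes the caller has already checked the extension root for `.signature.p7s` and passes the result as a boolean; the function name is hypothetical.

```javascript
// Hypothetical helper: `entry` is one parsed element of `extensions.json`,
// `hasSignatureFile` says whether `.signature.p7s` exists in the extension root.
function sideloadSeverity(entry, hasSignatureFile) {
  const source = entry && entry.metadata ? entry.metadata.source : undefined;
  if (source !== "vsix") return null; // not installed from a file, no finding
  return hasSignatureFile ? "MEDIUM" : "HIGH";
}
```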

**OWASP:** LLM03.

## 4. Broad Activation Surface (MEDIUM / LOW)

**Signal:** `package.json` `activationEvents` includes `"*"` (MEDIUM) or `"onStartupFinished"` (LOW).

**Rationale:** "Wants to run always" is a strong capability signal. It is necessary for a few legitimate tools (shell integrators, system monitors) but unusual for most extensions. Exemption: extensions whose ID exactly matches an entry in the top-100 list.

**Note:** VS Code 1.74+ no longer requires `activationEvents` for declarative `contributes` — absence of events is NOT suspicious.
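The severity mapping and the exemption might look like the sketch below. It assumes the top-100 list is available as a `Set` of lowercased extension IDs, which is an illustrative choice rather than the scanner's documented data structure.

```javascript
// Hypothetical helper; `topIds` is assumed to be a Set of lowercased IDs.
function activationSeverity(manifest, extensionId, topIds) {
  if (topIds.has(extensionId.toLowerCase())) return null; // exact-match exemption
  const events = Array.isArray(manifest.activationEvents) ? manifest.activationEvents : [];
  if (events.includes("*")) return "MEDIUM";
  if (events.includes("onStartupFinished")) return "LOW";
  return null; // no events, or only narrow ones: not suspicious
}
```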

**OWASP:** LLM06.

## 5. Typosquat (HIGH / MEDIUM)

**Signal:** Extension ID has Levenshtein distance ≤ 2 from a top-100 extension ID, excluding exact match.

- Distance 1 → HIGH
- Distance 2 AND target is in top-50 → MEDIUM
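The distance rule above can be sketched as follows. It assumes `topIds` is an array of lowercased IDs ordered by popularity, so the first 50 entries are the top-50; both function names are hypothetical.

```javascript
// Classic dynamic-programming Levenshtein distance, keeping one row at a time.
function levenshtein(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical helper: returns "HIGH", "MEDIUM", or null.
function typosquatSeverity(extensionId, topIds) {
  const id = extensionId.toLowerCase();
  if (topIds.includes(id)) return null; // exact match is the real extension
  let best = null;
  topIds.forEach((target, rank) => {
    const distance = levenshtein(id, target);
    if (distance === 1) best = "HIGH";
    else if (distance === 2 && rank < 50 && best === null) best = "MEDIUM";
  });
  return best;
}
```

With this sketch, `ms-pythom.pythom` is two substitutions away from `ms-python.python`, so it is reported only when the target ranks in the top-50.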

**Case:** TigerJack aliases `ab-498`, `498`, `498-00` targeting popular AI / utility extensions. Publisher impersonation (e.g. `ms-pythom.pythom` vs `ms-python.python`). Typosquats of AI-assistant extensions (`claude-code`, `codeium`, `cody`).

**OWASP:** LLM03.

## 6. Extension Pack Expansion (MEDIUM)

**Signal:** `package.json` `extensionPack` array contains ≥ 3 bundled extension IDs.

**Rationale:** Extension packs amplify the trust chain: installing one extension installs N others, each of which brings its own risk surface.

**OWASP:** LLM03.

## 7. Dangerous Uninstall Hook (HIGH / LOW)

**Signal:** `package.json` `scripts["vscode:uninstall"]` exists AND references one of: `child_process`, `curl`, `wget`, `rm`, `powershell`, `iex`, `Invoke-Expression`, `Start-Process`.

**Rationale:** Uninstall scripts are a persistence hook: an attacker can defer a destructive payload until the user attempts to uninstall the extension. VS Code runs these scripts with the user's privileges.
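A sketch of the token check. It matches tokens on word boundaries so that `rm` does not fire on words such as `confirm`; that boundary rule is an assumption of this example, and the scanner may match differently. The function returns only a boolean because the mapping to HIGH or LOW is not spelled out here.

```javascript
const DANGEROUS_TOKENS = [
  "child_process", "curl", "wget", "rm",
  "powershell", "iex", "Invoke-Expression", "Start-Process",
];

// Escape regex metacharacters, then require a non-word, non-hyphen boundary
// on both sides of the token. Matching is case-insensitive.
const escapedTokens = DANGEROUS_TOKENS.map((t) => t.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"));
const DANGEROUS_RE = new RegExp(`(?<![\\w-])(?:${escapedTokens.join("|")})(?![\\w-])`, "i");

// Hypothetical helper over a parsed package.json.
function hasDangerousUninstallHook(manifest) {
  const script = manifest.scripts && manifest.scripts["vscode:uninstall"];
  return typeof script === "string" && DANGEROUS_RE.test(script);
}
```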

**OWASP:** LLM06, ASI02.

## 8. Data Exfiltration Patterns (delegated)

Detected by running the existing scanners over the extension's bundled source:

- **Hardcoded webhooks** (Discord, Pipedream, webhook.site, Burp Collaborator, interactsh) → detected by NET scanner
- **Base64-encoded C2 domains** → detected by ENT scanner
- **Unicode Tag steganography** (GlassWorm pattern) → detected by UNI scanner
- **Env var exfiltration** (`process.env.HOME`, SSH keys, `.aws/credentials`, `.env`) → detected by TNT scanner
- **Clipboard / screen capture misuse** → detected by NET + TNT via API surface

**Cases:** GlassWorm (Unicode steganography + blockchain C2), MaliciousCorgi (AI-assistant data leaks), VS Code Cryptojacking (PowerShell download-and-execute), screen-capture malware ("Bitcoin Black", "Codo AI").

**OWASP:** LLM01 (Prompt Injection), LLM02 (Sensitive Disclosure), LLM03.

## 9. Nested npm Supply Chain (delegated)

Detected by the SCR scanner on the extension's bundled `package-lock.json` or flat `package.json` dependencies.

**Rationale:** A typical VS Code extension with `main` bundles 50–500+ transitive npm deps. The VS Code Marketplace malware scan does NOT inspect nested deps. Compromised npm packages (event-stream, rc, nx, ua-parser-js, lottie-player) flow into extensions automatically at build time.

**OWASP:** LLM03, ASI04.

## 10. Memory Poisoning via README / CHANGELOG (delegated)

Detected by the MEM scanner on the extension's `README.md` and `CHANGELOG.md`.

**Rationale:** The extension README is displayed in VS Code when the user inspects extension details. Prompt-injection payloads in the README can poison co-located LLM assistants (Copilot, Claude Code) if the user asks about the extension.

**OWASP:** LLM01.

## 11. JetBrains Plugin Format (informational)

**Layout:** JetBrains plugins distribute as a ZIP or JAR. Installed plugins on disk
are already extracted by the IDE (directory form). A sideloaded URL download is a
single ZIP with layout `<artifact>/lib/<main>.jar + lib/<dep>.jar`. The authoritative
manifest `META-INF/plugin.xml` lives **inside the main JAR in `lib/`**, not at the
ZIP root. `META-INF/MANIFEST.MF` lives in each individual JAR.

Scanner strategy: walk `lib/*.jar`, open each as a nested ZIP, read `plugin.xml`
from the first JAR that contains one, then parse `MANIFEST.MF` from every JAR for
`Premain-Class` and coordinates (`Implementation-Title`, `Bundle-SymbolicName`).

**Source:** https://plugins.jetbrains.com/docs/intellij/plugin-content.html.

## 12. JetBrains Broad Activation (HIGH / MEDIUM)

**Signals (ranked):**

- **HIGH:** `<application-components>` present (legacy, loads at IDE start, blocks
  dynamic reload) OR an `AppLifecycleListener` registered via
  `<applicationListener topic="...AppLifecycleListener"/>` with an `appStarted`
  handler. Equivalent to "run code at IDE startup."
- **MEDIUM:** `<postStartupActivity>` or `<backgroundPostStartupActivity>` — runs
  once shortly after project open. Common in legitimate plugins but still a
  capability signal.
- **MEDIUM:** `applicationService` with `preload="true"` — forces early
  instantiation at IDE load.
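The ranked signals above could be approximated with pattern checks on the raw `plugin.xml` text, as in the sketch below. This is a rough heuristic, not real XML parsing: it cannot tell whether a listener actually implements `appStarted`, and it accepts a `topic` attribute on any element. The function name is hypothetical.

```javascript
// Hypothetical heuristic over the text of plugin.xml.
function jetbrainsActivationSeverity(pluginXml) {
  const xml = pluginXml.replace(/<!--[\s\S]*?-->/g, ""); // ignore commented-out XML

  const high =
    /<application-components(?=[\s>\/])/.test(xml) ||
    /<[\w.-]+\b[^>]*\btopic="[^"]*AppLifecycleListener"/.test(xml);
  if (high) return "HIGH";

  const medium =
    /<(?:postStartupActivity|backgroundPostStartupActivity)(?=[\s>\/])/.test(xml) ||
    /<applicationService\b[^>]*\bpreload="true"/.test(xml);
  return medium ? "MEDIUM" : null;
}
```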

**Case:** CVE-2024-37051 (JetBrains GitHub integration, June 2024) exfiltrated
GitHub access tokens via malicious pull request content. It required no user
interaction once the pull request was opened, abusing startup-time hooks.

**OWASP:** LLM06 (Excessive Agency), ASI02.

## 13. Theme-with-Code (JetBrains) (HIGH)

**Signal:** `plugin.xml` declares `<themeProvider>` AND any of:
`applicationService`, `projectService`, `action`, `applicationListener`,
`projectListener`, `postStartupActivity`, `<application-components>`.

**Rationale:** A pure JetBrains theme (LAF — look-and-feel) needs only a
`themeProvider` + a `.theme.json` resource. Bundling services/actions/listeners on
a theme mirrors the VS Code "A Wolf in Dark Mode" pattern and is a strong red flag.
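A text-level sketch of this signal. It is a heuristic on the raw `plugin.xml` rather than a full XML parse, and it also accepts the plural container elements `applicationListeners` / `projectListeners`, which is an assumption of this example. The function name is hypothetical.

```javascript
// Elements that indicate runtime code alongside a theme declaration.
const CODE_ELEMENT_RE = new RegExp(
  "<(?:applicationService|projectService|action|applicationListeners?|" +
    "projectListeners?|postStartupActivity|application-components)(?=[\\s>/])"
);

// Hypothetical heuristic over the text of plugin.xml.
function isJetBrainsThemeWithCode(pluginXml) {
  const xml = pluginXml.replace(/<!--[\s\S]*?-->/g, ""); // ignore commented-out XML
  const declaresTheme = /<themeProvider(?=[\s>\/])/.test(xml);
  return declaresTheme && CODE_ELEMENT_RE.test(xml);
}
```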

**OWASP:** LLM06, ASI02.

## 14. Java Agent — Premain-Class (HIGH)

**Signal:** Any JAR in `lib/` has `Premain-Class: <fqcn>` in `META-INF/MANIFEST.MF`.

**Rationale:** `Premain-Class` registers a Java agent, giving bytecode-instrumentation
authority over the IDE JVM (hook every class load, rewrite methods, intercept
reflection). No legitimate third-party IntelliJ plugin needs this. If present
together with `Can-Redefine-Classes: true` or `Can-Retransform-Classes: true`,
severity is CRITICAL.
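A sketch of the manifest check, assuming the text of `META-INF/MANIFEST.MF` has already been read out of the JAR. It parses only the main section (up to the first blank line), joins continuation lines that start with a single space, and lowercases attribute names for lookup. Function names are hypothetical.

```javascript
// Parse the main section of a JAR manifest into { lowercasedName: value }.
function parseManifestMainSection(text) {
  const attrs = {};
  let lastKey = null;
  for (const line of text.split(/\r\n|\r|\n/)) {
    if (line === "") break; // a blank line ends the main section
    if (line.startsWith(" ") && lastKey !== null) {
      attrs[lastKey] += line.slice(1); // continuation of the previous value
      continue;
    }
    const sep = line.indexOf(": ");
    if (sep === -1) continue; // not a well-formed attribute line
    lastKey = line.slice(0, sep).toLowerCase();
    attrs[lastKey] = line.slice(sep + 2);
  }
  return attrs;
}

// Hypothetical helper: returns "CRITICAL", "HIGH", or null.
function javaAgentSeverity(manifestText) {
  const attrs = parseManifestMainSection(manifestText);
  if (!attrs["premain-class"]) return null;
  const isTrue = (key) => (attrs[key] || "").trim().toLowerCase() === "true";
  const canRewrite = isTrue("can-redefine-classes") || isTrue("can-retransform-classes");
  return canRewrite ? "CRITICAL" : "HIGH";
}
```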

**Reference:** Log4Shell 2021 retrospective and subsequent JVM attacks highlight
`Premain-Class` as a persistent instrumentation vector.

**OWASP:** LLM06, ASI04.

## 15. Native Binary Bundling (MEDIUM / HIGH)

**Signal:** `.dll`, `.so`, `.dylib`, `.exe` file inside any JAR in `lib/` or in
the plugin directory tree.

**Rationale:** Bundled native binaries escape JVM sandboxing and cannot be audited
by JVM-level scanners. Legitimate uses exist (native filesystem watchers, DB
drivers) but are rare — most plugins should be pure JVM bytecode. Severity is
MEDIUM by default, HIGH when combined with Java-agent signal (#14) or broad
activation (#12).
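The escalation rule can be sketched as below. It assumes the caller supplies a flat list of entry paths (from the JARs and the plugin directory) and the results of the #14 and #12 checks as booleans; names are hypothetical.

```javascript
const NATIVE_EXT_RE = /\.(?:dll|so|dylib|exe)$/i;

// Hypothetical helper: returns "HIGH", "MEDIUM", or null.
function nativeBinarySeverity(entryPaths, { hasJavaAgent = false, hasBroadActivation = false } = {}) {
  const natives = entryPaths.filter((p) => NATIVE_EXT_RE.test(p));
  if (natives.length === 0) return null;
  // Escalate when native code is combined with an agent or startup activation.
  return hasJavaAgent || hasBroadActivation ? "HIGH" : "MEDIUM";
}
```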

**Case:** OX Security 2025 research on JetBrains Marketplace demonstrated that
signed plugins can still bundle arbitrary native payloads — the verified badge
attests publisher identity, not plugin safety.

**OWASP:** LLM03, ASI04.

## 16. Legacy `<application-components>` (MEDIUM advisory)

**Signal:** `plugin.xml` uses the deprecated `<application-components>`,
`<project-components>`, or `<module-components>` elements instead of modern
`<applicationService>` / `<extensions defaultExtensionNs="com.intellij">`.

**Rationale:** Deprecated since 2020. Plugins that use components cannot be
dynamically loaded/unloaded and force a restart on install, bypassing IDE-managed
hot-reload safety. Often found together with other legacy red flags.

**OWASP:** LLM06.

## 17. Shaded/Uncoordinated JARs (MEDIUM)

**Signal:** JAR in `lib/` has no recognisable coordinates (`Implementation-Title`,
`Bundle-SymbolicName`, `Implementation-Version` absent from `MANIFEST.MF`) OR
class files appear under shaded package prefixes (`com.company.shaded.*`,
`plugin.relocated.*`).
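A sketch of both halves of the signal. It assumes `manifestAttrs` maps lowercased manifest attribute names to values and `classPaths` lists entry paths inside the JAR. Treating any `shaded` or `relocated` path segment as a shaded prefix generalises the two example prefixes above and is an assumption of this example.

```javascript
const COORDINATE_KEYS = ["implementation-title", "bundle-symbolicname", "implementation-version"];
const SHADED_SEGMENT_RE = /(?:^|\/)(?:shaded|relocated)\//;

// Hypothetical helper: true when the JAR cannot be mapped to known coordinates.
function isUncoordinatedOrShaded(manifestAttrs, classPaths) {
  const hasCoordinates = COORDINATE_KEYS.some((key) => Boolean(manifestAttrs[key]));
  const hasShadedClasses = classPaths.some(
    (p) => p.endsWith(".class") && SHADED_SEGMENT_RE.test(p)
  );
  return !hasCoordinates || hasShadedClasses;
}
```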

**Rationale:** Uncoordinated or shaded JARs cannot be mapped to an OSV or Maven
Central entry, so transitive-dependency auditing is impossible. YouTrack
IJPL-212393 confirms JetBrains cannot reliably identify shaded library content
either, so the signature-warning UI sometimes emits no warning at all.

**OWASP:** LLM03, ASI04.

## Known Limitations

- No runtime bytecode analysis — JARs are inspected as ZIPs and via MANIFEST.MF
  only. Method-level instrumentation detection is out of scope.
- No VSIX extraction (pass an already-extracted directory instead)
- No Marketplace API lookups without `--online` flag (publisher age, download count, verified status unavailable offline)
- Profile-specific extension filtering not implemented (all installed extensions are scanned)
- `.obsolete` file parsing not implemented (extensions marked obsolete are still scanned — harmless but redundant)
- Real-time IDE hooks are out of scope (separate repo, planned)

## References

- Koi Security blog — https://koi.security/blog (GlassWorm, WhiteCobra, TigerJack, Material Theme, Cryptojacking, MaliciousCorgi, Screen-capture, Marketplace Takeover)
- VS Code Extension Runtime Security — https://code.visualstudio.com/docs/configure/extensions/extension-runtime-security
- VS Code Extension Manifest — https://code.visualstudio.com/api/references/extension-manifest
- ExtensionTotal — https://extensiontotal.com (closed-source, compatible reference)
- OSV schema — confirms no `VSCodeMarketplace` ecosystem (verified 2026-04-17)
- JetBrains plugin-content reference — https://plugins.jetbrains.com/docs/intellij/plugin-content.html
- JetBrains plugin-configuration-file — https://plugins.jetbrains.com/docs/intellij/plugin-configuration-file.html
- CVE-2024-37051 — JetBrains GitHub plugin token exfiltration (2024)
- OX Security 2025 — JetBrains verified-badge bypass research
- Log4Shell and JVM instrumentation retrospective (2021–2023)
- YouTrack IJPL-212393 — JetBrains signature-warning inconsistency