feat(llm-security): sandboxed remote cloning v5.1.0

Harden git clone attack surface for remote scans with defense-in-depth:

Layer 1 (all platforms): 8 git config flags disable hooks, symlinks,
filter/smudge drivers, fsmonitor, local file protocol. 4 env vars isolate
from system/user git config and block interactive prompts.

Layer 2 (OS sandbox): macOS sandbox-exec and Linux bubblewrap (bwrap)
restrict file writes to only the specific temp directory. bwrap
probe-tests availability before use. Graceful fallback on Windows and
Ubuntu 24.04+ (git config hardening only).

Additional: post-clone 100MB size check, UUID-unique evidence filenames,
evidence file cleanup, cleanup guarantee in scan/plugin-audit commands.

32 new tests (1147 total).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-07 17:08:32 +02:00
commit 708c898754
parent 5c1ceaa567
11 changed files with 487 additions and 12 deletions
--- a/plugins/llm-security/CHANGELOG.md
+++ b/plugins/llm-security/CHANGELOG.md
@@ -4,6 +4,21 @@ All notable changes to the LLM Security Plugin are documented in this file.

 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).

+## [5.1.0] - 2026-04-07
+
+### Added
+- **Sandboxed remote cloning** — `git clone` for remote scans is now hardened with two defense layers:
+  1. Git config flags: `core.hooksPath=/dev/null`, `core.symlinks=false`, `core.fsmonitor=false`, all LFS filter drivers disabled, `protocol.file.allow=never`, `transfer.fsckObjects=true`. Environment: `GIT_CONFIG_NOSYSTEM=1`, `GIT_CONFIG_GLOBAL=/dev/null`, `GIT_ATTR_NOSYSTEM=1`, `GIT_TERMINAL_PROMPT=0`
+  2. OS-level filesystem sandbox: macOS `sandbox-exec` and Linux `bubblewrap` (bwrap) restrict file writes to the clone's temp directory. Even if `.gitattributes` filter drivers bypass git config, they cannot write outside the clone dir. bwrap is probe-tested before use, with graceful fallback where the probe fails (e.g. Ubuntu 24.04+, where AppArmor blocks it) and on Windows (git config flags only, WARN logged)
+- **Post-clone size check**: repos exceeding 100MB after clone are rejected and cleaned up
+- **UUID-unique evidence filenames**: `fs-utils.mjs tmppath` now appends an 8-character suffix taken from `crypto.randomUUID()` to each filename, so concurrent scans no longer share an evidence file path
+- **Evidence file cleanup**: `scan.md` and `plugin-audit.md` now clean up evidence files (content-extract, plugin-extract) after scanning
+- **Cleanup guarantee**: both `scan.md` and `plugin-audit.md` now carry an explicit cleanup guarantee, so the temp dir and evidence file are removed even if the scan fails or errors
+
+### Changed
+- `scanners/lib/git-clone.mjs` — complete rewrite of clone command with sandbox wrapping
+- `scanners/lib/fs-utils.mjs` — tmppath uses `crypto.randomUUID()` for unique names
+
 ## [5.0.0] - 2026-04-06

 ### Added
--- a/plugins/llm-security/CLAUDE.md
+++ b/plugins/llm-security/CLAUDE.md
@@ -1,6 +1,6 @@
-# LLM Security Plugin (v5.0.0)
+# LLM Security Plugin (v5.1.0)

-Security scanning, auditing, and threat modeling for Claude Code projects. 5 frameworks: OWASP LLM Top 10, Agentic AI Top 10 (ASI), Skills Top 10 (AST), MCP Top 10, AI Agent Traps (DeepMind). 1115 tests.
+Security scanning, auditing, and threat modeling for Claude Code projects. 5 frameworks: OWASP LLM Top 10, Agentic AI Top 10 (ASI), Skills Top 10 (AST), MCP Top 10, AI Agent Traps (DeepMind). 1147 tests.

 ## Commands

@@ -55,6 +55,14 @@ Security scanning, auditing, and threat modeling for Claude Code projects. 5 fra

 `scan` and `plugin-audit` accept GitHub URLs directly. The command clones to a temp dir via `scanners/lib/git-clone.mjs`, scans locally, then cleans up. Use `--branch <name>` for non-default branches.

+**Clone sandboxing (v5.1):** `git clone` can execute code via `.gitattributes` filter/smudge drivers, a known attack vector. Two layers of defense:
+1. **Git config flags (all platforms):** `core.hooksPath=/dev/null`, `core.symlinks=false`, `core.fsmonitor=false`, all LFS filter drivers disabled, `protocol.file.allow=never`, `transfer.fsckObjects=true`. Environment: `GIT_CONFIG_NOSYSTEM=1`, `GIT_CONFIG_GLOBAL=/dev/null`, `GIT_ATTR_NOSYSTEM=1`, `GIT_TERMINAL_PROMPT=0`.
+2. **OS sandbox:** macOS `sandbox-exec` or Linux `bubblewrap` (bwrap) restricts file writes to only the specific temp directory. Even if a filter driver bypasses git config, it cannot write outside the clone dir. Fallback on Windows or when neither sandbox is available: git config flags only, WARN logged.
+
+Platform matrix: macOS (`sandbox-exec`) — always works. Linux (`bwrap`) — works on Fedora/Arch, may fail on Ubuntu 24.04+ without admin AppArmor config. Windows — no OS sandbox, git config flags only.
+
+Post-clone: size check (100MB max), cleanup guarantee (temp dir + evidence file always removed, even on error).
+
 **Prompt injection defense:** Remote scans use `scanners/content-extractor.mjs` to pre-extract structured evidence and strip injection patterns BEFORE LLM agents see the content. Agents analyze a JSON evidence package, never raw files from untrusted repos.

 ## Scanners
--- a/plugins/llm-security/README.md
+++ b/plugins/llm-security/README.md
@@ -190,6 +190,24 @@ claude plugin add plugin-marketplace/llm-security

 **Injection-safe remote scanning (v2.5+):** Remote scans pre-extract structured evidence via `content-extractor.mjs` and strip injection patterns BEFORE LLM agents see the content. Agents analyze a JSON evidence package, never raw files from untrusted repos. `[INJECTION-PATTERN-STRIPPED]` markers are confirmed findings.

+**Sandboxed cloning (v5.1+):** `git clone` can execute arbitrary code via `.gitattributes` filter/smudge drivers. Remote clones are now hardened with defense-in-depth:
+
+**Layer 1 — Git config hardening (all platforms):** 8 config flags disable hooks (`core.hooksPath=/dev/null`), symlinks (`core.symlinks=false`), filter/smudge drivers (all LFS filters cleared), fsmonitor, and local file protocol. Environment variables isolate from system/user git config and block interactive prompts.
+
+**Layer 2 — OS-level filesystem sandbox (platform-dependent):**
+
+| Platform | Sandbox | Status |
+|----------|---------|--------|
+| macOS | `sandbox-exec` | Always available — restricts file writes to specific temp dir |
+| Linux | `bubblewrap` (bwrap) | Works on Fedora/Arch. May require admin AppArmor config on Ubuntu 24.04+ |
+| Windows | None | No practical zero-install CLI sandbox exists. Git config hardening only |
+
+When no OS sandbox is available, the plugin logs a warning and proceeds with git config hardening only. The sandbox is an additional defense layer: even without it, the git config flags are designed to neutralize the known `.gitattributes` attack vectors.
+
+**Additional protections:** Post-clone size check (100MB max), UUID-unique evidence filenames (prevents race conditions), cleanup guarantee (temp files removed even on error).
+
+**Windows guidance:** Windows has no equivalent to `sandbox-exec` or `bwrap` that ships with the OS. The most practical mitigation for Windows users is to run Claude Code itself inside a sandboxed environment (e.g., Windows Sandbox on Pro/Enterprise, Docker Desktop, or WSL2). The git config hardening layer provides baseline protection on all platforms.
+
 Output: structured report with ALLOW / WARNING / BLOCK verdict, risk score (0-100), and findings sorted by severity.

 ### Audit
@@ -594,8 +612,8 @@ llm-security/
 │   │   ├── skill-registry.mjs     #   Fingerprinting, caching, pattern search
 │   │   ├── file-discovery.mjs     #   Walk tree, filter, binary detect
 │   │   ├── yaml-frontmatter.mjs   #   Regex-based frontmatter parser
-│   │   ├── git-clone.mjs          #   Clone/cleanup remote repos to temp dirs
-│   │   └── fs-utils.mjs           #   Backup, restore, cleanup, tmppath utilities
+│   │   ├── git-clone.mjs          #   Sandboxed clone/cleanup (sandbox-exec/bwrap + git config hardening)
+│   │   └── fs-utils.mjs           #   Backup, restore, cleanup, tmppath (UUID-unique) utilities
 │   ├── unicode-scanner.mjs        #   Zero-width, Tags, BIDI, homoglyphs
 │   ├── entropy-scanner.mjs        #   Shannon entropy, base64/hex detection
 │   ├── permission-mapper.mjs      #   Plugin permission analysis
@@ -687,6 +705,7 @@ This plugin provides full-stack security hardening (static analysis + supply cha

 | Version | Date | Highlights |
 |---------|------|------------|
+| **5.1.0** | 2026-04-07 | **Sandboxed remote cloning.** Defense-in-depth for `git clone` attack surface: (1) 8 git config flags disable hooks, symlinks, filter/smudge drivers, fsmonitor, local file protocol; 4 env vars isolate from system/user config. (2) OS sandbox: macOS `sandbox-exec` + Linux `bubblewrap` restrict file writes to only the clone temp dir. Graceful fallback on Windows (git config only). Post-clone size check (100MB max). UUID-unique evidence filenames prevent race conditions. Cleanup guarantee in scan/plugin-audit commands. 1147 tests (was 1115). |
 | **5.0.0** | 2026-04-06 | **Prompt Injection Hardening (v5.0).** 8-session defense-in-depth overhaul driven by 7 research papers (2025-2026). MEDIUM advisory for obfuscation signals (leetspeak, homoglyphs, zero-width, multi-language). Unicode Tag steganography detection (U+E0000-E007F). Bash expansion normalization (`bash-normalize.mjs`). Rule of Two enforcement (configurable `LLM_SECURITY_TRIFECTA_MODE=block\|warn\|off`). 100-call long-horizon monitoring window with slow-burn trifecta detection. Behavioral drift via Jensen-Shannon divergence. HITL trap detection (approval urgency, summary suppression, scope minimization). Sub-agent delegation tracking (escalation-after-input advisory). NL indirection patterns. Hybrid attacks (P2SQL, recursive injection, XSS-in-agent). CaMeL-inspired data flow tagging (SHA-256 provenance, output-to-input linking). Adaptive red-team (5 mutation rounds per scenario: homoglyph, encoding, zero-width, case alternation, synonym). Knowledge base expanded: `prompt-injection-research-2025-2026.md`, `deepmind-agent-traps.md`, `attack-mutations.json`. Posture scanner expanded to 13 categories (+Prompt Injection Hardening, Rule of Two, Long-Horizon Monitoring). Defense Philosophy section documenting honest limitations. 1115 tests. |
 | **4.5.1** | 2026-04-04 | **Cross-platform support.** Windows/Linux compatibility: `fileURLToPath()`, `path.dirname()`, native `fetch()` replaces `curl` subprocess, fixed tilde expansion regex. 11 files, 782 tests pass. |
 | **4.5.0** | 2026-04-04 | **Attack simulation / red-team mode.** New `attack-simulator.mjs` runs 38 crafted attack scenarios across 7 categories (secrets, destructive, supply-chain, prompt-injection, pathguard, mcp-output, session-trifecta) against the plugin's own hooks. Data-driven via `knowledge/attack-scenarios.json` with runtime payload assembly. New `/security red-team` command with `--category` filter. Capstone release: v4.0 roadmap complete (S1-S6). 18 commands, 16 scanners (10 orchestrated + 6 standalone). 782 tests. |
--- a/plugins/llm-security/commands/plugin-audit.md
+++ b/plugins/llm-security/commands/plugin-audit.md
@@ -21,6 +21,13 @@ Audit a Claude Code plugin for security before installation. Accepts local paths
 - Else → `target = "."`, `clone_path = null`
 - Verify `.claude-plugin/plugin.json` exists at `<target>`. If not and `clone_path != null` → cleanup clone_path first, then tell user this is not a plugin directory and **STOP**. If not and local → tell user and **STOP**.

+## IMPORTANT: Cleanup Guarantee (remote audits)
+
+If `clone_path != null`, the following cleanup MUST run regardless of audit outcome.
+If ANY step between clone and cleanup fails or errors, STILL run cleanup before stopping:
+  1. `node <plugin-root>/scanners/lib/git-clone.mjs cleanup "<clone_path>"`
+  2. `node <plugin-root>/scanners/lib/fs-utils.mjs cleanup "<evidence_file>"` (if `evidence_file` is set)
+
 ## Step 1.5: Pre-extraction (remote audits only)

 If `clone_path != null`:
@@ -62,3 +69,6 @@ Verdict: **Install** (0 critical/high, transparent hooks) | **Review** (high fin
 If `clone_path != null`:
  Run: `node <plugin-root>/scanners/lib/git-clone.mjs cleanup "<clone_path>"`
  If cleanup fails → warn: "Could not remove temp dir <clone_path> — remove manually."
+
+If `evidence_file != null`:
+  Run: `node <plugin-root>/scanners/lib/fs-utils.mjs cleanup "<evidence_file>"`
--- a/plugins/llm-security/commands/scan.md
+++ b/plugins/llm-security/commands/scan.md
@@ -21,6 +21,13 @@ Scan target for security issues. Accepts local paths or GitHub URLs. Delegates t
    Set `remote_url = <url>` for display
 - Otherwise → `target = $ARGUMENTS`, `clone_path = null`

+## IMPORTANT: Cleanup Guarantee (remote scans)
+
+If `clone_path != null`, the following cleanup MUST run regardless of scan outcome.
+If ANY step between clone and cleanup fails or errors, STILL run cleanup before stopping:
+  1. `node <plugin-root>/scanners/lib/git-clone.mjs cleanup "<clone_path>"`
+  2. `node <plugin-root>/scanners/lib/fs-utils.mjs cleanup "<evidence_file>"` (if `evidence_file` is set)
+
 ## Step 1.5: Pre-extraction (remote scans only)

 If `clone_path != null` (target is a cloned remote repo):
@@ -145,3 +152,6 @@ Parse stdout aggregate JSON. Merge with LLM findings. Re-evaluate verdict. Outpu
 If `clone_path != null`:
  Run: `node <plugin-root>/scanners/lib/git-clone.mjs cleanup "<clone_path>"`
  If cleanup fails → warn: "Could not remove temp dir <clone_path> — remove manually."
+
+If `evidence_file != null`:
+  Run: `node <plugin-root>/scanners/lib/fs-utils.mjs cleanup "<evidence_file>"`
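
The cleanup guarantee that `scan.md` and `plugin-audit.md` state in prose has the same shape as a `try`/`finally` block. A minimal sketch, assuming a hypothetical `withTempResources` helper that is not part of the plugin; the directory prefix and file name are illustrative only:

```javascript
import { mkdtempSync, writeFileSync, rmSync, existsSync } from 'node:fs';
import { join } from 'node:path';
import { tmpdir } from 'node:os';

// Hypothetical helper (not in the plugin): runs `work` with a temp dir and an
// evidence file path, then removes both whether `work` returns or throws.
function withTempResources(work) {
  const clonePath = mkdtempSync(join(tmpdir(), 'llm-sec-sketch-'));
  const evidenceFile = join(tmpdir(), `evidence-sketch-${process.pid}-${Date.now()}.json`);
  try {
    return work({ clonePath, evidenceFile });
  } finally {
    // force: true makes rmSync a no-op when the path is already gone.
    rmSync(clonePath, { recursive: true, force: true });
    rmSync(evidenceFile, { force: true });
  }
}

// Usage: the simulated scan step fails, yet nothing is left behind.
let seen;
try {
  withTempResources((paths) => {
    seen = paths;
    writeFileSync(paths.evidenceFile, '{}');
    throw new Error('simulated scan failure');
  });
} catch (err) {
  console.log(`scan failed: ${err.message}`);
}
console.log(existsSync(seen.clonePath), existsSync(seen.evidenceFile)); // false false
```

The command files cannot rely on language-level `finally`, which is presumably why the guarantee is written out as an explicit instruction near the top of each command and repeated in the final step.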
--- a/plugins/llm-security/package.json
+++ b/plugins/llm-security/package.json
@@ -1,6 +1,6 @@
 {
  "name": "llm-security",
-  "version": "5.0.0",
+  "version": "5.1.0",
  "description": "Security scanning, auditing, and threat modeling for Claude Code projects",
  "type": "module",
  "engines": {
--- a/plugins/llm-security/scanners/dashboard-aggregator.mjs
+++ b/plugins/llm-security/scanners/dashboard-aggregator.mjs
@@ -19,7 +19,7 @@ import { scan } from './posture-scanner.mjs';
 // Constants
 // ---------------------------------------------------------------------------

-const VERSION = '5.0.0';
+const VERSION = '5.1.0';

 /** Cache location */
 const CACHE_DIR = join(homedir(), '.cache', 'llm-security');
--- a/plugins/llm-security/scanners/lib/fs-utils.mjs
+++ b/plugins/llm-security/scanners/lib/fs-utils.mjs
@@ -9,6 +9,7 @@
 import { cpSync, rmSync, renameSync, existsSync } from 'node:fs';
 import { join, basename } from 'node:path';
 import { tmpdir } from 'node:os';
+import { randomUUID } from 'node:crypto';

 const [,, command, ...args] = process.argv;

@@ -50,8 +51,12 @@ switch (command) {
  }

  case 'tmppath': {
-    const filename = args[0] || 'llm-security-temp.json';
-    process.stdout.write(join(tmpdir(), filename) + '\n');
+    const base = args[0] || 'llm-security-temp.json';
+    const dotIdx = base.lastIndexOf('.');
+    const name = dotIdx > 0 ? base.slice(0, dotIdx) : base;
+    const ext = dotIdx > 0 ? base.slice(dotIdx) : '.json';
+    const unique = `${name}-${randomUUID().slice(0, 8)}${ext}`;
+    process.stdout.write(join(tmpdir(), unique) + '\n');
    break;
  }
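
The naming logic in the new `tmppath` case is easier to reason about as a pure function. The sketch below restates it under a hypothetical name, `uniqueName` (the plugin inlines the logic in the `switch`), and runs it on a few edge cases: a multi-dot name, a name without an extension, and a dotfile.

```javascript
import { randomUUID } from 'node:crypto';

// Illustrative restatement of the `tmppath` naming logic; `uniqueName` is a
// hypothetical name that does not exist in fs-utils.mjs.
function uniqueName(base = 'llm-security-temp.json') {
  const dotIdx = base.lastIndexOf('.');
  // dotIdx > 0 treats a leading dot (".env") as part of the name, not an extension.
  const name = dotIdx > 0 ? base.slice(0, dotIdx) : base;
  const ext = dotIdx > 0 ? base.slice(dotIdx) : '.json';
  // First 8 hex characters of a v4 UUID, i.e. 32 random bits.
  return `${name}-${randomUUID().slice(0, 8)}${ext}`;
}

for (const base of ['content-extract.json', 'archive.tar.gz', 'noext', '.env']) {
  console.log(`${base} -> ${uniqueName(base)}`);
}
```

Because only 32 bits of the UUID are kept, a collision between a handful of concurrent scans is very unlikely but not impossible; keeping the full UUID would be one way to tighten that, at the cost of longer filenames.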

--- a/plugins/llm-security/scanners/lib/git-clone.mjs
+++ b/plugins/llm-security/scanners/lib/git-clone.mjs
@@ -1,17 +1,18 @@
 #!/usr/bin/env node
 // git-clone.mjs — Clone GitHub repos to temp dirs for security scanning
 // Usage:
-//   node git-clone.mjs clone <url> [--branch <name>]  → shallow clone, prints tmpdir path
+//   node git-clone.mjs clone <url> [--branch <name>]  → sandboxed shallow clone, prints tmpdir path
 //   node git-clone.mjs cleanup <dir>                  → removes temp directory
 //   node git-clone.mjs validate <url>                 → exits 0 if valid GitHub URL, 1 if not

-import { mkdtempSync, rmSync, existsSync } from 'node:fs';
+import { mkdtempSync, rmSync, existsSync, realpathSync } from 'node:fs';
 import { join } from 'node:path';
 import { tmpdir } from 'node:os';
 import { spawnSync } from 'node:child_process';

 const GITHUB_URL_RE = /^https:\/\/github\.com\/[\w.-]+\/[\w.-]+(\.git)?\/?$/;
 const GITHUB_SSH_RE = /^git@github\.com:[\w.-]+\/[\w.-]+(\.git)?$/;
+const MAX_CLONE_SIZE_MB = 100;

 function isValidUrl(url) {
  return GITHUB_URL_RE.test(url) || GITHUB_SSH_RE.test(url);
@@ -29,6 +30,109 @@ function parseArgs(argv) {
  return args;
 }

+/** Git config flags that neutralize known attack vectors */
+const GIT_SANDBOX_CONFIG = [
+  '-c', 'core.hooksPath=/dev/null',
+  '-c', 'core.symlinks=false',
+  '-c', 'core.fsmonitor=false',
+  '-c', 'filter.lfs.process=',
+  '-c', 'filter.lfs.smudge=',
+  '-c', 'filter.lfs.clean=',
+  '-c', 'protocol.file.allow=never',
+  '-c', 'transfer.fsckObjects=true',
+];
+
+/** Environment that isolates git from system/user config */
+const GIT_SANDBOX_ENV = {
+  ...process.env,
+  GIT_CONFIG_NOSYSTEM: '1',
+  GIT_CONFIG_GLOBAL: '/dev/null',
+  GIT_ATTR_NOSYSTEM: '1',
+  GIT_TERMINAL_PROMPT: '0',
+};
+
+/**
+ * Build sandbox-exec profile restricting file writes to a single directory.
+ * macOS only — returns null on other platforms.
+ */
+function buildSandboxProfile(allowedWritePath) {
+  if (process.platform !== 'darwin') return null;
+  const check = spawnSync('which', ['sandbox-exec'], { encoding: 'utf8' });
+  if (check.status !== 0) return null;
+
+  const realPath = realpathSync(allowedWritePath);
+  return [
+    '(version 1)',
+    '(allow default)',
+    '(deny file-write*)',
+    `(allow file-write* (subpath "${realPath}"))`,
+    '(allow file-write* (literal "/dev/null"))',
+    '(allow file-write* (literal "/dev/tty"))',
+  ].join('');
+}
+
+/**
+ * Build bwrap args restricting writes to a single directory.
+ * Linux only — returns null if bwrap is not installed or fails.
+ */
+function buildBwrapArgs(allowedWritePath, innerArgs) {
+  if (process.platform !== 'linux') return null;
+  const check = spawnSync('which', ['bwrap'], { encoding: 'utf8' });
+  if (check.status !== 0) return null;
+
+  // Test that bwrap actually works (fails on Ubuntu 24.04+ without admin config)
+  const probe = spawnSync('bwrap', ['--ro-bind', '/', '/', '--dev', '/dev', '/bin/true'], {
+    stdio: 'ignore', timeout: 5000,
+  });
+  if (probe.status !== 0) return null;
+
+  return [
+    '--ro-bind', '/', '/',           // read-only root
+    '--bind', allowedWritePath, allowedWritePath, // writable clone dir
+    '--dev', '/dev',                  // /dev/null etc.
+    '--unshare-all',                  // isolate namespaces
+    '--new-session',                  // prevent tty hijack
+    '--die-with-parent',             // cleanup on parent exit
+    ...innerArgs,
+  ];
+}
+
+/**
+ * Build the full sandboxed command + args for the current platform.
+ * Returns { cmd, args } — either wrapped in sandbox or plain git.
+ */
+function buildSandboxedClone(tmpDir, gitArgs) {
+  const innerGitArgs = [...GIT_SANDBOX_CONFIG, ...gitArgs];
+
+  // macOS: sandbox-exec
+  const profile = buildSandboxProfile(tmpDir);
+  if (profile) {
+    return { cmd: 'sandbox-exec', args: ['-p', profile, 'git', ...innerGitArgs], sandbox: 'sandbox-exec' };
+  }
+
+  // Linux: bwrap
+  const bwrapArgs = buildBwrapArgs(tmpDir, ['git', ...innerGitArgs]);
+  if (bwrapArgs) {
+    return { cmd: 'bwrap', args: bwrapArgs, sandbox: 'bwrap' };
+  }
+
+  // Fallback: git with config flags only
+  return { cmd: 'git', args: innerGitArgs, sandbox: null };
+}
+
+// Export for testing
+export {
+  GIT_SANDBOX_CONFIG, GIT_SANDBOX_ENV, buildSandboxProfile, buildBwrapArgs,
+  buildSandboxedClone, MAX_CLONE_SIZE_MB,
+};
+
+// CLI entry point — only run when invoked directly
+import { fileURLToPath } from 'node:url';
+const __filename = fileURLToPath(import.meta.url);
+const isDirectRun = process.argv[1] === __filename;
+
+if (isDirectRun) {
+
 const [,, command, ...rest] = process.argv;

 switch (command) {
@@ -52,9 +156,17 @@ switch (command) {
    if (branch) gitArgs.push('--branch', branch);
    gitArgs.push(url, tmpDir);

-    const result = spawnSync('git', gitArgs, {
+    // Build sandboxed clone command (macOS: sandbox-exec, Linux: bwrap, fallback: git only)
+    const { cmd: cloneCmd, args: cloneArgs, sandbox } = buildSandboxedClone(tmpDir, gitArgs);
+
+    if (!sandbox) {
+      console.error('clone: WARN: no OS sandbox available, running with git config hardening only');
+    }
+
+    const result = spawnSync(cloneCmd, cloneArgs, {
      stdio: ['ignore', 'pipe', 'pipe'],
      timeout: 60_000,
+      env: GIT_SANDBOX_ENV,
    });

    if (result.status !== 0) {
@@ -65,6 +177,17 @@ switch (command) {
      process.exit(1);
    }

+    // Post-clone size check
+    const duResult = spawnSync('du', ['-sm', tmpDir], { encoding: 'utf8' });
+    if (duResult.status === 0) {
+      const sizeMb = parseInt(duResult.stdout.split('\t')[0], 10);
+      if (sizeMb > MAX_CLONE_SIZE_MB) {
+        try { rmSync(tmpDir, { recursive: true, force: true }); } catch {}
+        console.error(`clone: repo too large (${sizeMb}MB, max ${MAX_CLONE_SIZE_MB}MB)`);
+        process.exit(1);
+      }
+    }
+
    process.stdout.write(tmpDir + '\n');
    break;
  }
@@ -100,3 +223,5 @@ switch (command) {
    console.error('Usage: node git-clone.mjs <clone|cleanup|validate> [args...]');
    process.exit(1);
 }
+
+} // end isDirectRun
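
The post-clone size check shells out to `du -sm` and only enforces the limit when `du` exits with status 0, so on a system without `du` on the PATH the check is silently skipped. One possible alternative is to measure the directory from Node itself. The sketch below assumes two hypothetical helpers, `dirSizeBytes` and `exceedsLimit`, which are not part of `git-clone.mjs`; it sums apparent file sizes, so its result can differ slightly from the disk-block figure `du` reports.

```javascript
import { mkdtempSync, mkdirSync, writeFileSync, readdirSync, lstatSync, rmSync } from 'node:fs';
import { join } from 'node:path';
import { tmpdir } from 'node:os';

const MAX_CLONE_SIZE_MB = 100; // same limit as git-clone.mjs

// Hypothetical helper: sums apparent file sizes under `dir` without following symlinks.
function dirSizeBytes(dir) {
  let total = 0;
  for (const entry of readdirSync(dir, { withFileTypes: true })) {
    const full = join(dir, entry.name);
    if (entry.isDirectory()) {
      total += dirSizeBytes(full);
    } else {
      total += lstatSync(full).size; // lstat counts a symlink itself, not its target
    }
  }
  return total;
}

// Hypothetical helper: true when the directory is larger than `maxMb` mebibytes.
function exceedsLimit(dir, maxMb = MAX_CLONE_SIZE_MB) {
  return dirSizeBytes(dir) > maxMb * 1024 * 1024;
}

// Usage with a throwaway directory holding 3 KiB of data.
const demo = mkdtempSync(join(tmpdir(), 'size-sketch-'));
mkdirSync(join(demo, 'sub'));
writeFileSync(join(demo, 'a.bin'), Buffer.alloc(1024));
writeFileSync(join(demo, 'sub', 'b.bin'), Buffer.alloc(2048));
const demoBytes = dirSizeBytes(demo);
const demoTooBig = exceedsLimit(demo);
rmSync(demo, { recursive: true, force: true });
console.log(demoBytes, demoTooBig); // 3072 false
```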
--- a/plugins/llm-security/scanners/posture-scanner.mjs
+++ b/plugins/llm-security/scanners/posture-scanner.mjs
@@ -20,7 +20,7 @@ import { finding, scannerResult, resetCounter } from './lib/output.mjs';
 // Constants
 // ---------------------------------------------------------------------------

-const VERSION = '5.0.0';
+const VERSION = '5.1.0';

 /** Minimum lines for a hook script to be considered non-stub */
 const NON_STUB_THRESHOLD = 5;
--- a/plugins/llm-security/tests/lib/git-clone-sandbox.test.mjs
+++ b/plugins/llm-security/tests/lib/git-clone-sandbox.test.mjs
@@ -0,0 +1,283 @@
+// git-clone-sandbox.test.mjs — Tests for sandboxed git clone + fs-utils tmppath
+// Zero external dependencies: node:test + node:assert only.
+
+import { describe, it } from 'node:test';
+import assert from 'node:assert/strict';
+import { spawnSync } from 'node:child_process';
+import { existsSync, rmSync, readFileSync, realpathSync } from 'node:fs';
+import { join } from 'node:path';
+import { tmpdir } from 'node:os';
+import { fileURLToPath } from 'node:url';
+
+const __dirname = fileURLToPath(new URL('.', import.meta.url));
+const LIB_DIR = join(__dirname, '..', '..', 'scanners', 'lib');
+const GIT_CLONE = join(LIB_DIR, 'git-clone.mjs');
+const FS_UTILS = join(LIB_DIR, 'fs-utils.mjs');
+
+// ---------------------------------------------------------------------------
+// Import sandbox exports for unit testing
+// ---------------------------------------------------------------------------
+
+const {
+  GIT_SANDBOX_CONFIG, GIT_SANDBOX_ENV, buildSandboxProfile, buildBwrapArgs,
+  buildSandboxedClone, MAX_CLONE_SIZE_MB,
+} = await import('../../scanners/lib/git-clone.mjs');
+
+// ---------------------------------------------------------------------------
+// GIT_SANDBOX_CONFIG
+// ---------------------------------------------------------------------------
+
+describe('GIT_SANDBOX_CONFIG', () => {
+  it('disables hooks', () => {
+    const idx = GIT_SANDBOX_CONFIG.indexOf('core.hooksPath=/dev/null');
+    assert.ok(idx > 0, 'core.hooksPath=/dev/null must be in config flags');
+  });
+
+  it('disables symlinks', () => {
+    assert.ok(GIT_SANDBOX_CONFIG.includes('core.symlinks=false'));
+  });
+
+  it('disables fsmonitor', () => {
+    assert.ok(GIT_SANDBOX_CONFIG.includes('core.fsmonitor=false'));
+  });
+
+  it('disables LFS filter drivers', () => {
+    assert.ok(GIT_SANDBOX_CONFIG.includes('filter.lfs.process='));
+    assert.ok(GIT_SANDBOX_CONFIG.includes('filter.lfs.smudge='));
+    assert.ok(GIT_SANDBOX_CONFIG.includes('filter.lfs.clean='));
+  });
+
+  it('blocks local file protocol', () => {
+    assert.ok(GIT_SANDBOX_CONFIG.includes('protocol.file.allow=never'));
+  });
+
+  it('enables fsck on transfer', () => {
+    assert.ok(GIT_SANDBOX_CONFIG.includes('transfer.fsckObjects=true'));
+  });
+
+  it('has 8 -c flag pairs (16 elements)', () => {
+    const cCount = GIT_SANDBOX_CONFIG.filter(f => f === '-c').length;
+    assert.equal(cCount, 8, 'Should have exactly 8 -c flags');
+  });
+});
+
+// ---------------------------------------------------------------------------
+// GIT_SANDBOX_ENV
+// ---------------------------------------------------------------------------
+
+describe('GIT_SANDBOX_ENV', () => {
+  it('sets GIT_CONFIG_NOSYSTEM', () => {
+    assert.equal(GIT_SANDBOX_ENV.GIT_CONFIG_NOSYSTEM, '1');
+  });
+
+  it('sets GIT_CONFIG_GLOBAL to /dev/null', () => {
+    assert.equal(GIT_SANDBOX_ENV.GIT_CONFIG_GLOBAL, '/dev/null');
+  });
+
+  it('sets GIT_ATTR_NOSYSTEM', () => {
+    assert.equal(GIT_SANDBOX_ENV.GIT_ATTR_NOSYSTEM, '1');
+  });
+
+  it('sets GIT_TERMINAL_PROMPT to 0', () => {
+    assert.equal(GIT_SANDBOX_ENV.GIT_TERMINAL_PROMPT, '0');
+  });
+
+  it('preserves existing PATH', () => {
+    assert.ok(GIT_SANDBOX_ENV.PATH, 'PATH must be preserved from process.env');
+  });
+});
+
+// ---------------------------------------------------------------------------
+// buildSandboxProfile
+// ---------------------------------------------------------------------------
+
+describe('buildSandboxProfile', () => {
+  it('returns a profile string on macOS', () => {
+    if (process.platform !== 'darwin') return;
+    // Use tmpdir() which always exists — realpathSync needs an existing path
+    const profile = buildSandboxProfile(tmpdir());
+    assert.ok(profile !== null, 'Should return a profile on macOS');
+    assert.ok(profile.startsWith('(version 1)'), 'Profile must start with version');
+    assert.ok(profile.includes('(deny file-write*)'), 'Must deny writes by default');
+  });
+
+  it('includes the resolved real path in the profile', () => {
+    if (process.platform !== 'darwin') return;
+    const realPath = realpathSync(tmpdir());
+    const profile = buildSandboxProfile(tmpdir());
+    assert.ok(profile.includes(realPath), `Profile must contain resolved path: ${realPath}`);
+  });
+
+  it('allows /dev/null and /dev/tty writes', () => {
+    if (process.platform !== 'darwin') return;
+    const profile = buildSandboxProfile(tmpdir());
+    assert.ok(profile.includes('/dev/null'), 'Must allow /dev/null');
+    assert.ok(profile.includes('/dev/tty'), 'Must allow /dev/tty');
+  });
+});
+
+// ---------------------------------------------------------------------------
+// buildBwrapArgs
+// ---------------------------------------------------------------------------
+
+describe('buildBwrapArgs', () => {
+  it('returns null on non-Linux platforms', () => {
+    if (process.platform === 'linux') return;
+    const result = buildBwrapArgs('/tmp/test', ['git', 'clone']);
+    assert.equal(result, null, 'Should return null on non-Linux');
+  });
+
+  it('on Linux: returns args array if bwrap is available', () => {
+    if (process.platform !== 'linux') return;
+    const check = spawnSync('which', ['bwrap'], { encoding: 'utf8' });
+    if (check.status !== 0) return; // bwrap not installed, skip
+    const result = buildBwrapArgs('/tmp/test-bwrap', ['git', 'clone']);
+    if (result === null) return; // bwrap installed but fails (Ubuntu 24.04+)
+    assert.ok(Array.isArray(result), 'Should return an array');
+    assert.ok(result.includes('--ro-bind'), 'Should include --ro-bind');
+    assert.ok(result.includes('--unshare-all'), 'Should include --unshare-all');
+    assert.ok(result.includes('/tmp/test-bwrap'), 'Should include the allowed write path');
+  });
+});
+
+// ---------------------------------------------------------------------------
+// buildSandboxedClone
+// ---------------------------------------------------------------------------
+
+describe('buildSandboxedClone', () => {
+  it('returns cmd, args, and sandbox properties', () => {
+    const result = buildSandboxedClone(tmpdir(), ['clone', '--depth', '1', 'url', tmpdir()]);
+    assert.ok(result.cmd, 'Must have cmd');
+    assert.ok(Array.isArray(result.args), 'args must be an array');
+    assert.ok('sandbox' in result, 'Must have sandbox property');
+  });
+
+  it('uses sandbox-exec on macOS', () => {
+    if (process.platform !== 'darwin') return;
+    const result = buildSandboxedClone(tmpdir(), ['clone', '--depth', '1', 'url', tmpdir()]);
+    assert.equal(result.sandbox, 'sandbox-exec');
+    assert.equal(result.cmd, 'sandbox-exec');
+  });
+
+  it('includes git config flags in args regardless of platform', () => {
+    const result = buildSandboxedClone(tmpdir(), ['clone', '--depth', '1', 'url', tmpdir()]);
+    const argsStr = result.args.join(' ');
+    assert.ok(argsStr.includes('core.hooksPath=/dev/null'), 'Must include hooksPath');
+    assert.ok(argsStr.includes('core.symlinks=false'), 'Must include symlinks=false');
+  });
+
+  it('falls back gracefully with sandbox=null when no OS sandbox', () => {
+    // This test verifies the structure — on macOS/Linux with sandbox available,
+    // it will have a sandbox. The key assertion is structural.
+    const result = buildSandboxedClone(tmpdir(), ['clone', 'url', tmpdir()]);
+    if (result.sandbox === null) {
+      assert.equal(result.cmd, 'git', 'Fallback must use git directly');
+    }
+  });
+});
+
+// ---------------------------------------------------------------------------
+// MAX_CLONE_SIZE_MB
+// ---------------------------------------------------------------------------
+
+describe('MAX_CLONE_SIZE_MB', () => {
+  it('is 100', () => {
+    assert.equal(MAX_CLONE_SIZE_MB, 100);
+  });
+});
+
+// ---------------------------------------------------------------------------
+// fs-utils tmppath uniqueness
+// ---------------------------------------------------------------------------
+
+describe('fs-utils tmppath', () => {
+  it('generates unique paths for the same base name', () => {
+    const paths = new Set();
+    for (let i = 0; i < 5; i++) {
+      const result = spawnSync('node', [FS_UTILS, 'tmppath', 'content-extract.json'], {
+        encoding: 'utf8',
+      });
+      assert.equal(result.status, 0, `tmppath should exit 0, got: ${result.stderr}`);
+      paths.add(result.stdout.trim());
+    }
+    assert.equal(paths.size, 5, 'All 5 paths should be unique');
+  });
+
+  it('preserves file extension', () => {
+    const result = spawnSync('node', [FS_UTILS, 'tmppath', 'test-file.json'], {
+      encoding: 'utf8',
+    });
+    assert.ok(result.stdout.trim().endsWith('.json'), 'Should preserve .json extension');
+  });
+
+  it('preserves base name prefix', () => {
+    const result = spawnSync('node', [FS_UTILS, 'tmppath', 'my-evidence.json'], {
+      encoding: 'utf8',
+    });
+    assert.ok(result.stdout.trim().includes('my-evidence-'), 'Should contain base name prefix');
+  });
+
+  it('paths are under tmpdir', () => {
+    const result = spawnSync('node', [FS_UTILS, 'tmppath', 'test.json'], {
+      encoding: 'utf8',
+    });
+    const path = result.stdout.trim();
+    assert.ok(path.startsWith(tmpdir()), `Path should be under tmpdir: ${path}`);
+  });
+});
+
+// ---------------------------------------------------------------------------
+// git-clone CLI: validate
+// ---------------------------------------------------------------------------
+
+describe('git-clone validate', () => {
+  it('accepts valid HTTPS GitHub URL', () => {
+    const result = spawnSync('node', [GIT_CLONE, 'validate', 'https://github.com/org/repo'], {
+      encoding: 'utf8',
+    });
+    assert.equal(result.status, 0);
+  });
+
+  it('accepts valid SSH GitHub URL', () => {
+    const result = spawnSync('node', [GIT_CLONE, 'validate', 'git@github.com:org/repo.git'], {
+      encoding: 'utf8',
+    });
+    assert.equal(result.status, 0);
+  });
+
+  it('rejects non-GitHub URL', () => {
+    const result = spawnSync('node', [GIT_CLONE, 'validate', 'https://evil.com/repo'], {
+      encoding: 'utf8',
+    });
+    assert.equal(result.status, 1);
+  });
+
+  it('rejects URL with tree path', () => {
+    const result = spawnSync('node', [GIT_CLONE, 'validate', 'https://github.com/org/repo/tree/main/dir'], {
+      encoding: 'utf8',
+    });
+    assert.equal(result.status, 1);
+  });
+});
+
+// ---------------------------------------------------------------------------
+// git-clone CLI: cleanup safety
+// ---------------------------------------------------------------------------
+
+describe('git-clone cleanup', () => {
+  it('refuses to remove paths outside tmpdir', () => {
+    const result = spawnSync('node', [GIT_CLONE, 'cleanup', '/home/user/important'], {
+      encoding: 'utf8',
+    });
+    assert.equal(result.status, 1);
+    assert.ok(result.stderr.includes('refusing to remove'));
+  });
+
+  it('handles non-existent tmpdir path gracefully', () => {
+    const fakePath = join(tmpdir(), 'llm-sec-nonexistent-test-' + Date.now());
+    const result = spawnSync('node', [GIT_CLONE, 'cleanup', fakePath], {
+      encoding: 'utf8',
+    });
+    assert.equal(result.status, 0, 'Should exit 0 for non-existent path in tmpdir');
+  });
+});