claude-code-complete-agent/security/openclaw-security-assessment.md
Kjell Tore Guttormsen 2fe6a78e3c docs(security): add OpenClaw vs Claude Code security assessment
Data-driven comparison covering 9 CVEs, 10 security categories,
and attack surface analysis. Based on published research from
SecurityScorecard, DigitalOcean, Sangfor, and OpenClaw official docs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-05 23:38:34 +02:00

OpenClaw vs Claude Code: Security Assessment

A data-driven comparison of security posture between OpenClaw (self-hosted AI agent) and Claude Code (managed AI agent). All CVE numbers, exposure statistics, and incident references are sourced from published security research (March-April 2026).

This is not a marketing document. Both platforms have strengths and gaps. The goal is to help you choose based on evidence.

The security landscape (March 2026)

OpenClaw reached 247K GitHub stars and became the fastest-growing open-source project in history. With adoption came scrutiny:

  • 9 CVEs in 4 days (March 18-21, 2026), CVSS scores up to 9.9
  • 40,214 internet-exposed instances (SecurityScorecard)
  • 35-63% of deployments vulnerable at time of analysis
  • 824 malicious skills found in ClawHub marketplace out of 10,700+
  • 128 pending CVE assignments in upstream tracker

Claude Code has no public CVEs to date. Its managed infrastructure model eliminates several attack surface categories entirely.

OpenClaw vulnerability summary

| CVE | CVSS | Category | Impact |
| --- | --- | --- | --- |
| CVE-2026-22172 | 9.9 | Auth bypass | Client self-declares admin scope |
| CVE-2026-25253 | 8.8 | WebSocket hijack | One-click RCE via malicious link |
| CVE-2026-32048 | 7.5 | Sandbox escape | Child processes spawn unsandboxed |
| CVE-2026-32025 | 7.5 | Brute force | No rate limiting on localhost auth |
| CVE-2026-32032 | 7.0 | Shell injection | Untrusted SHELL env variable |
| CVE-2026-29607 | 6.4 | Approval bypass | Approve safe command, swap payload |
| CVE-2026-28460 | 5.9 | Allowlist evasion | Line-continuation character bypass |
| CVE-2026-22171 | 8.2 | Path traversal | Arbitrary file write via media |
| CVE-2026-32049 | 7.5 | DoS | Oversized media payload |

Architectural patterns in these CVEs:

  1. Authorization model accepts client-declared permissions (CVE-2026-22172)
  2. Sandbox constraints don't propagate through process spawning (CVE-2026-32048)
  3. Human-in-the-loop approvals have enforcement gaps (CVE-2026-29607, CVE-2026-28460)
  4. Rate limiting absent on authentication endpoints (CVE-2026-32025)
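
The first pattern in the list above (client-declared permissions) can be illustrated with a minimal sketch. It assumes a hypothetical server-side token table (`TOKEN_SCOPES`) and request type (`ConnectRequest`); neither name comes from OpenClaw. The server treats the scopes a client asks for as untrusted and grants only what it recorded when the token was issued.

```python
from dataclasses import dataclass

# Hypothetical server-side record: token -> scopes granted at issue time.
TOKEN_SCOPES = {
    "tok-operator": frozenset({"chat", "tools"}),
    "tok-admin": frozenset({"chat", "tools", "admin"}),
}


@dataclass(frozen=True)
class ConnectRequest:
    token: str
    requested_scopes: frozenset  # supplied by the client, so untrusted


def effective_scopes(req: ConnectRequest) -> frozenset:
    """Return the scopes to grant: never more than the server recorded."""
    granted = TOKEN_SCOPES.get(req.token)
    if granted is None:
        raise PermissionError("unknown token")
    # Intersection lets a client narrow its scopes but never widen them.
    return granted & req.requested_scopes
```

With this shape, an operator token that requests `admin` simply does not receive it, because the decision never depends on the client's claim alone.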

Head-to-head: 10 security categories

1. Network exposure

OpenClaw risk: Gateway on port 18789. It binds to localhost by default, but misconfigured reverse proxies, Docker port publishing, and Tailscale Serve can expose it; 40K+ instances were found on the public internet. The gateway is a WebSocket server, and its lack of origin validation led to CVE-2026-25253 (one-click RCE from any website).

Claude Code: No gateway. No listening port. No WebSocket server. The attack surface does not exist. Claude Code communicates with Anthropic's API over outbound HTTPS only.

Verdict: Claude Code eliminates this category entirely.
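
For readers who must keep a gateway running, the missing origin validation described above can be sketched as an allowlist check at WebSocket handshake time. This is a minimal illustration, not OpenClaw's fix: `ALLOWED_ORIGINS` and `origin_allowed` are hypothetical names, and rejecting handshakes that carry no `Origin` header is a policy assumption (non-browser clients usually omit it).

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; port 18789 is the gateway port named above.
ALLOWED_ORIGINS = {"http://localhost:18789", "http://127.0.0.1:18789"}


def origin_allowed(headers: dict) -> bool:
    """Decide whether a WebSocket handshake may proceed, based on Origin."""
    # Header names are case-insensitive, so normalize the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    origin = lowered.get("origin")
    if origin is None:
        return False  # assumption: treat a missing Origin as untrusted
    parts = urlsplit(origin)
    if parts.scheme not in ("http", "https") or parts.hostname is None:
        return False  # covers values such as the literal string "null"
    normalized = f"{parts.scheme}://{parts.netloc}".lower()
    return normalized in ALLOWED_ORIGINS
```

A page served from any other site sends its own origin in the handshake, so the connection is refused before any command is accepted.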

2. Authentication and access control

OpenClaw risk: Token/password/trusted-proxy modes. Gateway token required by default since 2026.1.29, but older deployments lack auth. CVE-2026-22172 allowed any authenticated user to become admin by self-declaring scope. DM pairing (1-hour expiring codes) provides messaging access control.

Claude Code: Single-user model. No multi-user auth layer needed. Permission modes (default, auto-edit, auto, bypass) control what the agent can do, not who can access it. API key stored in OS keychain (macOS) or environment variable.

Verdict: Different models. OpenClaw needs auth because it's multi-user and network-accessible. Claude Code is single-user and local, so the auth question doesn't arise. For multi-user needs, OpenClaw must be properly configured; Claude Code isn't designed for it.
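
The brute-force issue listed in the vulnerability summary (CVE-2026-32025, no rate limiting on localhost auth) comes down to an absent throttle. A minimal sketch of a sliding-window limiter on failed attempts follows; the class name, limits, and the choice of key are assumptions. Because the CVE concerns localhost, keying by IP address alone would not help there, so the key is left to the caller.

```python
import time
from collections import defaultdict, deque


class LoginThrottle:
    """Refuse further attempts after too many recent failures per key."""

    def __init__(self, max_failures=5, window_seconds=60.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so the behavior can be tested
        self._failures = defaultdict(deque)

    def allowed(self, key: str) -> bool:
        now = self._clock()
        recent = self._failures[key]
        # Drop failures that have aged out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        return len(recent) < self.max_failures

    def record_failure(self, key: str) -> None:
        self._failures[key].append(self._clock())
```

A caller would check `allowed(key)` before verifying a token or password and call `record_failure(key)` when verification fails.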

3. Execution sandboxing

OpenClaw risk: Three-level sandbox (off, non-main, all). Docker containerization available. But CVE-2026-32048 showed sandbox constraints don't propagate to child processes. NemoClaw adds kernel-level enforcement (Landlock, seccomp, netns).

Claude Code: Permission modes + hooks. macOS sandbox-exec available. Hooks run as separate processes and can block any tool call. No kernel-level isolation by default, but the agent is prevented from attempting dangerous operations rather than contained after attempting them.

Verdict: NemoClaw wins for enterprise isolation. Claude Code wins for flexibility and zero infrastructure. Vanilla OpenClaw has known sandbox escape paths.
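
The statement that hooks "can block any tool call" can be pictured as a small decision function run before the tool executes. A minimal sketch, assuming the hook receives a JSON event with `tool_name` and `tool_input.command` fields and that exit code 2 signals a block; treat both as assumptions to verify against the hooks documentation. The patterns are illustrative only and are not the llm-security rule set.

```python
import json
import re
import sys

# Illustrative deny patterns; a real policy would be far more thorough.
BLOCKED_PATTERNS = [
    (re.compile(r"\brm\s+-(?:rf|fr)\b"), "recursive forced delete"),
    (re.compile(r"\bcurl\b[^|;]*\|\s*(?:sh|bash)\b"), "pipe from curl to shell"),
]


def decide(event: dict):
    """Return (allow, reason) for one pre-execution tool event."""
    if event.get("tool_name") != "Bash":
        return True, "not a shell command"
    command = event.get("tool_input", {}).get("command", "")
    for pattern, label in BLOCKED_PATTERNS:
        if pattern.search(command):
            return False, f"blocked: {label}"
    return True, "no blocked pattern matched"


def main(stream=sys.stdin) -> int:
    """Read one event as JSON and map the decision to an exit code."""
    allow, reason = decide(json.load(stream))
    if not allow:
        print(reason, file=sys.stderr)
        return 2  # assumed convention: non-zero exit blocks the call
    return 0
```

A script entry point would end with `sys.exit(main())`. This is prevention by policy rather than containment: a pattern list like this can be evaded, which is why the verdict above still credits kernel-level isolation.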

4. Supply chain (skills/plugins)

OpenClaw risk: ClawHub marketplace had 824 malicious skills among 10,700+ (the ClawHavoc campaign). Atomic macOS Stealer distributed via fake skills. Publishing requires only a week-old GitHub account with no code review.

Claude Code: Plugin marketplace is smaller (2,300+) with a review process. Local plugins don't go through any marketplace. The llm-security plugin provides supply chain scanning: blocklists for 7 package managers, OSV.dev CVE checking, Levenshtein typosquat detection, and npm/pip audit integration.

Verdict: Claude Code's ecosystem is smaller but more controlled. OpenClaw's marketplace scale introduced real malware distribution.
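
Levenshtein typosquat detection, mentioned above, flags package names that sit within a small edit distance of a well-known name. The sketch below is a generic illustration, not the llm-security implementation: the `POPULAR` list and the distance threshold of 1 are assumptions.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


# Hypothetical reference list of well-known package names.
POPULAR = ("requests", "numpy", "express", "lodash")


def typosquat_of(name: str, known=POPULAR, max_distance: int = 1):
    """Return the known package this name imitates, or None."""
    name = name.lower()
    if name in known:
        return None  # exact match is the real package
    for candidate in known:
        if levenshtein(name, candidate) <= max_distance:
            return candidate
    return None
```

Plain Levenshtein counts a swapped pair of letters as two edits, so a threshold of 1 misses transpositions such as `reqeusts`; a scanner might raise the threshold or use a transposition-aware distance instead.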

5. Prompt injection defense

OpenClaw risk: Prompt injection led to code execution (CVE-2026-30741). Persistent memory creates delayed-execution attack paths: malicious instructions embedded in documents can remain dormant for days. Official guidance: "validate tool calls against policy, not only model output."

Claude Code: llm-security provides 3 layers of defense:

  • pre-prompt-inject-scan.mjs: Blocks injection patterns in user prompts (configurable: block/warn/off)
  • post-mcp-verify.mjs: Scans ALL tool output for injection and HTML content traps
  • post-session-guard.mjs: Detects runtime trifecta patterns (untrusted input + sensitive data + exfiltration sink)

Additionally, the taint-tracer scanner traces data flow paths statically, and the toxic-flow-analyzer correlates findings across scanners to detect compound attack chains.
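
The trifecta pattern named in the list above (untrusted input + sensitive data + exfiltration sink) can be sketched as a check over recent tool calls. This is a simplified illustration: the tool names, the path markers, and the `classify` rules are hypothetical, and the 20-call window is borrowed from the monitoring description later in this document rather than confirmed for this check.

```python
from collections import deque
from enum import Flag, auto


class Risk(Flag):
    NONE = 0
    UNTRUSTED_INPUT = auto()
    SENSITIVE_DATA = auto()
    EXFIL_SINK = auto()


TRIFECTA = Risk.UNTRUSTED_INPUT | Risk.SENSITIVE_DATA | Risk.EXFIL_SINK
SENSITIVE_MARKERS = (".env", ".ssh/", ".aws/")


def classify(tool: str, arg: str) -> Risk:
    """Map one tool call to a risk tag (hypothetical tool names)."""
    if tool == "fetch_url":
        return Risk.UNTRUSTED_INPUT
    if tool == "read_file" and any(m in arg for m in SENSITIVE_MARKERS):
        return Risk.SENSITIVE_DATA
    if tool == "http_post":
        return Risk.EXFIL_SINK
    return Risk.NONE


class TrifectaGuard:
    def __init__(self, window: int = 20):
        self.recent = deque(maxlen=window)

    def observe(self, tool: str, arg: str) -> bool:
        """Record a call; return True once all three risks share the window."""
        self.recent.append(classify(tool, arg))
        seen = Risk.NONE
        for risk in self.recent:
            seen |= risk
        return seen == TRIFECTA
```

No single call is suspicious on its own here; the alert fires only when all three ingredients appear close together in one session.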

Verdict: Claude Code (with llm-security) has more active defense layers. OpenClaw relies on tool policy enforcement, which has documented bypass paths (CVE-2026-29607, CVE-2026-28460).
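
The approval bypass summarized earlier (approve a safe command, then swap the payload) is typically closed by binding each approval to the exact bytes that were shown to the human. A minimal sketch of that idea using an HMAC over the command text; `approve` and `verify` are hypothetical names, and this is not a description of how either product implements approvals.

```python
import hashlib
import hmac
import secrets

# Per-process secret, so approvals cannot be forged or reused elsewhere.
_SECRET = secrets.token_bytes(32)


def approve(command: str) -> str:
    """Issue an approval token bound to this exact command text."""
    return hmac.new(_SECRET, command.encode("utf-8"), hashlib.sha256).hexdigest()


def verify(command: str, approval: str) -> bool:
    """Check, immediately before execution, that the command is unchanged."""
    expected = approve(command)
    return hmac.compare_digest(expected, approval)
```

Verification has to run on the final string handed to the executor. If any rewriting happens between approval and execution, the token no longer matches and the call should be refused.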

6. Credential management

OpenClaw risk: API keys and platform tokens stored in plaintext in ~/.openclaw/. File permissions (600/700) are the primary protection. Prompt injection attacks can exfiltrate credentials through tool calls.

Claude Code: API key stored in macOS Keychain (encrypted, OS-level). llm-security hooks block credential patterns in file writes (pre-edit-secrets.mjs) and detect secrets in code (entropy-scanner). Path guard blocks writes to .env, .ssh/, .aws/, and credentials files.

Verdict: Claude Code's approach (OS keychain + write blocking) is stronger than filesystem permissions on plaintext files.
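
Entropy-based secret detection, as attributed to the entropy-scanner above, rests on the observation that random keys use many distinct characters roughly equally often. The sketch below is a generic illustration: the token pattern, the 20-character minimum, and the 4.0-bit threshold are assumptions, not the plugin's settings.

```python
import math
import re
from collections import Counter


def shannon_entropy(s: str) -> float:
    """Bits of entropy per character, from observed character frequencies."""
    if not s:
        return 0.0
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


# Candidate tokens: long runs of characters commonly found in keys.
TOKEN_RE = re.compile(r"[A-Za-z0-9+/=_\-]{20,}")


def find_high_entropy_tokens(text: str, threshold: float = 4.0):
    """Return candidate tokens whose entropy meets the threshold."""
    return [t for t in TOKEN_RE.findall(text) if shannon_entropy(t) >= threshold]
```

Thresholds like this trade false positives (hashes, minified code) against missed secrets, so entropy checks are usually paired with known key-format patterns.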

7. Browser and relay security

OpenClaw risk: Browser profiles with logged-in sessions become agent-accessible. Remote CDP connections and SSRF via the browser (private network access is enabled by default) widen the exposure. Relay access should be restricted to approved operators.

Claude Code: Playwright MCP for browser automation. Computer Use in Desktop app. No persistent browser relay. No agent access to browser profiles by default. Each browser session is explicit.

Verdict: Claude Code has a smaller browser attack surface because there is no always-on relay.

8. Session isolation

OpenClaw risk: Multi-user access requires explicit session isolation (dmScope: per-channel-peer). Default is unified session, creating cross-user context leakage risk. Any allowed sender can induce tool calls within the agent's permission set.

Claude Code: Single-user, single-session model. No cross-user leakage possible. Agent Teams run in isolated contexts (separate worktrees for file isolation).

Verdict: Claude Code eliminates multi-user leakage by design.
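
The difference between a unified session and `dmScope: per-channel-peer` can be shown with a toy session store. Only the `per-channel-peer` value comes from the text above; the `"unified"` label, the `"main"` key, and the `SessionStore` class are hypothetical and exist only to show why a shared context leaks across users.

```python
from collections import defaultdict


def session_key(dm_scope: str, channel: str, peer: str) -> str:
    """Pick the conversation bucket a message belongs to."""
    if dm_scope == "per-channel-peer":
        return f"{channel}:{peer}"
    return "main"  # unified: every sender shares one context


class SessionStore:
    def __init__(self, dm_scope: str):
        self.dm_scope = dm_scope
        self._history = defaultdict(list)

    def append(self, channel: str, peer: str, message: str) -> None:
        key = session_key(self.dm_scope, channel, peer)
        self._history[key].append(message)

    def context_for(self, channel: str, peer: str) -> list:
        """Messages the agent would see when replying to this peer."""
        return list(self._history[session_key(self.dm_scope, channel, peer)])
```

In the unified case, something one sender wrote on one channel shows up in the context used to answer a different sender on another channel; with per-channel-peer keys it does not.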

9. Configuration hardening

OpenClaw risk: Numerous configuration surfaces (openclaw.json, SOUL.md, tool policies, network binding, mDNS discovery, DM policies). Default configurations have historically been too permissive. openclaw security audit --deep provides automated checking.

Claude Code: Configuration through settings.json hierarchy (global, project, local) and CLAUDE.md. The config-audit plugin analyzes configuration quality with A-F grading. The reference-config-generator creates hardened configurations based on detected gaps.

Verdict: Both have audit tooling. OpenClaw has more configuration surface area (and more ways to misconfigure).

10. Monitoring and incident response

OpenClaw risk: Logging to /tmp/openclaw/. Redaction available but opt-in. No built-in anomaly detection. Audit trail integrity not guaranteed (file-based logs).

Claude Code: llm-security provides runtime monitoring: post-session-guard.mjs tracks tool call patterns in a sliding window (20 calls), detects concentrated MCP usage, and tracks cumulative data volume (100KB/500KB/1MB thresholds). The dashboard aggregator provides cross-project posture visibility. Scan baselines enable drift detection over time.

Verdict: Claude Code (with llm-security) has more active runtime monitoring. Both lack tamper-resistant audit trails.
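
The runtime monitoring described above (a 20-call sliding window, concentrated MCP usage, cumulative volume thresholds of 100KB/500KB/1MB) can be sketched as a small stateful monitor. Only those numbers come from the text; the alert labels, the 80% concentration share, the `mcp__` name prefix, and the choice of 1 KB = 1024 bytes are assumptions.

```python
from collections import Counter, deque

KB = 1024  # assumption: binary kilobytes
# Checked from highest to lowest so only the most severe level is reported.
VOLUME_THRESHOLDS = ((1024 * KB, "critical"), (500 * KB, "high"), (100 * KB, "warn"))


class SessionMonitor:
    def __init__(self, window: int = 20, mcp_share: float = 0.8):
        self.calls = deque(maxlen=window)
        self.mcp_share = mcp_share
        self.total_bytes = 0

    def record(self, tool: str, output_bytes: int) -> list:
        """Record one tool call and return any alerts it triggers."""
        self.calls.append(tool)
        self.total_bytes += output_bytes
        alerts = []
        for limit, level in VOLUME_THRESHOLDS:
            if self.total_bytes >= limit:
                alerts.append(f"volume:{level}")
                break
        # Judge concentration only once the window is full.
        if len(self.calls) == self.calls.maxlen:
            name, count = Counter(self.calls).most_common(1)[0]
            if name.startswith("mcp__") and count / len(self.calls) >= self.mcp_share:
                alerts.append(f"concentrated:{name}")
        return alerts
```

State like this lives in memory for one session, which is consistent with the caveat in the verdict above: it detects patterns but is not a tamper-resistant audit trail.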

What Claude Code does NOT do

Honest gaps where OpenClaw has genuine advantages:

| Capability | OpenClaw | Claude Code |
| --- | --- | --- |
| Daemon persistence | Runs 24/7 as background process | Session-based, stops when closed |
| Multi-engine | Claude, GPT, Gemini, local models | Claude only |
| Native messaging | 15+ channels (WhatsApp, Telegram, Signal, iMessage) | Channels (limited), MCP bridges |
| Canvas/A2UI | Interactive HTML workspace | HTML generation only |
| Self-hosting | Full infrastructure control | Anthropic-dependent |
| Kernel isolation | Via NemoClaw (Landlock, seccomp) | Not available |

These gaps matter for specific use cases. If you need always-on daemon persistence or kernel-level multi-tenant isolation, Claude Code is not a drop-in replacement.

The "use Claude Code" mitigation

For use cases where Claude Code covers the functional requirements (21 of 22 capabilities; see feature-map.md), migrating from OpenClaw to Claude Code eliminates entire attack surface categories:

| OpenClaw vulnerability | Claude Code mitigation |
| --- | --- |
| WebSocket hijacking (CVE-2026-25253) | No gateway, no listening port |
| 40K exposed instances | No network exposure |
| ClawHub malware (824 malicious skills) | Local plugins, smaller reviewed marketplace |
| Plaintext credential storage | OS keychain encryption |
| Prompt injection to RCE (CVE-2026-30741) | Multi-layer hook defense |
| Sandbox escape (CVE-2026-32048) | Permission-based prevention |
| Approval bypass (CVE-2026-29607) | Deterministic hook validation |
| Auth bypass (CVE-2026-22172) | Single-user, no multi-user auth |
| Shadow AI (22% enterprise) | Anthropic billing visibility |
| Moltbook breach (2.8M agents) | No shared agent platform |

This is not "Claude Code is more secure" as a blanket claim. It is "Claude Code's architecture does not have these specific attack surfaces." The tradeoff is less infrastructure control and vendor dependency.

Security tooling comparison

| Tool | OpenClaw | Claude Code (with llm-security) |
| --- | --- | --- |
| Built-in audit | openclaw security audit --deep | /security posture + /security audit |
| Prompt injection defense | Tool policy validation | 3 active hooks + static taint analysis |
| Supply chain scanning | Manual review | 7 package managers, OSV.dev, typosquat detection |
| Secret detection | None built-in | Entropy scanner + write-blocking hooks |
| Memory poisoning | None built-in | Memory poisoning scanner (CLAUDE.md, rules) |
| Attack simulation | None built-in | 38 scenarios across 7 categories |
| Continuous monitoring | None built-in | /security watch with baseline diffing |
| Threat modeling | None built-in | Interactive STRIDE/MAESTRO sessions |
| Framework coverage | Internal checks | OWASP LLM Top 10, Agentic AI Top 10, Skills Top 10, MCP Top 10, AI Agent Traps |

Recommendation

Choose based on your actual requirements:

  1. Personal automation, development work: Claude Code. Smaller attack surface, active hook defense, no infrastructure to secure.

  2. Always-on daemon, multi-channel messaging: OpenClaw with hardened configuration. Follow the Blink 10-step checklist. Consider NemoClaw for kernel isolation.

  3. Enterprise, multi-tenant, compliance: NemoClaw on OpenClaw, or wait for Claude Code enterprise features. Neither vanilla OpenClaw nor Claude Code currently meets SOC 2 requirements.

  4. Maximum security on personal setup: Claude Code + llm-security plugin. This repo demonstrates the configuration.

Sources

All vulnerability data sourced from published research:

  • OpenClaw CVE flood analysis (openclawai.io, March 2026)
  • SecurityScorecard exposure report (40,214 instances)
  • DigitalOcean "7 OpenClaw Security Challenges" (2026)
  • Sangfor "OpenClaw Security Risks" analysis
  • OpenClaw official security documentation (docs.openclaw.ai)
  • Valletta Software hardening guide (2026)
  • Nebius architecture analysis

CVE numbers verified against NVD. Statistics represent point-in-time measurements from the cited reports.