agent-builder/skills/agent-system-design/references/security-patterns.md
Kjell Tore Guttormsen 075383990f feat: initial agent-builder plugin (v0.1.0)
2026-04-10 19:10:54 +02:00

Security Patterns for Autonomous Agents

Reference for the agent-system-design skill. Covers permission modes, hook-based guardrails, settings.json configuration, and a checklist for hardening autonomous agents.


1. Permission Modes

Four levels of approval, from most restrictive to least:

Default (interactive approval)

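Claude asks the user before tool calls that can change state, such as file edits and shell commands; read-only tools such as file reads run without a prompt. Suitable for exploratory sessions. No configuration required.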

Auto-edit (AcceptEdits mode)

File edits are auto-approved. Bash commands still require approval. Enables faster iteration on code without allowing arbitrary shell execution.

{
  "autoApprove": ["Edit", "Write"]
}

Auto Mode (AI classifier)

An AI classifier evaluates each tool call and approves or blocks it based on predicted risk. Reported metrics: 0.4% false positive rate (safe calls blocked), 5.7% false negative rate (unsafe calls approved).

Suitable for: attended automation where the developer is nearby and can intervene. Not suitable for: fully unattended production agents.

{
  "autoApprove": ["auto"]
}
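
To make the 5.7% false negative rate concrete, the sketch below computes the chance that at least one unsafe call is approved across several unsafe attempts. It assumes the attempts are independent, which the reported metrics do not state, and `slip_probability` is an illustrative name rather than part of any tool.

```bash
#!/usr/bin/env bash
# P(at least one unsafe call approved) = 1 - (1 - fnr)^k, assuming independent calls.
slip_probability() {
  local fnr="$1" attempts="$2"
  awk -v fnr="$fnr" -v k="$attempts" 'BEGIN { printf "%.3f\n", 1 - (1 - fnr) ^ k }'
}

slip_probability 0.057 1    # a single unsafe call
slip_probability 0.057 10   # ten unsafe calls over a long-running session
```

Under that assumption, ten unsafe attempts give roughly a 44% chance that one is approved, which is why this mode fits attended use only.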

Bypass (--dangerously-skip-permissions)

All tool calls are auto-approved without review. Must only be used inside a sandbox (Docker, VM, or a throwaway environment with no access to production systems). Never use on the host machine with access to real credentials or filesystems.

claude --dangerously-skip-permissions -p "run the pipeline"

Decision rule: Match the permission mode to the blast radius. The more access the agent has, the more restrictive the approval mode must be.


2. Hook-Based Guardrails

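Hooks run synchronously before and after tool calls and receive the call as a JSON payload on stdin. A PreToolUse hook that exits with code 2 blocks the tool call and returns its stderr to Claude; any other non-zero exit code is reported as a non-blocking error and the call proceeds. Five standard patterns: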

Pattern 1: Destructive Command Blocking

Block commands that cannot be undone.

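#!/usr/bin/env bash
# hooks/pre-tool-use.sh
# The hook payload arrives as JSON on stdin; parsing it requires jq.
INPUT="$(cat)"
COMMAND="$(printf '%s' "$INPUT" | jq -r '.tool_input.command // empty')"
BLOCKED_PATTERNS=(
  "rm -rf"
  "mkfs"
  "dd if="
  ":(){ :|:& };:"   # fork bomb
  "> /dev/"         # note: also matches "> /dev/null"; narrow this if it is too strict
)
for pattern in "${BLOCKED_PATTERNS[@]}"; do
  # -F matches the pattern as a fixed string; -- stops option parsing
  if printf '%s\n' "$COMMAND" | grep -qF -- "$pattern"; then
    echo "BLOCKED: destructive pattern detected: $pattern" >&2
    exit 2   # exit code 2 blocks the tool call; other non-zero codes do not
  fi
done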

Pattern 2: Piped Script Execution Blocking

Block patterns that download and execute code in a single pipeline.

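# With grep -E an unescaped | means alternation, so the literal pipe is written \|
PIPED_EXEC_PATTERNS=(
  'curl.*\|.*bash'
  'curl.*\|.*sh'
  'wget.*\|.*bash'
  'wget.*\|.*sh'
)
for pattern in "${PIPED_EXEC_PATTERNS[@]}"; do
  if printf '%s\n' "$COMMAND" | grep -qE -- "$pattern"; then
    echo "BLOCKED: piped script execution detected" >&2
    exit 2   # exit code 2 blocks the tool call
  fi
done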
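
In `grep -E`, an unescaped `|` is alternation, so a literal pipe has to be written as `\|`. The sketch below is one possible tighter check, not the plugin's own code: it requires a pipe between the download and the shell, and matches the shell name as a whole word so that `| shasum` is not flagged. The function name `is_piped_exec` and the example URLs are illustrative.

```bash
#!/usr/bin/env bash
# Returns 0 (match) when a curl/wget download is piped straight into a shell.
is_piped_exec() {
  local cmd="$1"
  # \| is a literal pipe; (ba|z|da)?sh matches sh, bash, zsh, or dash,
  # and ([[:space:]]|$) requires the shell name to end there.
  printf '%s\n' "$cmd" | grep -qE \
    '(curl|wget)[^|]*\|[[:space:]]*(sudo[[:space:]]+)?(ba|z|da)?sh([[:space:]]|$)'
}

is_piped_exec 'curl -fsSL https://example.com/install.sh | bash' && echo "blocked"
is_piped_exec 'curl -fsSL https://example.com/data.json -o data.json' || echo "allowed"
is_piped_exec 'curl -fsSL https://example.com/file | shasum' || echo "allowed"
```

A string match like this is still easy to evade (for example by downloading to a file first), so it complements the deny list rather than replacing it.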

Pattern 3: Privilege Escalation Blocking

Block commands that elevate privileges or weaken file permissions.

PRIV_PATTERNS=(
  "sudo"
  "chmod 777"
  "chmod a+x /etc"
  "shutdown"
  "reboot"
  "init 0"
)
for pattern in "${PRIV_PATTERNS[@]}"; do
  if echo "$COMMAND" | grep -qF "$pattern"; then
    echo "BLOCKED: privilege escalation detected" >&2
    exit 2   # exit code 2 blocks the tool call; exit 1 would only log an error
  fi
done

Pattern 4: Path Restriction

Prevent writes outside the project directory.

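# INPUT holds the hook's JSON payload, read once from stdin with INPUT="$(cat)"
PROJECT_DIR="$(cd "$(dirname "$0")/.." && pwd)"
TOOL_PATH="$(printf '%s' "$INPUT" | jq -r '.tool_input.file_path // empty')"
if [ -n "$TOOL_PATH" ]; then
  # If realpath fails, fall back to the raw path; a relative path then fails the check below
  REAL_PATH="$(realpath "$TOOL_PATH" 2>/dev/null || echo "$TOOL_PATH")"
  # Match "$PROJECT_DIR/..." rather than a bare prefix, so a sibling such as
  # "${PROJECT_DIR}-backup" is rejected
  if [[ "$REAL_PATH" != "$PROJECT_DIR" && "$REAL_PATH" != "$PROJECT_DIR"/* ]]; then
    echo "BLOCKED: write outside project dir: $REAL_PATH" >&2
    exit 2   # exit code 2 blocks the tool call
  fi
fi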

Pattern 5: Audit Logging

Log every tool call with timestamp for post-hoc review.

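# hooks/post-tool-use.sh (reads the hook's JSON payload from stdin; requires jq)
PROJECT_DIR="$(cd "$(dirname "$0")/.." && pwd)"
LOG_FILE="$PROJECT_DIR/logs/audit.log"
mkdir -p "$(dirname "$LOG_FILE")"
# Keep the tool name and its input as one compact JSON object per line
ENTRY="$(jq -c '{tool: .tool_name, input: .tool_input}')"
echo "$(date -u +"%Y-%m-%dT%H:%M:%SZ") $ENTRY" >> "$LOG_FILE"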

Combine Patterns 1-4 into a single pre-tool-use.sh and put Pattern 5 in a separate post-tool-use.sh. Keep each script under 80 lines so it is auditable at a glance.
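
Before relying on a hook, it helps to feed it synthetic payloads and check the exit code, since only exit code 2 blocks a PreToolUse call. The harness below is a minimal sketch: `run_hook` is a hypothetical helper, the payload carries only the two fields used here, and the stub hook stands in for a real hooks/pre-tool-use.sh so the example runs without jq.

```bash
#!/usr/bin/env bash
# Pipe a synthetic PreToolUse payload into a hook script and print its exit code.
# Assumes the command contains no double quotes or backslashes (no JSON escaping here).
run_hook() {
  local hook="$1" cmd="$2" status=0
  printf '{"tool_name":"Bash","tool_input":{"command":"%s"}}' "$cmd" \
    | bash "$hook" 2>/dev/null || status=$?
  echo "$status"
}

# Stand-in hook for the demo; point run_hook at hooks/pre-tool-use.sh in a real project.
STUB_HOOK="$(mktemp)"
trap 'rm -f "$STUB_HOOK"' EXIT
cat > "$STUB_HOOK" <<'EOF'
payload="$(cat)"
case "$payload" in
  *"rm -rf"*) echo "BLOCKED" >&2; exit 2 ;;
esac
exit 0
EOF

run_hook "$STUB_HOOK" "rm -rf build"   # prints 2
run_hook "$STUB_HOOK" "git status"     # prints 0
```

Running a handful of known-bad and known-good commands through the real hook after every change catches regressions such as a pattern that silently stopped matching.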


3. settings.json Security Configuration

The full security configuration surface in settings.json:

{
  "permissions": {
    "allow": [
      "Bash(git:*)",
      "Bash(npm run *)",
      "Read(**)",
      "Write(src/**)",
      "Edit(src/**)"
    ],
    "deny": [
      "Bash(rm -rf *)",
      "Bash(sudo *)",
      "Bash(curl * | *)",
      "Bash(wget * | *)",
      "Write(/etc/*)",
      "Write(~/.ssh/*)",
      "Write(~/.zshenv)"
    ]
  },
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash|Write|Edit",
        "hooks": [
          {
            "type": "command",
            "command": "./hooks/pre-tool-use.sh"
          }
        ]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "*",
        "hooks": [
          {
            "type": "command",
            "command": "./hooks/post-tool-use.sh"
          }
        ]
      }
    ]
  }
}

Allow list principle: Prefer an explicit allow list over a deny list alone. The deny list catches known-bad patterns; the allow list enforces least privilege. Both together provide defense in depth.

Glob patterns in deny list: Use * conservatively. Bash(sudo *) blocks all sudo invocations. Write(/etc/*) blocks all writes to system config.
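
The interaction between the two lists can be sketched in a few lines of shell. This is an illustration only, not Claude Code's actual rule matcher: it assumes deny rules are checked before allow rules and that anything unmatched falls back to asking the operator. The function `decide` and its hard-coded patterns are hypothetical.

```bash
#!/usr/bin/env bash
# Illustrative only: deny patterns are checked first, then allow patterns,
# and anything unmatched falls back to asking the operator.
decide() {
  local cmd="$1" pattern
  for pattern in "rm -rf *" "sudo *"; do
    # An unquoted variable in a case pattern is treated as a glob
    case "$cmd" in $pattern) echo "deny"; return ;; esac
  done
  for pattern in "git *" "npm run *"; do
    case "$cmd" in $pattern) echo "allow"; return ;; esac
  done
  echo "ask"
}

decide "git status"        # prints: allow
decide "sudo git status"   # prints: deny
decide "python deploy.py"  # prints: ask
```

The third call shows what the allow list buys: a command nobody anticipated is neither run nor silently dropped, it is escalated.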


4. Security Checklist for Autonomous Agents

Verify each item before declaring an agent production-ready:

  • PreToolUse hook is present and blocks destructive commands (Patterns 1-3 above)
  • PostToolUse audit log is enabled and written to a persistent location
  • permissions.deny list covers: rm -rf *, sudo *, curl * | *, wget * | *
  • permissions.allow list is as narrow as the agent's task requires
  • MEMORY.md does not contain API keys, tokens, or passwords
  • .env is in .gitignore; secrets are loaded from environment, not from files tracked by git
  • Deployment target matches the blast radius (see deployment-targets.md)
  • If always-on: phone approval via permission relay is configured (v2.1.81+)
  • Audit log is rotated (logrotate or launchd-managed)
  • Hook scripts are executable (chmod +x hooks/*.sh) and checked into version control
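
A few of these items can be checked mechanically. The sketch below covers three of them (executable hooks, .env in .gitignore, no obvious secrets in MEMORY.md) under a hypothetical function name, `check_agent_dir`. The secret regex is a rough heuristic and not a substitute for a real secret scanner.

```bash
#!/usr/bin/env bash
# Check three checklist items for an agent directory; prints one line per failure.
check_agent_dir() {
  local dir="$1" failures=0 hook
  for hook in "$dir"/hooks/*.sh; do
    if [ ! -e "$hook" ]; then
      echo "FAIL: no hook scripts in hooks/"; failures=$((failures + 1))
    elif [ ! -x "$hook" ]; then
      echo "FAIL: not executable: ${hook#"$dir"/}"; failures=$((failures + 1))
    fi
  done
  if ! grep -qxF '.env' "$dir/.gitignore" 2>/dev/null; then
    echo "FAIL: .env is not listed in .gitignore"; failures=$((failures + 1))
  fi
  # Heuristic: a key-like word followed by ':' or '='
  if grep -qiE '(api[_-]?key|token|password)[[:space:]]*[:=]' "$dir/MEMORY.md" 2>/dev/null; then
    echo "FAIL: MEMORY.md appears to contain a secret"; failures=$((failures + 1))
  fi
  [ "$failures" -eq 0 ] && echo "OK: all checks passed"
  return "$failures"
}

# Demo against a throwaway directory that satisfies all three checks.
demo="$(mktemp -d)"
trap 'rm -rf "$demo"' EXIT
mkdir -p "$demo/hooks"
printf '#!/usr/bin/env bash\nexit 0\n' > "$demo/hooks/pre-tool-use.sh"
chmod +x "$demo/hooks/pre-tool-use.sh"
echo ".env" > "$demo/.gitignore"
echo "Project notes, no credentials." > "$demo/MEMORY.md"
check_agent_dir "$demo"   # prints "OK: all checks passed"
```

The remaining items (deployment target, log rotation, permission relay) depend on the environment and still need a manual review.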

Phone Approval (v2.1.81+)

For unattended agents that must occasionally escalate to a human:

{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash(sudo *)",
        "hooks": [
          {
            "type": "prompt",
            "prompt": "This command requires elevated privileges. Approve? (yes/no)",
            "channel": "telegram"
          }
        ]
      }
    ]
  }
}

The agent pauses and sends the approval request to the configured channel. The operator approves or rejects from their phone. If no response within the timeout, the hook exits non-zero and the command is blocked.
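
The fail-closed behaviour described above can be sketched independently of any messaging channel. In this minimal example the operator's reply is read from stdin, standing in for the relay, and `await_approval` is a hypothetical name; only an explicit "yes" within the timeout approves, and every other outcome blocks.

```bash
#!/usr/bin/env bash
# Fail closed: approve only on an explicit "yes" received within the timeout.
# Timeout, EOF, and any other answer all end in the blocking exit code 2.
await_approval() {
  local timeout_s="$1" answer
  if read -r -t "$timeout_s" answer && [ "$answer" = "yes" ]; then
    return 0
  fi
  echo "BLOCKED: no approval received" >&2
  return 2
}

echo "yes" | await_approval 5 && echo "approved"
await_approval 1 < /dev/null 2>/dev/null || echo "blocked (exit $?)"
```

Treating silence as a rejection is the important design choice: an unattended agent should never proceed because the operator's phone was off.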


5. OpenClaw vs Claude Code: Security Philosophy

| Dimension | OpenClaw | Claude Code |
|---|---|---|
| Primary mechanism | Docker sandbox (containment) | Hooks + deny list (prevention) |
| Approach | Contain the damage after the fact | Prevent the action before it runs |
| Escape risk | Container escape (low, not zero) | Hook bypass if hook is misconfigured |
| Auditability | Container logs | Audit log hook (Pattern 5) |
| Operator control | Docker network/volume flags | settings.json permissions |
| Mobile escalation | Native | Permission relay via channel MCP (v2.1.81+) |

Containment vs Prevention: OpenClaw runs the agent inside a Docker container with limited network and volume mounts. If the agent does something harmful, the damage is contained to the container. Claude Code instead prevents harmful actions from running at all, via hooks and the permissions deny list.

Tradeoff: Containment is harder to escape but allows the harmful action to attempt execution. Prevention stops it earlier but requires the hook to be correctly configured and maintained. For production agents, combine both: run Claude Code inside Docker AND configure hooks and deny lists.

Combined approach (recommended for production):

  1. Run claude inside a Docker container with minimal volume mounts
  2. Configure PreToolUse hooks with Patterns 1-4
  3. Enable PostToolUse audit logging (Pattern 5)
  4. Use an explicit permissions.deny list
  5. Do not use --dangerously-skip-permissions outside the container

This gives containment as a last resort and prevention as the primary defense.
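
Step 1 of the combined approach might look like the following sketch. It only prints the `docker run` command rather than executing it. The image name `agent-sandbox:latest` is hypothetical and is assumed to have the `claude` CLI installed; `ANTHROPIC_API_KEY` is assumed to be set on the host and is passed through by name.

```bash
#!/usr/bin/env bash
# Build (but do not run) a docker command that confines the agent to one mount.
sandbox_cmd() {
  local project_dir="$1" image="$2"
  # Drop all capabilities, forbid privilege gain, and mount only the project.
  local args=(
    docker run --rm
    --cap-drop ALL
    --security-opt no-new-privileges
    -v "$project_dir:/workspace"
    -w /workspace
    -e ANTHROPIC_API_KEY
    "$image"
    claude --dangerously-skip-permissions -p "run the pipeline"
  )
  printf '%q ' "${args[@]}"
  echo
}

sandbox_cmd "$PWD" "agent-sandbox:latest"
```

Printing the command first makes it reviewable; once it looks right, it can be run with `eval "$(sandbox_cmd "$PWD" agent-sandbox:latest)"`. The hooks and deny list from the project's settings.json still apply inside the container if the settings file is part of the mounted directory.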