feat(templates): add governance and approval gate templates (Paperclip pattern)

Session 4 step 17: 5 autonomy levels (0-4). The PreToolUse approval-gate hook polls approval-responses.jsonl with a 60s timeout and blocks when no response arrives.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 06:55:33 +02:00
commit 912689f3c5
parent ec6f7c150e
3 changed files with 277 additions and 0 deletions
--- a/scripts/templates/governance/GOVERNANCE.md
+++ b/scripts/templates/governance/GOVERNANCE.md
@@ -0,0 +1,40 @@
+# Governance: {{PROJECT_NAME}}
+
+## Autonomy Levels
+
+- Level 0: Full manual approval (all tool calls require human OK)
+- Level 1: Auto-approve safe operations (Read, Glob, Grep)
+- Level 2: Auto-approve file operations (+ Write, Edit within project)
+- Level 3: Auto-approve all except destructive (+ Bash non-destructive)
+- Level 4: Full autonomy with hooks as guardrails
+
+Current level: {{AUTONOMY_LEVEL}}
+
+## Approval Gates
+
+Gates are checkpoints where the agent MUST pause and request human approval.
+
+- {{GATE_1_NAME}}: {{GATE_1_CONDITION}}
+  Action: {{GATE_1_ACTION}}
+- {{GATE_2_NAME}}: {{GATE_2_CONDITION}}
+  Action: {{GATE_2_ACTION}}
+
+## Escalation Rules
+
+- Budget exceeded: pause agent, notify via {{NOTIFICATION_METHOD}}
+- Error threshold: after {{ERROR_THRESHOLD}} consecutive errors, pause agent
+- Unknown tool call: block and log
+- Scope violation: block and notify
+
+## Audit Requirements
+
+- All tool calls logged to audit.log
+- Budget events logged to cost-events.jsonl
+- Approval decisions logged to approvals.log
+- Retention: {{LOG_RETENTION_DAYS}} days
+
+## Philosophy
+
+Autonomy is a privilege you grant. Start at Level 0 and increase only
+when the agent has demonstrated reliable behavior at the current level.
+Each level adds capability while hooks maintain the guardrails.
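Review note: `approval-gate.sh` reads the `Current level:` line of this template with the regex `Current level:\s*(\d+)` and falls back to level 0 when nothing matches. The sketch below reproduces only that lookup; the function name `parse_level` is hypothetical and not part of the commit. It suggests that an unrendered `{{AUTONOMY_LEVEL}}` placeholder resolves to the most restrictive level rather than raising an error.

```python
import re


def parse_level(content: str) -> int:
    """Return the autonomy level declared in GOVERNANCE.md text, or 0 if absent."""
    # Same pattern and same fallback as the hook script in this commit.
    match = re.search(r"Current level:\s*(\d+)", content)
    return int(match.group(1)) if match else 0


print(parse_level("Current level: 2"))                   # 2
print(parse_level("Current level: {{AUTONOMY_LEVEL}}"))  # 0
```

The pattern does not clamp the value to 0-4, so a line such as `Current level: 7` would be treated like level 4 by the hook's `level >= 4` check.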
--- a/scripts/templates/governance/README.md
+++ b/scripts/templates/governance/README.md
@@ -0,0 +1,75 @@
+# Governance and Approval Gates
+
+PreToolUse hook implementing human oversight with configurable autonomy levels.
+Inspired by Paperclip's "autonomy is a privilege you grant" philosophy.
+
+## Autonomy model
+
+Five levels (0–4) control how much the agent auto-approves:
+
+| Level | Description | Auto-approves |
+|-------|-------------|---------------|
+| 0 | Full manual | Nothing — every tool requires human OK |
+| 1 | Safe ops | Read, Glob, Grep |
+| 2 | File ops | + Write, Edit within project |
+| 3 | Non-destructive | + Bash (non-destructive commands) |
+| 4 | Full autonomy | Everything — hooks act only as guardrails |
+
+Set `Current level: N` in `GOVERNANCE.md` to change autonomy.
+
+## Approval gates
+
+Gates are conditional checkpoints where the agent pauses and waits for
+human approval regardless of autonomy level. Configure them in `GOVERNANCE.md`:
+
+```
+- deploy-gate: after all tests pass AND reviewer approved
+  Action: pause
+```
+
+## How approval flow works
+
+1. `approval-gate.sh` runs as a PreToolUse hook
+2. If auto-approved (based on level): exits 0, tool proceeds
+3. If gated: writes request to `governance/pending-approvals.jsonl`
+4. Polls `governance/approval-responses.jsonl` for matching response (by ID)
+5. Response `approve` → allow (exit 0), log to `approvals.log`
+6. Response `deny` → block (exit 2), log denial
+7. No response within the timeout (default 60s, polled every 5s) → block (exit 2) with a timeout message
+
+## Responding to approval requests
+
+When an agent is waiting for approval, append one line to `governance/approval-responses.jsonl`:
+```json
+{"id": "<request-id>", "decision": "approve"}
+```
+or
+```json
+{"id": "<request-id>", "decision": "deny"}
+```
+
+## Audit trail
+
+All tool calls are logged to `governance/audit.log`.
+All approval decisions are logged to `governance/approvals.log`.
+
+## Integration
+
+Add to `.claude/settings.json`:
+```json
+{
+  "hooks": {
+    "PreToolUse": [{
+      "matcher": "*",
+      "hooks": [{"type": "command", "command": "bash governance/approval-gate.sh"}]
+    }]
+  }
+}
+```
+
+## Paperclip comparison
+
+Paperclip stores approvals in a database table with async notification.
+This implementation uses file-based polling — simpler, no service dependency,
+suitable for single-machine deployments. For multi-agent systems that need
+concurrent approval routing, consider upgrading to a queue or webhook approach.
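Review note: the README asks the operator to append JSON lines by hand. A small helper could make that less error-prone. The sketch below is an assumption, not part of the commit: the function name `respond` and its validation are hypothetical. It writes the `id` as a string because the hook compares it against a string request ID, so a bare JSON number would never match.

```python
import json
import tempfile
from pathlib import Path


def respond(responses_file: str, request_id: str, decision: str) -> None:
    """Append one approval response as a single JSON line."""
    if decision not in ("approve", "deny"):
        raise ValueError("decision must be 'approve' or 'deny'")
    path = Path(responses_file)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as f:
        # str() keeps the ID a JSON string, matching the hook's string comparison.
        f.write(json.dumps({"id": str(request_id), "decision": decision}) + "\n")


# Demonstration against a throwaway directory instead of a real project.
with tempfile.TemporaryDirectory() as tmp:
    target = str(Path(tmp) / "governance" / "approval-responses.jsonl")
    respond(target, "1775969733", "approve")
    print(Path(target).read_text(encoding="utf-8"), end="")
```

In a real project the first argument would be `governance/approval-responses.jsonl`, and the request ID is the one the hook prints to stderr when it starts waiting.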
--- a/scripts/templates/governance/approval-gate.sh
+++ b/scripts/templates/governance/approval-gate.sh
@@ -0,0 +1,162 @@
+#!/bin/bash
+# PreToolUse hook: Implement approval gates based on GOVERNANCE.md.
+# Bash 3.2 compatible. Uses python3 for JSON/MD parsing.
+#
+# Follows Paperclip's approval mechanism:
+# 1. Read GOVERNANCE.md for the current autonomy level (gate definitions are not parsed)
+# 2. Auto-approve or require human approval based on level
+# 3. Write pending approval requests; check for responses
+# 4. Timeout with no response → block
+#
+# Placeholders:
+#   {{WORKING_DIR}} - absolute path to project directory
+
+WORKING_DIR="{{WORKING_DIR}}"
+GOVERNANCE_DIR="$WORKING_DIR/governance"
+GOVERNANCE_FILE="$WORKING_DIR/GOVERNANCE.md"
+PENDING_FILE="$GOVERNANCE_DIR/pending-approvals.jsonl"
+RESPONSES_FILE="$GOVERNANCE_DIR/approval-responses.jsonl"
+AUDIT_LOG="$GOVERNANCE_DIR/audit.log"
+APPROVALS_LOG="$GOVERNANCE_DIR/approvals.log"
+APPROVAL_TIMEOUT=60
+
+mkdir -p "$GOVERNANCE_DIR"
+
+# Read hook input
+INPUT=$(cat)
+
+# Auto-approve tools based on autonomy level
+DECISION=$(printf '%s' "$INPUT" | GOVERNANCE_FILE="$GOVERNANCE_FILE" python3 -c "
+import re, json, sys, os
+
+governance_file = os.environ.get('GOVERNANCE_FILE', '')
+
+try:
+    data = json.load(sys.stdin)  # read from stdin; never splice input into code
+except ValueError:
+    print('gate')  # unparseable hook input fails closed
+    sys.exit(0)
+
+tool_name = data.get('tool_name', '')
+
+# Read governance policy
+if not os.path.exists(governance_file):
+    print('approve')
+    sys.exit(0)
+
+content = open(governance_file).read()
+level_m = re.search(r'Current level:\s*(\d+)', content)
+level = int(level_m.group(1)) if level_m else 0
+
+# Safe read-only tools (auto-approved at level 1+)
+read_tools = ['Read', 'Glob', 'Grep', 'LS']
+# File operation tools (auto-approved at level 2+)
+file_tools = ['Write', 'Edit', 'MultiEdit']
+# Level 3: every tool except Bash is auto-approved; Bash is always gated,
+# since destructive commands are not detected. Level 4: everything auto-approved.
+
+if level >= 4:
+    print('approve')
+elif level >= 3 and tool_name not in ['Bash']:
+    print('approve')
+elif level >= 2 and tool_name in read_tools + file_tools:
+    print('approve')
+elif level >= 1 and tool_name in read_tools:
+    print('approve')
+else:
+    print('gate')
+" 2>/dev/null)
+
+# Log every tool call to audit log
+printf '%s' "$INPUT" | python3 -c "
+import json, sys, time, os
+try:
+    data = json.load(sys.stdin)
+    tool_name = data.get('tool_name', 'unknown')
+    entry = time.strftime('%Y-%m-%dT%H:%M:%SZ', time.gmtime()) + ' TOOL ' + tool_name + ' decision=$DECISION'
+    with open('$AUDIT_LOG', 'a') as f:
+        f.write(entry + '\n')
+except:
+    pass
+" 2>/dev/null
+
+if [ "$DECISION" = "approve" ]; then
+  exit 0
+fi
+
+# Gate: write pending approval request
+REQUEST_ID="$(date +%s)-$$"  # epoch seconds plus PID, so same-second requests get distinct IDs
+printf '%s' "$INPUT" | python3 -c "
+import json, sys, time, os
+try:
+    data = json.load(sys.stdin)
+    tool_name = data.get('tool_name', 'unknown')
+    tool_input = data.get('tool_input', {})
+    req = {
+        'id': '$REQUEST_ID',
+        'timestamp': time.strftime('%Y-%m-%dT%H:%M:%SZ', time.gmtime()),
+        'tool_name': tool_name,
+        'tool_input_summary': str(tool_input)[:200],
+        'status': 'pending'
+    }
+    with open('$PENDING_FILE', 'a') as f:
+        f.write(json.dumps(req) + '\n')
+    print('Approval required for: ' + tool_name)
+    print('Request ID: $REQUEST_ID')
+    print('Add response to: $RESPONSES_FILE')
+    print('Format: {\"id\": \"$REQUEST_ID\", \"decision\": \"approve\"}')
+except:
+    pass
+" >&2
+
+# Poll for response
+SECONDS_WAITED=0
+while [ "$SECONDS_WAITED" -lt "$APPROVAL_TIMEOUT" ]; do
+  if [ -f "$RESPONSES_FILE" ]; then
+    RESPONSE=$(python3 -c "
+import json, sys
+req_id = '$REQUEST_ID'
+responses_file = '$RESPONSES_FILE'
+try:
+    with open(responses_file) as f:
+        for line in f:
+            line = line.strip()
+            if line:
+                try:
+                    r = json.loads(line)
+                    if r.get('id') == req_id:
+                        print(r.get('decision', 'deny'))
+                        sys.exit(0)
+                except (ValueError, AttributeError):
+                    pass  # skip malformed lines; SystemExit must propagate
+except OSError:
+    pass
+print('pending')
+" 2>/dev/null)
+    if [ "$RESPONSE" = "approve" ]; then
+      python3 -c "
+import json, time
+entry = {'id': '$REQUEST_ID', 'timestamp': '$(date -u +%Y-%m-%dT%H:%M:%SZ)', 'decision': 'approved'}
+with open('$APPROVALS_LOG', 'a') as f:
+    f.write(json.dumps(entry) + '\n')
+" 2>/dev/null
+      exit 0
+    elif [ "$RESPONSE" = "deny" ]; then
+      python3 -c "
+import json, time
+entry = {'id': '$REQUEST_ID', 'timestamp': '$(date -u +%Y-%m-%dT%H:%M:%SZ)', 'decision': 'denied'}
+with open('$APPROVALS_LOG', 'a') as f:
+    f.write(json.dumps(entry) + '\n')
+" 2>/dev/null
+      echo '{"decision": "block", "reason": "Approval denied by operator."}'
+      exit 2
+    fi
+  fi
+  sleep 5
+  SECONDS_WAITED=$(( SECONDS_WAITED + 5 ))
+done
+
+# Timeout — block
+echo "Approval timeout after ${APPROVAL_TIMEOUT}s — blocking tool call." >&2
+echo '{"decision": "block", "reason": "Approval timeout. Check governance/pending-approvals.jsonl."}'
+exit 2
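Review note: the level ladder in the script is easier to check when written as a pure function that takes the raw hook payload as text. The sketch below mirrors the tool lists and the `level >= N` comparisons from the script; the function name `decide` is hypothetical, and failing closed on unparseable input is an assumption of this sketch. Because the payload is parsed as data, content with quotes or newlines cannot alter the logic.

```python
import json

READ_TOOLS = {"Read", "Glob", "Grep", "LS"}
FILE_TOOLS = {"Write", "Edit", "MultiEdit"}


def decide(payload_text: str, level: int) -> str:
    """Return 'approve' or 'gate' for one hook payload at the given level."""
    try:
        tool_name = json.loads(payload_text).get("tool_name", "")
    except (ValueError, AttributeError):
        return "gate"  # assumption: malformed or non-object input fails closed
    if level >= 4:
        return "approve"
    if level >= 3 and tool_name != "Bash":
        return "approve"
    if level >= 2 and tool_name in READ_TOOLS | FILE_TOOLS:
        return "approve"
    if level >= 1 and tool_name in READ_TOOLS:
        return "approve"
    return "gate"


# A Write payload whose content holds a newline and triple quotes.
tricky = json.dumps({"tool_name": "Write", "tool_input": {"content": "a\nb '''"}})

print(decide(json.dumps({"tool_name": "Read"}), 1))      # approve
print(decide(tricky, 2))                                 # approve
print(decide(json.dumps({"tool_name": "Bash"}), 3))      # gate
print(decide(json.dumps({"tool_name": "WebFetch"}), 3))  # approve
print(decide("not json", 4))                             # gate
```

The fourth call shows a property of the ladder worth reviewing: at level 3 any tool other than `Bash` is approved, including tools that appear in neither list, while `Bash` itself is always gated below level 4.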