# Session 3: OpenClaw Memory and Autonomy Patterns > Steps 9, 10, 11, 12 | Wave 1 | Depends on: none ## Dependencies Entry condition: none (independent — creates new template directories only) ## Scope Fence **Touch:** - `scripts/templates/memory/SESSION-STATE.md` (new) - `scripts/templates/memory/DAILY-LOG.md` (new) - `scripts/templates/memory/MEMORY.md` (new) - `scripts/templates/memory/README.md` (new) - `scripts/templates/heartbeat/HEARTBEAT.md` (new) - `scripts/templates/heartbeat/heartbeat-runner.sh` (new) - `scripts/templates/heartbeat/README.md` (new) - `scripts/templates/proactive/PROACTIVE-AGENT.md` (new) - `scripts/templates/proactive/ADL-RULES.md` (new) - `scripts/templates/proactive/VFM-SCORING.md` (new) - `scripts/templates/proactive/README.md` (new) - `scripts/templates/cron/agent-turn.sh` (new) - `scripts/templates/cron/system-event.sh` (new) - `scripts/templates/cron/README.md` (new) **Never touch:** - `commands/` - `agents/` - `skills/` - `scripts/templates/heartbeat/context-packet.md` (Session 4) - `scripts/templates/heartbeat/wake-prompt.md` (Session 4) - `.claude-plugin/` - `CLAUDE.md`, `README.md` --- ## Step 9: Create 3-tier memory templates ### Files to create **`scripts/templates/memory/SESSION-STATE.md`**: ```markdown # Session State: {{AGENT_NAME}} > Hot working memory. Updated every turn. Read first on resume. ## WAL Protocol **Write important details HERE before responding to the user.** This prevents data loss if the session crashes or context compacts mid-response. ## Current Task - Task: [what you're working on right now] - Started: [timestamp] - Status: [in progress / blocked / waiting] - Key decision: [the most important choice made this session] ## Context Window Usage - Estimated usage: [low / medium / high / DANGER ZONE] - If above 60%: activate Working Buffer below ## Active Decisions | Decision | Choice | Reason | Reversible? | |----------|--------|--------|-------------| | | | | | ## Pending Actions - [ ] [action 1] - [ ] [action 2] ## Working Buffer > Activate when context usage exceeds 60%. Capture key exchanges here > before they are lost to compaction. This is your safety net. ### Recent exchanges to preserve [Paste critical user messages and your key responses here when in the danger zone] ### Key facts from this session [Extract and list facts that would be lost on compaction] ## Compaction Recovery If you're reading this after a context compaction: 1. Read this SESSION-STATE.md first (you're here) 2. Read today's daily log: `memory/$(date +%Y-%m-%d).md` 3. Read memory/MEMORY.md for long-term context 4. Search daily logs for relevant prior context 5. Resume from the Current Task and Pending Actions above ``` **`scripts/templates/memory/DAILY-LOG.md`**: ```markdown # Daily Log: {{AGENT_NAME}} > Warm daily capture. One file per day. Filename: memory/YYYY-MM-DD.md > Auto-rotated: the pipeline creates a new file each day. ## {{DATE}} ### Summary of Work [1-3 sentences describing what was accomplished today] ### Decisions Made | Decision | Context | Outcome | |----------|---------|---------| | | | | ### Files Modified - [file path] — [what changed and why] ### Issues Encountered - [issue description] — [resolution or status] ### Quality Scores | Pipeline run | Reviewer score | Notes | |-------------|---------------|-------| | | | | ### Carry Forward > Items for the next session. These get checked on the next pipeline run. - [ ] [item that needs attention tomorrow] - [ ] [follow-up from today's work] ### Cost - Estimated tokens used: [if tracked] - Pipeline runs: [count] ``` **`scripts/templates/memory/MEMORY.md`**: ```markdown # Long-Term Memory: {{AGENT_NAME}} > Cold curated memory. Updated manually or after significant learnings. > This is the last file read during compaction recovery. ## Agent Identity - Name: {{AGENT_NAME}} - Role: [what this agent does] - Domain: {{DOMAIN}} - Created: {{DATE}} ## Key Learnings > Manually curated from daily logs. Only include insights that affect > future behavior. Delete entries that are no longer relevant. - [learning 1 — date discovered] - [learning 2 — date discovered] ## Recurring Patterns > Patterns observed across multiple runs. Used to improve agent behavior. | Pattern | Frequency | Impact | Action taken | |---------|-----------|--------|-------------| | | | | | ## Known Issues > Active issues that affect agent performance. - [issue] — [workaround] — [status: open/resolved] ## Project Context > Static context that doesn't change often. - Project: {{PROJECT_DIR}} - Pipeline: {{PIPELINE_NAME}} - Schedule: {{SCHEDULE}} - Deployment: [target] ## Last Updated [date — update this when you curate this file] ``` **`scripts/templates/memory/README.md`**: ```markdown # 3-Tier Memory System Inspired by OpenClaw's proactive agent memory pattern. Three tiers serve different purposes with different update frequencies. ## Architecture ``` Tier 1: SESSION-STATE.md (hot) - Updated every turn - Read FIRST on session start and compaction recovery - Contains: current task, decisions, pending actions, working buffer Tier 2: memory/YYYY-MM-DD.md (warm) - One file per day, auto-rotated - Updated at end of each pipeline run - Contains: daily summary, decisions, files modified, issues, carry-forward Tier 3: memory/MEMORY.md (cold) - Updated manually or after significant learnings - Read LAST during compaction recovery - Contains: identity, curated learnings, patterns, known issues ``` ## WAL Protocol (Write-Ahead Logging) Before responding to the user with important information, write it to SESSION-STATE.md first. This prevents data loss if: - The session crashes mid-response - Context compaction removes the exchange - The user's connection drops ## Working Buffer Protocol When context usage exceeds ~60% (the "danger zone"): 1. Activate the Working Buffer section in SESSION-STATE.md 2. Copy critical recent exchanges into the buffer 3. Extract and list key facts that would be lost on compaction 4. Continue working normally — the buffer is your safety net ## Compaction Recovery When Claude resumes after context compaction, it reads in this order: 1. SESSION-STATE.md (current task, decisions, working buffer) 2. Today's daily log (what happened today) 3. MEMORY.md (long-term context, known issues) 4. Search older daily logs if needed for specific context ## Integration with Agent Factory During `/agent-factory:build` Phase 2.5 (Memory Setup): 1. Copy these templates to the user's `memory/` directory 2. Replace `{{PLACEHOLDER}}` variables with project-specific values 3. Create the initial SESSION-STATE.md and MEMORY.md 4. Configure the pipeline skill to update daily logs after each run ## File locations after scaffolding ``` project/ memory/ SESSION-STATE.md (from Tier 1 template) MEMORY.md (from Tier 3 template) 2026-04-11.md (generated daily, from Tier 2 template) 2026-04-12.md ... ``` ``` ### Verify ```bash ls /Users/ktg/repos/agent-builder/scripts/templates/memory/ | wc -l ``` Expected: `4` ### On failure revert ### Checkpoint ```bash git commit -m "feat(templates): add 3-tier memory templates (OpenClaw pattern)" ``` --- ## Step 10: Create heartbeat and cron templates ### Files to create **`scripts/templates/heartbeat/HEARTBEAT.md`**: ```markdown # Heartbeat: {{AGENT_NAME}} Read this file on each heartbeat. Follow it strictly. Do not infer or repeat old tasks from prior chats. If nothing needs attention, reply HEARTBEAT_OK. ## Tasks tasks: - name: {{TASK_1_NAME}} interval: {{TASK_1_INTERVAL}} prompt: "{{TASK_1_PROMPT}}" - name: {{TASK_2_NAME}} interval: {{TASK_2_INTERVAL}} prompt: "{{TASK_2_PROMPT}}" ## Context {{CONTEXT_NOTES}} ## Rules - Only perform tasks listed above - Respect the interval — do not run a task before its next due time - If a task fails, log the error and continue to the next task - Respond with HEARTBEAT_OK if no tasks are due ``` **`scripts/templates/heartbeat/heartbeat-runner.sh`**: ```bash #!/bin/bash # Heartbeat runner for Claude Code agents. # Reads HEARTBEAT.md, checks which tasks are due, invokes claude -p for each. # # Bash 3.2 compatible: no associative arrays, no mapfile, no |& # Uses python3 for all JSON/YAML/date operations. # # Usage: ./heartbeat-runner.sh [--catchup] # --catchup: run missed tasks on first invocation (max 5, 5s stagger) # # Placeholders: # {{AGENT_NAME}} - name of the agent # {{WORKING_DIR}} - absolute path to project directory # {{MAX_TURNS}} - max turns per heartbeat (default: 10) # {{ACK_MAX_CHARS}} - suppress responses shorter than this (default: 300) AGENT_NAME="{{AGENT_NAME}}" WORKING_DIR="{{WORKING_DIR}}" MAX_TURNS="${MAX_TURNS:-10}" ACK_MAX_CHARS="${ACK_MAX_CHARS:-300}" HEARTBEAT_FILE="$WORKING_DIR/HEARTBEAT.md" STATE_FILE="$WORKING_DIR/.heartbeat-state.json" LOG_DIR="$WORKING_DIR/logs" CATCHUP_MODE=false if [ "$1" = "--catchup" ]; then CATCHUP_MODE=true fi # Ensure directories exist mkdir -p "$LOG_DIR" # --- Emptiness detection (OpenClaw pattern) --- # Skip API calls if heartbeat file has only headers/empty items is_heartbeat_empty() { python3 << 'PYEOF' import sys, re try: with open("$HEARTBEAT_FILE", "r") as f: content = f.read() except FileNotFoundError: print("true") sys.exit(0) # Strip markdown headers, blank lines, and YAML structure markers stripped = re.sub(r'^#+.*$', '', content, flags=re.MULTILINE) stripped = re.sub(r'^tasks:\s*$', '', stripped, flags=re.MULTILINE) stripped = re.sub(r'^\s*-\s*name:\s*\{\{.*\}\}\s*$', '', stripped, flags=re.MULTILINE) stripped = re.sub(r'^\s*$', '', stripped, flags=re.MULTILINE) stripped = stripped.strip() if len(stripped) < 20: print("true") else: print("false") PYEOF } HEARTBEAT_FILE_ACTUAL="$HEARTBEAT_FILE" EMPTY_CHECK=$(HEARTBEAT_FILE="$HEARTBEAT_FILE_ACTUAL" python3 -c " import sys, re, os hf = os.environ.get('HEARTBEAT_FILE', '') try: content = open(hf).read() except: print('true'); sys.exit(0) stripped = re.sub(r'^#+.*$', '', content, flags=re.MULTILINE) stripped = re.sub(r'^\s*$', '', stripped, flags=re.MULTILINE).strip() print('true' if len(stripped) < 20 else 'false') " 2>/dev/null) if [ "$EMPTY_CHECK" = "true" ]; then echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) | heartbeat | SKIP (empty heartbeat file)" >> "$LOG_DIR/heartbeat.log" exit 0 fi # --- Parse tasks and check due times --- DUE_TASKS=$(python3 << PYEOF import json, re, os, time from datetime import datetime, timedelta heartbeat_file = "$HEARTBEAT_FILE_ACTUAL" state_file = "$STATE_FILE" catchup = "$CATCHUP_MODE" == "true" # Parse tasks from HEARTBEAT.md try: content = open(heartbeat_file).read() except FileNotFoundError: print("[]") exit(0) # Simple YAML-like task parsing tasks = [] current_task = {} for line in content.split('\n'): line = line.strip() m_name = re.match(r'-\s*name:\s*(.+)', line) m_interval = re.match(r'interval:\s*(.+)', line) m_prompt = re.match(r'prompt:\s*"(.+)"', line) if m_name: if current_task.get('name'): tasks.append(current_task) current_task = {'name': m_name.group(1).strip()} elif m_interval and current_task: current_task['interval'] = m_interval.group(1).strip() elif m_prompt and current_task: current_task['prompt'] = m_prompt.group(1).strip() if current_task.get('name'): tasks.append(current_task) # Load state try: state = json.load(open(state_file)) except: state = {} # Parse interval to seconds def parse_interval(s): s = s.strip() m = re.match(r'(\d+)\s*(m|min|h|hr|d)', s) if not m: return 3600 # default 1 hour val, unit = int(m.group(1)), m.group(2) if unit in ('m', 'min'): return val * 60 elif unit in ('h', 'hr'): return val * 3600 elif unit == 'd': return val * 86400 return 3600 # Check which tasks are due now = time.time() due = [] for task in tasks: name = task.get('name', '') interval_sec = parse_interval(task.get('interval', '1h')) last_run = state.get(name, {}).get('last_run', 0) if now - last_run >= interval_sec: due.append(task) elif catchup and last_run == 0: due.append(task) # Limit catchup to 5 tasks if catchup: due = due[:5] print(json.dumps(due)) PYEOF ) # --- Run due tasks --- TASK_COUNT=$(echo "$DUE_TASKS" | python3 -c "import sys,json; print(len(json.load(sys.stdin)))" 2>/dev/null) if [ "$TASK_COUNT" = "0" ]; then echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) | heartbeat | HEARTBEAT_OK (no tasks due)" >> "$LOG_DIR/heartbeat.log" exit 0 fi echo "$DUE_TASKS" | python3 -c " import sys, json, subprocess, time, os tasks = json.load(sys.stdin) state_file = '$STATE_FILE' log_dir = '$LOG_DIR' working_dir = '$WORKING_DIR' max_turns = '$MAX_TURNS' ack_max = int('$ACK_MAX_CHARS') catchup = '$CATCHUP_MODE' == 'true' # Load state try: state = json.load(open(state_file)) except: state = {} for i, task in enumerate(tasks): name = task.get('name', 'unknown') prompt = task.get('prompt', '') if not prompt: continue # Stagger catchup tasks if catchup and i > 0: time.sleep(5) ts = time.strftime('%Y-%m-%dT%H:%M:%SZ', time.gmtime()) print(f'{ts} | heartbeat | RUNNING: {name}') try: result = subprocess.run( ['claude', '-p', prompt, '--output-format', 'text', '--max-turns', str(max_turns)], capture_output=True, text=True, timeout=600, cwd=working_dir ) output = result.stdout.strip() # Suppress short ack responses (OpenClaw ackMaxChars pattern) if len(output) <= ack_max and 'HEARTBEAT_OK' in output: log_line = f'{ts} | heartbeat | {name} | HEARTBEAT_OK (suppressed)' else: log_line = f'{ts} | heartbeat | {name} | completed ({len(output)} chars)' # Save full output with open(os.path.join(log_dir, f'heartbeat-{name}-{time.strftime(\"%Y-%m-%d\")}.log'), 'a') as f: f.write(f'--- {ts} ---\n{output}\n\n') except subprocess.TimeoutExpired: log_line = f'{ts} | heartbeat | {name} | TIMEOUT' except Exception as e: log_line = f'{ts} | heartbeat | {name} | ERROR: {str(e)}' with open(os.path.join(log_dir, 'heartbeat.log'), 'a') as f: f.write(log_line + '\n') # Update state state[name] = {'last_run': time.time()} # Save state with open(state_file, 'w') as f: json.dump(state, f, indent=2) " echo "Heartbeat complete: $TASK_COUNT tasks processed" ``` **`scripts/templates/heartbeat/README.md`**: ```markdown # Heartbeat Scheduling Combines OpenClaw's HEARTBEAT.md task format with Paperclip's interval-based heartbeat model. Designed for Claude Code agents running on a schedule. ## How it works 1. A scheduler (cron/launchd/systemd) runs `heartbeat-runner.sh` at a fixed interval (e.g., every 30 minutes) 2. The runner reads `HEARTBEAT.md` for task definitions 3. **Emptiness detection**: if the file has no real tasks, skip the API call entirely (saves cost — from OpenClaw) 4. For each task: check if it's due based on its interval and last-run time 5. Run due tasks via `claude -p` with the task's prompt 6. Suppress short acknowledgment responses (<300 chars containing HEARTBEAT_OK) 7. Update `.heartbeat-state.json` with last-run timestamps ## Two execution types (from OpenClaw) ### systemEvent Injects a text event into an existing session. Lightweight, no new session. Use for: notifications, status checks, simple updates. Template: `scripts/templates/cron/system-event.sh` ### agentTurn Fires a full agent turn with its own session. Full context, full tool access. Use for: background autonomous work, complex tasks, multi-step operations. Template: `scripts/templates/cron/agent-turn.sh` ## Startup catchup (OpenClaw pattern) When the runner starts after downtime (e.g., machine was off): - Run `heartbeat-runner.sh --catchup` - Processes up to 5 missed tasks - 5-second stagger between tasks (prevents thundering herd) ## Cost optimization - **Emptiness detection**: No API call if HEARTBEAT.md has no real content - **ackMaxChars suppression**: Responses under 300 chars with HEARTBEAT_OK are logged but not displayed (saves downstream processing) - **Interval-based**: Only run tasks when actually due, not every heartbeat ## Example cron entries ```bash # Run heartbeat every 30 minutes */30 * * * * /path/to/heartbeat-runner.sh >> /tmp/heartbeat.log 2>&1 # Run heartbeat every hour with catchup on restart @reboot /path/to/heartbeat-runner.sh --catchup >> /tmp/heartbeat.log 2>&1 0 * * * * /path/to/heartbeat-runner.sh >> /tmp/heartbeat.log 2>&1 ``` ## State file format `.heartbeat-state.json`: ```json { "email-check": { "last_run": 1712847600 }, "report-generation": { "last_run": 1712844000 } } ``` ``` ### Verify ```bash bash -n /Users/ktg/repos/agent-builder/scripts/templates/heartbeat/heartbeat-runner.sh ``` Expected: no syntax errors (exit 0) ### On failure retry — fix bash 3.2 syntax issues, then revert if still failing ### Checkpoint ```bash git commit -m "feat(templates): add heartbeat templates with emptiness detection and catchup" ``` --- ## Step 11: Create proactive agent templates with ADL/VFM ### Files to create **`scripts/templates/proactive/PROACTIVE-AGENT.md`**: ```markdown --- name: {{AGENT_NAME}} description: | A proactive agent that can identify improvements and self-modify within strict guardrails. Uses ADL (Anti-Drift Limits) and VFM (Value-First Modification) scoring to prevent uncontrolled drift. Context: Agent identifies a recurring inefficiency user: "Check for improvements" assistant: "I'll review recent performance data and propose changes via VFM scoring." Proactive improvement cycle triggered by performance review. model: sonnet tools: ["Read", "Write", "Edit", "Glob", "Grep", "Bash"] --- ## How you work You are a proactive agent. You don't just respond to tasks — you observe your environment, identify improvements, and implement changes that pass VFM scoring. ### Proactive cycle 1. **Observe**: Read performance data (feedback/FEEDBACK.md, audit.log, cost-events.jsonl) 2. **Identify**: Find patterns: recurring errors, slow steps, unnecessary work 3. **Score**: Run VFM scoring on each proposed change (see VFM protocol below) 4. **Implement**: Only changes with VFM score > 50. All others logged but not applied. 5. **Log**: Record every decision (implement or defer) with scores and reasoning ### VFM protocol Before making ANY change to your own config, skills, prompts, or behavior: 1. Read `${CLAUDE_PLUGIN_ROOT}/scripts/templates/proactive/VFM-SCORING.md` 2. Score the proposed change across 4 dimensions (0-25 each) 3. If total score > 50: implement and log 4. If total score <= 50: log with reason for deferral, do NOT implement ### Self-healing protocol When encountering errors: 1. Log the error with full context 2. Try approach 1 (most likely fix based on error message) 3. If fail: try approach 2 (alternative strategy) 4. If fail: try approach 3 (simplified version) 5. Continue up to 5 attempts with increasingly conservative approaches 6. After 5 failures: escalate to human with full attempt log ## Rules (ADL — Anti-Drift Limits) Read the full ADL at `${CLAUDE_PLUGIN_ROOT}/scripts/templates/proactive/ADL-RULES.md`. Core constraints: - **No fake intelligence**: Do not simulate capabilities you lack - **No unverifiable modifications**: Every change must be testable - **No novelty over stability**: Prefer proven approaches over clever ones - **No scope expansion without approval**: Stay within your defined boundaries - **No silent failures**: All errors must be logged Priority ordering: Stability > Explainability > Reusability > Scalability > Novelty ## Output format After each proactive cycle, produce: ``` PROACTIVE CYCLE REPORT ====================== Date: [timestamp] Observations: [N] patterns found Proposals: [N] changes evaluated | Proposed change | VFM score | Decision | Reason | |----------------|-----------|----------|--------| | [change 1] | [score] | implement/defer | [why] | ... Implemented: [N] Deferred: [N] Errors handled: [N] (max attempt: [N]) ``` ``` **`scripts/templates/proactive/ADL-RULES.md`**: ```markdown # Anti-Drift Limits (ADL) Guardrails that prevent proactive agents from drifting beyond useful behavior. Inspired by OpenClaw's proactive agent skill. ## Constraints ### 1. No fake intelligence Do not simulate capabilities you do not have. If you cannot access a tool, do not pretend the operation succeeded. If you cannot verify a fact, say so. ### 2. No unverifiable modifications Every change you make must be testable. Before implementing: - Define how to verify the change worked - Run the verification after implementation - Revert if verification fails ### 3. No novelty over stability When choosing between a clever new approach and a proven existing one, choose the proven approach unless VFM scoring strongly favors the new one (score > 75). ### 4. No scope expansion without approval Your boundaries are defined by your agent file and CLAUDE.md. You may optimize within those boundaries. You may NOT: - Add new tools to your own configuration - Modify other agents' files - Change system-level settings - Create new agents or skills ### 5. No silent failures Every error, every failed attempt, every unexpected result must be logged. Write to the daily log (memory/YYYY-MM-DD.md) or a dedicated error log. ## Priority Ordering When constraints conflict, apply this priority: ``` Stability > Explainability > Reusability > Scalability > Novelty ``` A stable system that is hard to understand is better than a novel system that breaks. An explainable system that doesn't scale is better than a scalable system that nobody can debug. ## When to override ADL ADL can be overridden ONLY by explicit human instruction. If the user says "try the new approach even though it's risky," that overrides constraint #3. Log the override with the user's exact instruction. Never self-override. The whole point of ADL is to prevent the agent from convincing itself that an exception is warranted. ``` **`scripts/templates/proactive/VFM-SCORING.md`**: ```markdown # Value-First Modification (VFM) Scoring Scoring rubric for evaluating proposed self-modifications. Any change to agent config, prompts, behavior, or pipeline structure must score > 50 to be implemented. ## Dimensions ### Frequency (0-25 points) How often does the issue this change addresses occur? | Score | Criteria | |-------|----------| | 0-5 | Happened once, may not recur | | 6-10 | Happens occasionally (1-2x per week) | | 11-15 | Happens regularly (daily) | | 16-20 | Happens frequently (multiple times per day) | | 21-25 | Happens on nearly every run | ### Failure Reduction (0-25 points) Does this change fix real failures? | Score | Criteria | |-------|----------| | 0-5 | Cosmetic improvement, no failures prevented | | 6-10 | Prevents occasional warnings or non-critical errors | | 11-15 | Prevents errors that require manual intervention | | 16-20 | Prevents errors that cause pipeline failure | | 21-25 | Prevents errors that cause data loss or system damage | ### Burden Reduction (0-25 points) Does this reduce human effort? | Score | Criteria | |-------|----------| | 0-5 | Saves less than 1 minute per occurrence | | 6-10 | Saves 1-5 minutes per occurrence | | 11-15 | Saves 5-30 minutes per occurrence | | 16-20 | Eliminates a manual step entirely | | 21-25 | Eliminates multiple manual steps or a recurring task | ### Cost Savings (0-25 points) Does this reduce API/compute costs? | Score | Criteria | |-------|----------| | 0-5 | Negligible cost difference | | 6-10 | Saves <10% on affected operations | | 11-15 | Saves 10-25% on affected operations | | 16-20 | Saves 25-50% on affected operations | | 21-25 | Saves >50% or eliminates unnecessary API calls entirely | ## Decision threshold | Total score | Decision | |-------------|----------| | > 50 | **Implement** — change is worth the risk | | 26-50 | **Defer** — log for future consideration | | <= 25 | **Reject** — not worth pursuing | ## Logging format Every VFM evaluation must be logged, whether implemented or not: ``` VFM EVALUATION Date: [timestamp] Proposed change: [description] Scores: Frequency: [score] — [justification] Failure reduction: [score] — [justification] Burden reduction: [score] — [justification] Cost savings: [score] — [justification] Total: [sum]/100 Decision: implement / defer / reject ``` ## Worked examples ### Example 1: Add retry logic to web search (Implement) - Frequency: 18 (search fails ~3x daily due to timeouts) - Failure reduction: 15 (prevents pipeline stall requiring manual restart) - Burden reduction: 16 (eliminates manual re-run) - Cost savings: 8 (slight cost from retry, but saves failed run cost) - **Total: 57 → Implement** ### Example 2: Refactor prompt to use XML tags (Defer) - Frequency: 25 (every run) - Failure reduction: 3 (current format works fine) - Burden reduction: 2 (no human effort saved) - Cost savings: 5 (maybe slightly fewer tokens) - **Total: 35 → Defer** (improvement is real but marginal) ### Example 3: Switch to experimental model (Reject) - Frequency: 25 (every run) - Failure reduction: 0 (current model has no failures) - Burden reduction: 0 (no human effort saved) - Cost savings: 10 (newer model might be cheaper) - **Total: 35 → Defer** (stability > novelty per ADL) ``` **`scripts/templates/proactive/README.md`**: ```markdown # Proactive Agent Pattern A proactive agent observes its environment, identifies improvements, and self-modifies within strict guardrails. This pattern is inspired by OpenClaw's proactive agent skill. ## When to use - Agents that run frequently and should improve over time - Pipelines with measurable performance metrics - Systems where the cost of not improving exceeds the risk of changes ## When NOT to use - Simple pipelines that just need to run reliably - Human-in-the-loop workflows (the human provides the feedback) - New systems that haven't established a performance baseline yet ## Components - **PROACTIVE-AGENT.md**: Agent template with proactive cycle, VFM protocol, self-healing - **ADL-RULES.md**: Anti-Drift Limits — constraints that prevent uncontrolled drift - **VFM-SCORING.md**: Value-First Modification — scoring rubric for proposed changes ## How ADL and VFM work together ADL defines what the agent CANNOT do (hard boundaries). VFM determines what the agent SHOULD do (prioritization within boundaries). ``` Proposed change → Check ADL constraints → BLOCKED if constraint violated → Score with VFM → IMPLEMENT if > 50, DEFER if <= 50 → Log decision either way ``` ## Integration with feedback loops The proactive agent reads from: - `feedback/FEEDBACK.md` — pipeline run outcomes - `budget/cost-events.jsonl` — cost data - `logs/audit.log` — tool call history - `memory/MEMORY.md` — long-term patterns It writes to: - Daily log (decisions and scores) - Its own agent file (when implementing approved changes) - SESSION-STATE.md (current proactive cycle state) ``` ### Verify ```bash ls /Users/ktg/repos/agent-builder/scripts/templates/proactive/ | wc -l ``` Expected: `4` ### On failure revert ### Checkpoint ```bash git commit -m "feat(templates): add proactive agent templates with ADL/VFM guardrails" ``` --- ## Step 12: Create cron templates (agentTurn + systemEvent) ### Files to create **`scripts/templates/cron/agent-turn.sh`**: ```bash #!/bin/bash # Agent Turn: Full background autonomy for Claude Code agents. # Fires a complete agent turn with its own named session. # # Bash 3.2 compatible. Uses python3 for JSON/date operations. # # Placeholders: # {{AGENT_NAME}} - name of the agent # {{WORKING_DIR}} - absolute path to project directory # {{MAX_TURNS}} - max turns per agent turn (default: 15) AGENT_NAME="{{AGENT_NAME}}" WORKING_DIR="{{WORKING_DIR}}" MAX_TURNS="${MAX_TURNS:-15}" SESSION_NAME="agent:${AGENT_NAME}:turn" PID_FILE="$WORKING_DIR/.agent-turn-${AGENT_NAME}.pid" LOG_DIR="$WORKING_DIR/logs" STATE_FILE="$WORKING_DIR/.agent-turn-state.json" mkdir -p "$LOG_DIR" # --- Orphan detection --- # Check if a previous agent turn is still running if [ -f "$PID_FILE" ]; then OLD_PID=$(cat "$PID_FILE" 2>/dev/null) if [ -n "$OLD_PID" ] && kill -0 "$OLD_PID" 2>/dev/null; then echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) | agent-turn | SKIP: previous run still active (PID $OLD_PID)" >> "$LOG_DIR/agent-turn.log" exit 0 else echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) | agent-turn | Cleaned orphan PID file (PID $OLD_PID)" >> "$LOG_DIR/agent-turn.log" rm -f "$PID_FILE" fi fi # --- Session rollover check --- # After 200 turns or 72 hours, start a fresh session NEEDS_FRESH=$(python3 -c " import json, time, os state_file = '$STATE_FILE' agent = '$AGENT_NAME' try: state = json.load(open(state_file)) agent_state = state.get(agent, {}) turns = agent_state.get('turn_count', 0) started = agent_state.get('session_started', 0) age_hours = (time.time() - started) / 3600 if started else 999 if turns >= 200 or age_hours >= 72: print('true') else: print('false') except: print('true') " 2>/dev/null) if [ "$NEEDS_FRESH" = "true" ]; then # Use a new session name with timestamp for rollover SESSION_NAME="agent:${AGENT_NAME}:turn:$(date +%s)" # Reset state python3 -c " import json, time, os state_file = '$STATE_FILE' agent = '$AGENT_NAME' try: state = json.load(open(state_file)) except: state = {} state[agent] = {'turn_count': 0, 'session_started': time.time(), 'session_name': '$SESSION_NAME'} with open(state_file, 'w') as f: json.dump(state, f, indent=2) " fi # --- Load context --- # Build prompt from HEARTBEAT.md and SESSION-STATE.md PROMPT=$(python3 -c " import os working_dir = '$WORKING_DIR' parts = [] # Read SESSION-STATE.md for current context ss = os.path.join(working_dir, 'memory', 'SESSION-STATE.md') if os.path.isdir(os.path.join(working_dir, 'memory')) else os.path.join(working_dir, 'SESSION-STATE.md') if os.path.exists(ss): parts.append('## Current session state:\n' + open(ss).read()[:2000]) # Read HEARTBEAT.md for tasks hb = os.path.join(working_dir, 'HEARTBEAT.md') if os.path.exists(hb): parts.append('## Heartbeat tasks:\n' + open(hb).read()[:2000]) if parts: print('\n\n'.join(parts)) else: print('No context files found. Check the working directory and proceed with any pending tasks.') " 2>/dev/null) # --- Run agent turn --- echo $$ > "$PID_FILE" TIMESTAMP=$(date -u +%Y-%m-%dT%H:%M:%SZ) echo "$TIMESTAMP | agent-turn | START: $AGENT_NAME (session: $SESSION_NAME)" >> "$LOG_DIR/agent-turn.log" cd "$WORKING_DIR" OUTPUT=$(claude -p "$PROMPT" \ --name "$SESSION_NAME" \ --output-format text \ --max-turns "$MAX_TURNS" 2>&1) EXIT_CODE=$? TIMESTAMP=$(date -u +%Y-%m-%dT%H:%M:%SZ) # Save output to dated log echo "--- $TIMESTAMP ---" >> "$LOG_DIR/agent-turn-${AGENT_NAME}-$(date +%Y-%m-%d).log" echo "$OUTPUT" >> "$LOG_DIR/agent-turn-${AGENT_NAME}-$(date +%Y-%m-%d).log" echo "" >> "$LOG_DIR/agent-turn-${AGENT_NAME}-$(date +%Y-%m-%d).log" # Update turn count python3 -c " import json, time state_file = '$STATE_FILE' agent = '$AGENT_NAME' try: state = json.load(open(state_file)) except: state = {} if agent not in state: state[agent] = {'turn_count': 0, 'session_started': time.time(), 'session_name': '$SESSION_NAME'} state[agent]['turn_count'] = state[agent].get('turn_count', 0) + 1 state[agent]['last_run'] = time.time() with open(state_file, 'w') as f: json.dump(state, f, indent=2) " if [ "$EXIT_CODE" -eq 0 ]; then echo "$TIMESTAMP | agent-turn | COMPLETE: $AGENT_NAME (exit $EXIT_CODE)" >> "$LOG_DIR/agent-turn.log" else echo "$TIMESTAMP | agent-turn | ERROR: $AGENT_NAME (exit $EXIT_CODE)" >> "$LOG_DIR/agent-turn.log" fi # Cleanup rm -f "$PID_FILE" ``` **`scripts/templates/cron/system-event.sh`**: ```bash #!/bin/bash # System Event: Inject a text event into an existing Claude Code session. # Lighter than agentTurn — does not create a new session. # # Bash 3.2 compatible. # # Usage: ./system-event.sh "session-name" "Event text to inject" # # Placeholders: # {{WORKING_DIR}} - absolute path to project directory WORKING_DIR="{{WORKING_DIR}}" SESSION_NAME="${1:-}" EVENT_TEXT="${2:-}" if [ -z "$SESSION_NAME" ] || [ -z "$EVENT_TEXT" ]; then echo "Usage: $0 " echo "Example: $0 'agent:researcher:turn' 'New data available in /data/inbox'" exit 1 fi LOG_DIR="$WORKING_DIR/logs" mkdir -p "$LOG_DIR" TIMESTAMP=$(date -u +%Y-%m-%dT%H:%M:%SZ) echo "$TIMESTAMP | system-event | INJECT: $SESSION_NAME — $EVENT_TEXT" >> "$LOG_DIR/system-event.log" cd "$WORKING_DIR" # Resume the named session and inject the event OUTPUT=$(claude --resume "$SESSION_NAME" -p "$EVENT_TEXT" \ --output-format text \ --max-turns 3 2>&1) EXIT_CODE=$? TIMESTAMP=$(date -u +%Y-%m-%dT%H:%M:%SZ) if [ "$EXIT_CODE" -eq 0 ]; then echo "$TIMESTAMP | system-event | DELIVERED: $SESSION_NAME (exit $EXIT_CODE)" >> "$LOG_DIR/system-event.log" else echo "$TIMESTAMP | system-event | FAILED: $SESSION_NAME (exit $EXIT_CODE)" >> "$LOG_DIR/system-event.log" fi ``` **`scripts/templates/cron/README.md`**: ```markdown # Cron Execution Templates Two execution types for scheduled agent work, inspired by OpenClaw's cron service. ## agentTurn (agent-turn.sh) Full background autonomy. Creates/resumes its own named session. **Use for:** - Complex multi-step work - Tasks that need full context and tool access - Background autonomous operation (proactive agent cycles) **Features:** - Named sessions via `--name` for resume capability - Orphan detection (checks PID before starting new run) - Session rollover: fresh session after 200 turns or 72 hours - Context loading from SESSION-STATE.md and HEARTBEAT.md - Dated log files per agent per day **Session naming (VERIFIED):** Uses `--name "agent::turn"` with `--resume` for continuity. Note: `--session-id` requires valid UUIDs. Named sessions are the correct approach for human-readable, resumable agent sessions. ## systemEvent (system-event.sh) Injects text into an existing session. No new session created. **Use for:** - Notifications ("new data available") - Simple status checks - Triggering a specific action in a running session **Limitations:** - Requires an active named session to resume - Limited to 3 turns (quick action, not full autonomy) - If the session doesn't exist, the command fails gracefully ## Scheduling examples ```bash # agentTurn: run full agent turn every hour 0 * * * * /path/to/agent-turn.sh >> /tmp/agent-turn.log 2>&1 # systemEvent: notify agent of new data every 30 min */30 * * * * /path/to/system-event.sh "agent:processor:turn" "Check /data/inbox for new files" # agentTurn with catchup on reboot @reboot /path/to/agent-turn.sh >> /tmp/agent-turn.log 2>&1 ``` ``` ### Verify ```bash bash -n /Users/ktg/repos/agent-builder/scripts/templates/cron/agent-turn.sh && bash -n /Users/ktg/repos/agent-builder/scripts/templates/cron/system-event.sh && echo "VALID" ``` Expected: `VALID` ### On failure retry — fix bash 3.2 syntax issues, then revert if still failing ### Checkpoint ```bash git commit -m "feat(templates): add isolated agentTurn and systemEvent cron templates" ``` --- ## Exit Condition - [ ] `ls /Users/ktg/repos/agent-builder/scripts/templates/memory/ | wc -l` → 4 - [ ] `ls /Users/ktg/repos/agent-builder/scripts/templates/heartbeat/ | wc -l` → 3 (HEARTBEAT.md, heartbeat-runner.sh, README.md) - [ ] `ls /Users/ktg/repos/agent-builder/scripts/templates/proactive/ | wc -l` → 4 - [ ] `ls /Users/ktg/repos/agent-builder/scripts/templates/cron/ | wc -l` → 3 - [ ] `bash -n /Users/ktg/repos/agent-builder/scripts/templates/heartbeat/heartbeat-runner.sh` → exit 0 - [ ] `bash -n /Users/ktg/repos/agent-builder/scripts/templates/cron/agent-turn.sh` → exit 0 - [ ] `bash -n /Users/ktg/repos/agent-builder/scripts/templates/cron/system-event.sh` → exit 0 - [ ] All memory templates contain `{{AGENT_NAME}}` placeholder: `grep -l "AGENT_NAME" /Users/ktg/repos/agent-builder/scripts/templates/memory/*.md | wc -l` → 3 ## Quality Criteria - 3-tier memory matches OpenClaw pattern: SESSION-STATE (hot), daily logs (warm), MEMORY (cold) - WAL protocol instructions included in SESSION-STATE template - Working Buffer protocol with 60% danger zone threshold included - Compaction recovery order documented - heartbeat-runner.sh implements emptiness detection (cost saving) - heartbeat-runner.sh implements startup catchup with stagger - heartbeat-runner.sh suppresses ack responses under 300 chars - All shell scripts pass `bash -n` syntax check - agent-turn.sh uses `--name` (not custom session IDs per verified assumption) - ADL and VFM scoring rubrics have concrete thresholds and worked examples