docs(plans): create session blueprints for Agent Factory execution

8 session blueprints covering all 27 steps across 3 waves:

- Session 1: Foundation (rename + commands, Steps 1-5)
- Session 2: Skills and templates (Steps 6-7)
- Session 3: OpenClaw patterns (memory/heartbeat/proactive/cron, Steps 9-12)
- Session 4: Paperclip patterns (context/goals/budget/governance/org-chart, Steps 14-18)
- Session 5: Self-learning (feedback/optimization, Steps 20-21)
- Session 6: Integration (Docker/transfer/5 more domains, Steps 22-24)
- Session 7: Skill updates (memory/autonomy/orchestration/governance/MCP refs, Steps 13,19,25)
- Session 8: Finalization (build command integration + v1.0, Steps 8,26,27)

Also updates plan assumptions table with verified findings.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
parent 3202818c28
commit 1a776bdeb2
9 changed files with 8885 additions and 8 deletions
757
.claude/plans/blueprints/session-4-paperclip.md
Normal file
@@ -0,0 +1,757 @@
# Session 4: Paperclip Orchestration Patterns

> Steps 14, 15, 16, 17, 18 | Wave 1 | Depends on: none

## Dependencies

Entry condition: none (creates new template directories only, plus 2 new files in heartbeat/)

## Scope Fence

**Touch:**
- `scripts/templates/heartbeat/context-packet.md` (new)
- `scripts/templates/heartbeat/wake-prompt.md` (new)
- `scripts/templates/goals/GOALS.md` (new)
- `scripts/templates/goals/goal-tracker.sh` (new)
- `scripts/templates/goals/README.md` (new)
- `scripts/templates/budget/budget-hook.sh` (new)
- `scripts/templates/budget/BUDGET.md` (new)
- `scripts/templates/budget/budget-report.sh` (new)
- `scripts/templates/budget/README.md` (new)
- `scripts/templates/governance/GOVERNANCE.md` (new)
- `scripts/templates/governance/approval-gate.sh` (new)
- `scripts/templates/governance/README.md` (new)
- `scripts/templates/org-chart/ORG-CHART.md` (new)
- `scripts/templates/org-chart/org-manager.sh` (new)
- `scripts/templates/org-chart/README.md` (new)

**Never touch:**
- `commands/`
- `agents/`
- `skills/`
- `scripts/templates/heartbeat/HEARTBEAT.md` (Session 3)
- `scripts/templates/heartbeat/heartbeat-runner.sh` (Session 3)
- `scripts/templates/heartbeat/README.md` (Session 3)
- `scripts/templates/memory/`
- `scripts/templates/proactive/`
- `scripts/templates/cron/`
- `.claude-plugin/`, `CLAUDE.md`, `README.md`

---

## Step 14: Create heartbeat context injection templates

### Files to create

**`scripts/templates/heartbeat/context-packet.md`** (Paperclip's "Memento Man" pattern):

```markdown
# Context Packet: {{AGENT_NAME}}
Generated: {{TIMESTAMP}}

## Identity
{{AGENT_IDENTITY}}

## Current Goals
{{ACTIVE_GOALS}}

## Memory State
{{MEMORY_SUMMARY}}

## Task Queue
{{PENDING_TASKS}}

## Recent Events (last 24h)
{{RECENT_EVENTS}}

## Wake Reason
{{WAKE_REASON}}

## Budget Status
Spent: {{BUDGET_SPENT}} / {{BUDGET_LIMIT}} ({{BUDGET_PERCENT}}%)
{{BUDGET_WARNING}}
```

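The `{{PLACEHOLDER}}` fields in the context packet have to be filled in before each wake-up. The blueprint does not say how, so the following is only a minimal sketch: `render_template` is a hypothetical helper, and leaving unknown placeholders untouched (so a missing value stays visible in the packet) is an assumption, not something the plan specifies.

```python
import re

# Matches {{NAME}} where NAME is upper-case letters, digits, or underscores.
PLACEHOLDER = re.compile(r"\{\{([A-Z0-9_]+)\}\}")


def render_template(template, values):
    """Replace each {{NAME}} with values[NAME]; leave unknown names as-is."""
    def substitute(match):
        name = match.group(1)
        if name in values:
            return str(values[name])
        return match.group(0)  # hypothetical policy: keep the placeholder

    return PLACEHOLDER.sub(substitute, template)


if __name__ == "__main__":
    snippet = (
        "# Context Packet: {{AGENT_NAME}}\n"
        "Spent: {{BUDGET_SPENT}} / {{BUDGET_LIMIT}} ({{BUDGET_PERCENT}}%)\n"
        "{{BUDGET_WARNING}}"
    )
    print(render_template(snippet, {
        "AGENT_NAME": "researcher",   # illustrative values only
        "BUDGET_SPENT": 120,
        "BUDGET_LIMIT": 500,
        "BUDGET_PERCENT": 24,
    }))
```

In this sketch `{{BUDGET_WARNING}}` is printed unchanged because no value was supplied; a real runner might prefer to substitute an empty string instead.
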
**`scripts/templates/heartbeat/wake-prompt.md`** (prompt template for each heartbeat wakeup):

```markdown
You are {{AGENT_NAME}}. You are waking up for a scheduled heartbeat.

Read the context packet below carefully. It contains everything you
need to know about your current state and pending work.

{{CONTEXT_PACKET}}

Your task for this beat:
{{WAKE_REASON}}

Rules:
- Do NOT infer tasks from prior conversations
- Only work on what is in the context packet and wake reason
- If nothing needs attention, respond with HEARTBEAT_OK
- Update SESSION-STATE.md before finishing
```

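The wake prompt nests the rendered context packet and relies on the `HEARTBEAT_OK` sentinel to signal an idle beat. How a runner consumes that sentinel is not defined in this session, so the sketch below is an assumption: `build_wake_prompt` and `is_idle_beat` are hypothetical names, and the prompt text is abbreviated.

```python
HEARTBEAT_OK = "HEARTBEAT_OK"  # sentinel named in wake-prompt.md


def build_wake_prompt(agent_name, context_packet, wake_reason):
    """Assemble the prompt in the same order as the template (abbreviated)."""
    return (
        f"You are {agent_name}. You are waking up for a scheduled heartbeat.\n\n"
        f"{context_packet}\n\n"
        f"Your task for this beat:\n{wake_reason}\n"
    )


def is_idle_beat(response):
    """True when the agent reported that nothing needed attention."""
    return response.strip() == HEARTBEAT_OK


if __name__ == "__main__":
    prompt = build_wake_prompt("researcher", "# Context Packet: researcher", "scheduled")
    print(prompt)
    print(is_idle_beat("HEARTBEAT_OK\n"))
```

An exact match after trimming whitespace is the strictest reading of the rule; a runner that tolerates extra text around the sentinel would need a looser check.
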
### Verify

```bash
ls /Users/ktg/repos/agent-builder/scripts/templates/heartbeat/ | wc -l
```
Expected: `5` (3 from Session 3 + 2 from this step; `2` if Session 3 has not run yet)

### On failure: revert

### Checkpoint
```bash
git add scripts/templates/heartbeat/context-packet.md scripts/templates/heartbeat/wake-prompt.md
git commit -m "feat(templates): add context injection templates (Paperclip heartbeat pattern)"
```

---

## Step 15: Create goal hierarchy templates

### Files to create

**`scripts/templates/goals/GOALS.md`**:

```markdown
# Goals: {{PROJECT_NAME}}

## Company Goals
- [G1] {{COMPANY_GOAL_1}}
- [G2] {{COMPANY_GOAL_2}}

## Project Goals
- [G1.1] {{PROJECT_GOAL_1}} (parent: G1)
- [G1.2] {{PROJECT_GOAL_2}} (parent: G1)
- [G2.1] {{PROJECT_GOAL_3}} (parent: G2)

## Task Goals
- [G1.1.1] {{TASK_GOAL_1}} (parent: G1.1, owner: {{AGENT_NAME}}, status: active)
- [G1.1.2] {{TASK_GOAL_2}} (parent: G1.1, owner: {{AGENT_NAME}}, status: pending)

## Notes

Goal IDs use hierarchical dot notation. Each goal has:
- ID: unique identifier (e.g., G1.1.1)
- Description: what the goal is
- Parent: which goal this supports (simple parent reference, not recursive)
- Owner: which agent is responsible (task goals only)
- Status: active | pending | complete | blocked
```

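Each goal line in `GOALS.md` carries its metadata in one trailing parenthesised group. As an illustration of that line format, here is a small parser sketch. `parse_goal` is a hypothetical helper, separate from `goal-tracker.sh`; the default status of `active` mirrors what the tracker script assumes when no status is present.

```python
import re

GOAL_LINE = re.compile(r"-\s*\[(?P<id>[^\]\s]+)\]\s+(?P<rest>.+)")
TRAILING_META = re.compile(r"\s*\(([^()]*)\)\s*$")  # "(parent: ..., status: ...)"


def parse_goal(line):
    """Parse one '- [ID] description (key: value, ...)' line, or return None."""
    m = GOAL_LINE.match(line.strip())
    if not m:
        return None
    rest = m.group("rest")
    fields = {}
    meta = TRAILING_META.search(rest)
    if meta:
        for pair in meta.group(1).split(","):
            key, _, value = pair.partition(":")
            fields[key.strip()] = value.strip()
        rest = rest[:meta.start()]
    return {
        "id": m.group("id"),
        "desc": rest.strip(),
        "parent": fields.get("parent"),
        "owner": fields.get("owner"),
        "status": fields.get("status", "active"),
    }


if __name__ == "__main__":
    print(parse_goal("- [G1.1.1] Write docs (parent: G1.1, owner: writer, status: pending)"))
    print(parse_goal("- [G1] Grow revenue"))
```
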
**`scripts/templates/goals/goal-tracker.sh`**:

```bash
#!/bin/bash
# Goal tracker: parse and manage GOALS.md
# Bash 3.2 compatible. Uses python3 for parsing.
#
# Usage:
#   ./goal-tracker.sh                   # Show goal summary
#   ./goal-tracker.sh complete G1.1.1   # Mark goal as complete
#   ./goal-tracker.sh status            # Show status counts
#   ./goal-tracker.sh context           # Generate context for heartbeat injection
#
# Placeholders:
#   {{WORKING_DIR}} - absolute path to project directory

WORKING_DIR="{{WORKING_DIR}}"
GOALS_FILE="$WORKING_DIR/GOALS.md"
ACTION="${1:-summary}"
GOAL_ID="${2:-}"

if [ ! -f "$GOALS_FILE" ]; then
  echo "Error: $GOALS_FILE not found" >&2
  exit 1
fi

case "$ACTION" in
  summary|status)
    # Quoted heredoc: the shell expands nothing inside the Python source.
    # The file path is passed as sys.argv[1] instead of being interpolated.
    python3 - "$GOALS_FILE" << 'PYEOF'
import re
import sys

goals = []
with open(sys.argv[1]) as f:
    for line in f:
        m = re.match(r'-\s*\[(\S+)\]\s+(.+)', line.strip())
        if m:
            gid = m.group(1)
            rest = m.group(2)
            status_m = re.search(r'status:\s*(\w+)', rest)
            parent_m = re.search(r'parent:\s*(\S+)', rest)
            owner_m = re.search(r'owner:\s*(\S+)', rest)
            status = status_m.group(1) if status_m else 'active'
            parent = parent_m.group(1).rstrip(',)') if parent_m else None
            owner = owner_m.group(1).rstrip(',)') if owner_m else None
            # Strip only the trailing "(parent: ..., ...)" metadata group,
            # so parentheses inside the description survive.
            desc = re.sub(r'\s*\([^()]*\)\s*$', '', rest).strip()
            goals.append({'id': gid, 'desc': desc, 'status': status,
                          'parent': parent, 'owner': owner})

# Status counts
counts = {}
for g in goals:
    counts[g['status']] = counts.get(g['status'], 0) + 1

print("Goal Status Summary")
print("=" * 40)
for status, count in sorted(counts.items()):
    print(f"  {status}: {count}")
print(f"  Total: {len(goals)}")

# Check for orphans
all_ids = set(g['id'] for g in goals)
orphans = [g for g in goals if g['parent'] and g['parent'] not in all_ids]
if orphans:
    print("\nOrphaned goals (parent not found):")
    for g in orphans:
        print(f"  [{g['id']}] parent: {g['parent']}")

# Task-level goals (two or more dots in the ID) without owners
unowned = [g for g in goals if g['id'].count('.') >= 2 and not g['owner']]
if unowned:
    print("\nTask goals without owners:")
    for g in unowned:
        print(f"  [{g['id']}] {g['desc']}")
PYEOF
    ;;

  complete)
    if [ -z "$GOAL_ID" ]; then
      echo "Usage: $0 complete <goal-id>" >&2
      exit 1
    fi
    python3 - "$GOALS_FILE" "$GOAL_ID" << 'PYEOF'
import re
import sys

goals_file, goal_id = sys.argv[1], sys.argv[2]
with open(goals_file) as f:
    content = f.read()

# Replace the status of this goal only; [^)\n]* keeps the match on one line
pattern = r'(\[' + re.escape(goal_id) + r'\][^)\n]*status:\s*)\w+'
if re.search(pattern, content):
    content = re.sub(pattern, r'\g<1>complete', content)
    with open(goals_file, 'w') as f:
        f.write(content)
    print(f'Goal {goal_id} marked as complete')
else:
    print(f'Goal {goal_id} not found or has no status field')
    sys.exit(1)
PYEOF
    ;;

  context)
    # Generate a goal summary for heartbeat context injection
    python3 - "$GOALS_FILE" << 'PYEOF'
import re
import sys

goals = []
with open(sys.argv[1]) as f:
    for line in f:
        m = re.match(r'-\s*\[(\S+)\]\s+(.+)', line.strip())
        if m:
            gid = m.group(1)
            rest = m.group(2)
            status_m = re.search(r'status:\s*(\w+)', rest)
            status = status_m.group(1) if status_m else 'active'
            desc = re.sub(r'\s*\([^()]*\)\s*$', '', rest).strip()
            goals.append({'id': gid, 'desc': desc, 'status': status})

active = [g for g in goals if g['status'] == 'active']
if active:
    print("Active goals:")
    for g in active:
        print(f"  [{g['id']}] {g['desc']}")
else:
    print("No active goals.")
PYEOF
    ;;

  *)
    echo "Usage: $0 [summary|complete <id>|status|context]" >&2
    exit 1
    ;;
esac
```

**`scripts/templates/goals/README.md`**:

````markdown
# Goal Hierarchy

File-based goal hierarchy inspired by Paperclip's goal system.

## Design decisions

- **Simple parent_id, not recursive**: Each goal references its parent by ID.
  No recursive traversal at runtime, matching Paperclip's actual implementation
  (not their aspirational docs, which describe "full ancestry").
- **Dot notation for hierarchy**: G1 → G1.1 → G1.1.1 makes the hierarchy visible
  in the ID itself.
- **File-based, not database**: Human-editable, version-controlled, no dependencies.

## Goal levels

| Level | ID pattern | Example | Description |
|-------|-----------|---------|-------------|
| Company | G1, G2 | G1 | Top-level organizational goals |
| Project | G1.1, G1.2 | G1.1 | Goals that support a company goal |
| Task | G1.1.1 | G1.1.1 | Specific tasks assigned to agents |

## Usage

```bash
# View goal summary
./goal-tracker.sh

# Mark a task as complete
./goal-tracker.sh complete G1.1.1

# Generate context for heartbeat injection
./goal-tracker.sh context
```

## Integration with heartbeat

The `context` command produces a summary suitable for injection into
the heartbeat context packet (see `scripts/templates/heartbeat/context-packet.md`).
The heartbeat runner can call `./goal-tracker.sh context` and inject
the output as `{{ACTIVE_GOALS}}`.
````

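Because the dot notation encodes the hierarchy in the ID, a goal's level and its expected parent can be read off the ID without traversing anything. The sketch below illustrates that idea; `goal_level` and `implied_parent` are hypothetical helpers, and treating IDs deeper than three levels as an error is an assumption based on the three-row table above.

```python
LEVELS = ("company", "project", "task")  # depth 0, 1, 2 as in the table above


def goal_level(goal_id):
    """Map G1 -> company, G1.1 -> project, G1.1.1 -> task."""
    depth = goal_id.count(".")
    if depth >= len(LEVELS):
        raise ValueError(f"unsupported goal depth for {goal_id!r}")
    return LEVELS[depth]


def implied_parent(goal_id):
    """Return the ID one level up (G1.1.1 -> G1.1), or None for top-level goals."""
    head, sep, _ = goal_id.rpartition(".")
    return head if sep else None


if __name__ == "__main__":
    for gid in ("G1", "G1.2", "G1.1.1"):
        print(gid, goal_level(gid), implied_parent(gid))
```

Comparing `implied_parent(goal_id)` with the declared `parent:` field would be one way to flag IDs that drift from their stated parent, though the blueprint itself only checks that the parent exists.
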
### Verify

```bash
bash -n /Users/ktg/repos/agent-builder/scripts/templates/goals/goal-tracker.sh && echo "VALID"
```
Expected: `VALID`

### On failure: retry (fix bash syntax), then revert

### Checkpoint
```bash
git add scripts/templates/goals/
git commit -m "feat(templates): add goal hierarchy templates (Paperclip pattern)"
```

---

## Step 16: Create budget tracking templates

### Files to create

**`scripts/templates/budget/BUDGET.md`**:

```markdown
# Budget Policy: {{PROJECT_NAME}}

## Company Budget
- window: {{BUDGET_WINDOW}}
- limit: {{BUDGET_LIMIT_CENTS}} cents
- warn_percent: 80
- hard_stop: true

## Agent Budgets
- {{AGENT_NAME}}: {{AGENT_BUDGET_CENTS}} cents/{{BUDGET_WINDOW}}

## Notification
- on_warn: log
- on_hard_stop: pause

## Notes

Budget enforcement is POST-HOC (checked after each run, not before).
This matches Paperclip's proven approach: check SUM(cost) after run,
pause if exceeded. No pre-run reservation needed.

Cost estimation is intended to use token counts × published pricing
(the bundled hook counts events as a rough proxy until that is wired in).
For accurate cost data, organizations can use the Admin API:
`/v1/organizations/cost_report` (requires Admin API key: sk-ant-admin...).

For headless runs, use `claude -p --max-budget-usd N` as a per-run cap.
```

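The policy fields above (`limit`, `warn_percent`, `hard_stop`) reduce to a three-way decision after each run. A minimal sketch of that decision, assuming a hypothetical `budget_decision` function whose return values mirror the `ok` / `warn` / `hard_stop` strings used by the hook script in this step:

```python
def budget_decision(spent_cents, limit_cents, warn_percent=80, hard_stop=True):
    """Post-hoc check: compare cumulative spend against the policy."""
    if limit_cents <= 0:
        return "ok"  # no usable limit configured, nothing to enforce
    pct = spent_cents / limit_cents * 100
    if pct >= 100 and hard_stop:
        return "hard_stop"
    if pct >= warn_percent:
        return "warn"
    return "ok"


if __name__ == "__main__":
    for spent in (100, 400, 500):
        print(spent, budget_decision(spent, 500))
```

With `hard_stop` disabled, spend at or above the limit only ever produces `warn`, which matches the branch order in the hook.
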
**`scripts/templates/budget/budget-hook.sh`**:

```bash
#!/bin/bash
# PostToolUse hook: Log cost events and enforce budget.
# Bash 3.2 compatible. Uses python3 for JSON parsing.
#
# Follows Paperclip's post-hoc enforcement pattern:
#   1. Log cost event after each tool call
#   2. Check cumulative cost against budget policy
#   3. Warn at soft threshold, pause at hard threshold
#
# Placeholders:
#   {{WORKING_DIR}} - absolute path to project directory

WORKING_DIR="{{WORKING_DIR}}"
BUDGET_DIR="$WORKING_DIR/budget"
COST_LOG="$BUDGET_DIR/cost-events.jsonl"
BUDGET_FILE="$WORKING_DIR/BUDGET.md"
PAUSED_FLAG="$BUDGET_DIR/PAUSED"

mkdir -p "$BUDGET_DIR"

# Read hook input (JSON on stdin)
INPUT=$(cat)

# Log cost event. The payload is piped to python3 on stdin instead of being
# pasted into the Python source, so quotes and backslashes cannot break it.
printf '%s' "$INPUT" | python3 -c '
import json, os, sys, time

try:
    data = json.load(sys.stdin)
except ValueError:
    sys.exit(0)
if not isinstance(data, dict):
    sys.exit(0)

# Estimate cost from token counts if available in tool result.
# This is a rough estimate; actual costs come from the Admin API.
event = {
    "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
    "tool_name": data.get("tool_name", ""),
    "agent": os.environ.get("AGENT_NAME", "unknown"),
    "estimated_tokens": 0,
}

with open(sys.argv[1], "a") as f:
    f.write(json.dumps(event) + "\n")
' "$COST_LOG" 2>/dev/null

# Check budget if BUDGET.md exists
if [ -f "$BUDGET_FILE" ] && [ -f "$COST_LOG" ]; then
  BUDGET_RESULT=$(python3 -c '
import re, sys

budget_file, cost_log, paused_flag = sys.argv[1:4]
try:
    with open(budget_file) as f:
        content = f.read()
    limit_m = re.search(r"limit:\s*(\d+)\s*cents", content)
    if not limit_m:
        print("ok")
        sys.exit(0)
    limit = int(limit_m.group(1))
    warn_m = re.search(r"warn_percent:\s*(\d+)", content)
    warn_pct = int(warn_m.group(1)) if warn_m else 80
    hard_m = re.search(r"hard_stop:\s*(\w+)", content)
    hard_stop = hard_m.group(1).lower() == "true" if hard_m else True

    # Count events as a rough proxy: each event ~ 1 cent (placeholder,
    # customize per model). Actual cost tracking requires the Admin API
    # or token counting from responses.
    with open(cost_log) as f:
        estimated_cents = sum(1 for _ in f)

    pct = (estimated_cents / limit * 100) if limit > 0 else 0
    if pct >= 100 and hard_stop:
        with open(paused_flag, "w") as f:
            f.write(f"Budget exceeded: {estimated_cents}/{limit} cents")
        print("hard_stop")
    elif pct >= warn_pct:
        print("warn")
    else:
        print("ok")
except Exception:
    print("ok")
' "$BUDGET_FILE" "$COST_LOG" "$PAUSED_FLAG" 2>/dev/null)

  if [ "$BUDGET_RESULT" = "hard_stop" ]; then
    echo "BUDGET EXCEEDED: agent paused. Check $PAUSED_FLAG" >&2
  elif [ "$BUDGET_RESULT" = "warn" ]; then
    echo "BUDGET WARNING: approaching limit" >&2
  fi
fi

# Check if agent is paused
if [ -f "$PAUSED_FLAG" ]; then
  echo '{"decision": "block", "reason": "Agent paused: budget exceeded. Remove '"$PAUSED_FLAG"' to resume."}'
  # Repeat the reason on stderr so it is visible alongside the non-zero exit.
  echo "Agent paused: budget exceeded. Remove $PAUSED_FLAG to resume." >&2
  exit 2
fi

exit 0
```

**`scripts/templates/budget/budget-report.sh`**:

```bash
#!/bin/bash
# Budget report: summarize cost events and compare against policy.
# Bash 3.2 compatible. Uses python3 for aggregation.
#
# Usage: ./budget-report.sh
#
# Placeholders:
#   {{WORKING_DIR}} - absolute path to project directory

WORKING_DIR="{{WORKING_DIR}}"
COST_LOG="$WORKING_DIR/budget/cost-events.jsonl"
BUDGET_FILE="$WORKING_DIR/BUDGET.md"
PAUSED_FLAG="$WORKING_DIR/budget/PAUSED"

if [ ! -f "$COST_LOG" ]; then
  echo "No cost events recorded yet."
  exit 0
fi

# Quoted heredoc: paths are passed as arguments, not expanded into the source.
python3 - "$COST_LOG" "$BUDGET_FILE" "$PAUSED_FLAG" << 'PYEOF'
import json, os, re, sys
from collections import defaultdict

cost_log, budget_file, paused_flag = sys.argv[1:4]

# Read events, skipping blank or malformed lines
events = []
with open(cost_log) as f:
    for line in f:
        line = line.strip()
        if line:
            try:
                events.append(json.loads(line))
            except ValueError:
                pass

# Aggregate
by_agent = defaultdict(int)
by_day = defaultdict(int)
by_tool = defaultdict(int)

for e in events:
    agent = e.get('agent', 'unknown')
    day = e.get('timestamp', '')[:10]
    tool = e.get('tool_name', 'unknown')
    by_agent[agent] += 1
    by_day[day] += 1
    by_tool[tool] += 1

print("BUDGET REPORT")
print("=" * 50)
print(f"Total events: {len(events)}")
print()

# Per-agent breakdown
print("By Agent:")
for agent, count in sorted(by_agent.items(), key=lambda x: -x[1]):
    print(f"  {agent}: {count} events")
print()

# Per-day breakdown (last 7 days)
print("By Day (last 7):")
for day, count in sorted(by_day.items())[-7:]:
    print(f"  {day}: {count} events")
print()

# Budget comparison
if os.path.exists(budget_file):
    with open(budget_file) as f:
        content = f.read()
    limit_m = re.search(r'limit:\s*(\d+)\s*cents', content)
    if limit_m:
        limit = int(limit_m.group(1))
        est_cents = len(events)  # rough proxy
        pct = (est_cents / limit * 100) if limit > 0 else 0
        print(f"Budget: ~{est_cents}/{limit} cents ({pct:.0f}%)")

# Paused status
if os.path.exists(paused_flag):
    with open(paused_flag) as f:
        reason = f.read().strip()
    print(f"\n!! AGENT PAUSED: {reason}")
    print(f"   Remove {paused_flag} to resume")
PYEOF
```

**`scripts/templates/budget/README.md`**:

````markdown
# Budget Tracking

Post-hoc budget enforcement inspired by Paperclip's budget system.

## How it works

1. `budget-hook.sh` runs as a PostToolUse hook after every tool call
2. Each call is logged to `budget/cost-events.jsonl`
3. After logging, cumulative cost is compared against `BUDGET.md` policy
4. If soft threshold (default 80%) exceeded: warning to stderr
5. If hard threshold (100%) exceeded and hard_stop=true: creates `budget/PAUSED`
   flag file; while it exists, the hook exits with code 2 on every subsequent tool call

## Why post-hoc, not pre-run?

Paperclip uses the same approach. Pre-run budget reservation requires a
persistent service or lock file coordination. Post-hoc checking is simpler
and robust enough in practice: the worst case is one extra run before pause.

## Cost estimation

The current implementation counts events as a rough proxy for cost. For
accurate cost tracking, you have two options:

1. **Admin API** (org accounts only): Query `/v1/organizations/cost_report`
   with an Admin API key (`sk-ant-admin...`). This gives actual USD costs.
2. **Token estimation**: Parse token counts from Claude's responses and
   multiply by published per-token prices. More accurate than event counting
   but still an estimate.

For headless runs, `claude -p --max-budget-usd N` provides a per-run
budget cap directly in the CLI.

## Integration

Add to `.claude/settings.json`:
```json
{
  "hooks": {
    "PostToolUse": [{
      "matcher": "*",
      "hooks": [{"type": "command", "command": "bash budget/budget-hook.sh"}]
    }]
  }
}
```
````

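The "token estimation" option amounts to simple arithmetic: tokens × price per token, converted to cents. A sketch of that calculation follows. The prices in `PRICE_PER_MTOK` are hypothetical placeholders, not real pricing, and `estimate_cents` is an illustrative name; neither appears in the blueprint.

```python
# Hypothetical prices in USD per million tokens. Replace with the published
# pricing for the model actually in use.
PRICE_PER_MTOK = {"input": 3.00, "output": 15.00}


def estimate_cents(input_tokens, output_tokens, prices=PRICE_PER_MTOK):
    """Estimate the cost of one call in cents from its token counts."""
    usd = (input_tokens * prices["input"]
           + output_tokens * prices["output"]) / 1_000_000
    return round(usd * 100, 2)


if __name__ == "__main__":
    # 200k input tokens and 50k output tokens at the placeholder prices
    print(estimate_cents(200_000, 50_000))
```

A value like this could be stored in the `estimated_tokens` slot's place in each cost event and summed, instead of counting one cent per event.
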
### Verify

```bash
bash -n /Users/ktg/repos/agent-builder/scripts/templates/budget/budget-hook.sh && bash -n /Users/ktg/repos/agent-builder/scripts/templates/budget/budget-report.sh && echo "VALID"
```
Expected: `VALID`

### On failure: retry (fix bash syntax), then revert

### Checkpoint
```bash
git add scripts/templates/budget/
git commit -m "feat(templates): add budget tracking templates (Paperclip pattern)"
```

---

## Step 17: Create governance and approval gate templates

### Files to create

**`scripts/templates/governance/GOVERNANCE.md`**: see plan Step 17 for full content. Key sections:
- Autonomy Levels (0-4 scale from full manual to full autonomy)
- Approval Gates with `{{GATE_NAME}}` and `{{GATE_CONDITION}}` placeholders
- Escalation Rules (budget exceeded, error threshold, unknown tool, scope violation)
- Audit Requirements (tool calls, budget events, approvals, retention)

**`scripts/templates/governance/approval-gate.sh`**: PreToolUse hook implementing approval gates. Bash 3.2 compatible. Key behavior:
1. Reads GOVERNANCE.md for current autonomy level
2. Based on level, auto-approves or requires approval
3. For gated operations: writes request to `governance/pending-approvals.jsonl`
4. Checks `governance/approval-responses.jsonl` for matching response
5. No response within timeout → block (exit 2)

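Steps 3 to 5 of the approval gate describe a request/response handshake over two JSONL files. The sketch below shows one way that handshake could work. It is not the plan's `approval-gate.sh`: the record fields `request_id` and `approved`, the function names, and the polling defaults are all assumptions made for illustration.

```python
import json
import os
import time
import uuid


def request_approval(pending_path, tool_name):
    """Append a request record and return its (hypothetical) request_id."""
    request_id = uuid.uuid4().hex
    with open(pending_path, "a") as f:
        f.write(json.dumps({"request_id": request_id, "tool_name": tool_name}) + "\n")
    return request_id


def find_response(responses_path, request_id):
    """Return True/False for a recorded decision, or None if there is none yet."""
    if not os.path.exists(responses_path):
        return None
    with open(responses_path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                entry = json.loads(line)
            except ValueError:
                continue  # skip malformed lines
            if isinstance(entry, dict) and entry.get("request_id") == request_id:
                return bool(entry.get("approved"))
    return None


def wait_for_approval(responses_path, request_id, timeout_s=30.0, poll_s=1.0):
    """Exit-code style result: 0 = approved, 2 = denied or timed out."""
    deadline = time.monotonic() + timeout_s
    while True:
        decision = find_response(responses_path, request_id)
        if decision is not None:
            return 0 if decision else 2
        if time.monotonic() >= deadline:
            return 2
        time.sleep(poll_s)
```

Treating a timeout the same as a denial follows the "no response within timeout → block" rule above; the timeout length itself would come from GOVERNANCE.md rather than a hard-coded default.
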
**`scripts/templates/governance/README.md`**: explains the governance model, the autonomy levels, and Paperclip's "autonomy is a privilege" philosophy.

### Verify

```bash
bash -n /Users/ktg/repos/agent-builder/scripts/templates/governance/approval-gate.sh && echo "VALID"
```
Expected: `VALID`

### On failure: retry (fix bash syntax), then revert

### Checkpoint
```bash
git add scripts/templates/governance/
git commit -m "feat(templates): add governance and approval gate templates (Paperclip pattern)"
```

---

## Step 18: Create org-chart template

### Files to create

**`scripts/templates/org-chart/ORG-CHART.md`**: Markdown table with columns: Agent, Role, Reports To, Status, Budget. Uses `reportsTo` pattern from Paperclip. Includes delegation rules and human override section.

**`scripts/templates/org-chart/org-manager.sh`**: Bash 3.2 script that:
- Parses ORG-CHART.md table with python3
- Validates: agents exist in `.claude/agents/`, no circular chains
- Can add/remove agents from the chart
- Generates text-based org tree visualization

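Two of the `org-manager.sh` duties, detecting circular `reportsTo` chains and printing a text tree, are easy to get subtly wrong. The sketch below shows one possible approach. The input shape (a dict mapping each agent to its manager, with `None` for top-level agents) and both function names are assumptions; the plan only says the table is parsed with python3.

```python
def find_cycle(reports_to):
    """Return the agents forming a reporting cycle, or None if there is none."""
    for start in reports_to:
        seen = []
        current = start
        while current is not None:
            if current in seen:
                return seen[seen.index(current):]
            seen.append(current)
            current = reports_to.get(current)  # unknown managers end the chain
    return None


def render_tree(reports_to, root=None, depth=0):
    """Return indented lines for everyone reporting (directly or not) to root.

    Call find_cycle first: a cyclic chart would recurse without end here.
    """
    lines = []
    for agent in sorted(a for a, boss in reports_to.items() if boss == root):
        lines.append("  " * depth + agent)
        lines.extend(render_tree(reports_to, agent, depth + 1))
    return lines


if __name__ == "__main__":
    chart = {"ceo": None, "cto": "ceo", "cfo": "ceo", "dev": "cto"}  # sample data
    if find_cycle(chart) is None:
        print("\n".join(render_tree(chart)))
```

Agents whose manager is not listed in the chart are skipped by `render_tree` in this sketch; a real validator would likely report them instead.
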
**`scripts/templates/org-chart/README.md`**: explains the simple `reportsTo` pattern, delegation flows, and cross-team routing.

### Verify

```bash
bash -n /Users/ktg/repos/agent-builder/scripts/templates/org-chart/org-manager.sh && echo "VALID"
```
Expected: `VALID`

### On failure: retry (fix bash syntax), then revert

### Checkpoint
```bash
git add scripts/templates/org-chart/
git commit -m "feat(templates): add org-chart template (Paperclip pattern)"
```

---

## Exit Condition

- [ ] `ls /Users/ktg/repos/agent-builder/scripts/templates/heartbeat/ | wc -l` → 5
- [ ] `ls /Users/ktg/repos/agent-builder/scripts/templates/goals/ | wc -l` → 3
- [ ] `ls /Users/ktg/repos/agent-builder/scripts/templates/budget/ | wc -l` → 4
- [ ] `ls /Users/ktg/repos/agent-builder/scripts/templates/governance/ | wc -l` → 3
- [ ] `ls /Users/ktg/repos/agent-builder/scripts/templates/org-chart/ | wc -l` → 3
- [ ] All shell scripts pass `bash -n`: `find /Users/ktg/repos/agent-builder/scripts/templates/goals /Users/ktg/repos/agent-builder/scripts/templates/budget /Users/ktg/repos/agent-builder/scripts/templates/governance /Users/ktg/repos/agent-builder/scripts/templates/org-chart -name "*.sh" -exec bash -n {} \;` → no errors
- [ ] context-packet.md contains `{{WAKE_REASON}}` placeholder
- [ ] budget-hook.sh contains reference to PAUSED flag file

## Quality Criteria

- Context packet follows Paperclip's "Memento Man" pattern with all sections
- Wake prompt includes rules about not inferring from prior conversations
- Goal hierarchy uses simple parent_id (not recursive) matching Paperclip's actual code
- Budget enforcement is post-hoc matching Paperclip's pattern
- Budget README honestly documents Admin API limitation (org-only, admin key)
- Governance has 5 autonomy levels (0-4) with clear descriptions
- Org chart uses simple reportsTo pattern with human override authority
- All bash scripts are 3.2 compatible