docs(plans): Agent Factory ultraplan + execution guide
27-step plan across 8 sessions in 3 waves for transforming agent-builder into Agent Factory v1.0.0. Includes research briefs, spec, and wave-by-wave execution prompts with scope fences.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
parent
075383990f
commit
7419d4283d
5 changed files with 2294 additions and 0 deletions
346
.claude/plans/execution-guide.md
Normal file
# Agent Factory — Execution Guide

## Overview

The ultraplan (`ultraplan-2026-04-11-agent-factory.md`) has 27 steps across 8 sessions
in 3 waves. This guide provides self-contained prompts for each wave.

**Key principle:** Each session reads its blueprint document (in `blueprints/`) which
contains exact file contents. No interpretation needed — implement what the blueprint specifies.

## Reference Documents

- **Plan:** `.claude/plans/ultraplan-2026-04-11-agent-factory.md`
- **Spec:** `.claude/ultraplan-spec-2026-04-11-agent-factory.md`
- **Research brief:** `.claude/research/ultraresearch-2026-04-11-openclaw-paperclip-agent-frameworks.md`
- **Source code analysis:** `.claude/research/source-code-analysis-2026-04-11.md`
- **Blueprints:** `.claude/plans/blueprints/session-{N}-*.md`

## Execution Order

```
Wave 0: Preparation (blueprints + assumption verification)
        │
Wave 1: S1 ─┬─ S2 ─┬─ S3 ─┬─ S4 ─┬─ S5 ─┬─ S6 (6 parallel)
            │      │      │      │      │      │
Wave 2: ────────────────── S7 ────────────────── (after S3+S4)
                           │
Wave 3: ─────────────────── S8 ───────────────── (after S1+S2+S7)
```
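
The ordering in the diagram can be checked mechanically. The following is a minimal sketch, not part of the plan: it encodes only the session dependencies shown above (Wave 0 is omitted because it is a preparation step) and uses Python's standard `graphlib` to group sessions into batches that may run in parallel. The `DEPENDS_ON` table and the `waves` helper are illustrative names.

```python
from graphlib import TopologicalSorter

# Session -> prerequisite sessions, copied from the execution-order diagram.
DEPENDS_ON = {
    "S1": set(), "S2": set(), "S3": set(),
    "S4": set(), "S5": set(), "S6": set(),
    "S7": {"S3", "S4"},
    "S8": {"S1", "S2", "S7"},
}


def waves(depends_on):
    """Group sessions into batches whose prerequisites are all complete."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sessions with no unmet prerequisites
        batches.append(ready)
        sorter.done(*ready)                 # mark the whole batch as finished
    return batches


print(waves(DEPENDS_ON))
# → [['S1', 'S2', 'S3', 'S4', 'S5', 'S6'], ['S7'], ['S8']]
```

The three batches correspond to Waves 1, 2, and 3.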

---

## Wave 0 — Preparation (CREATE BLUEPRINTS)

Run this FIRST. It creates the detailed blueprints that all other waves depend on.

```
/ultraexecute-local .claude/plans/ultraplan-2026-04-11-agent-factory.md
```

If not using ultraexecute, use this prompt:

```
Agent Factory Wave 0: Create session blueprints.

Context:
- Plan: .claude/plans/ultraplan-2026-04-11-agent-factory.md
- Spec: .claude/ultraplan-spec-2026-04-11-agent-factory.md
- Research: .claude/research/ultraresearch-2026-04-11-openclaw-paperclip-agent-frameworks.md
- Source analysis: .claude/research/source-code-analysis-2026-04-11.md
- Current codebase: 15 files, read ALL of them to understand conventions

Task 1: Verify the 4 assumptions in the plan:
  a) Search for Anthropic billing API docs (WebSearch). Document what exists.
  b) Test `claude --resume` with a custom session ID format. Document behavior.
  c) Check /schedule trigger docs. Document stability.
  d) Confirm Docker approach (Dockerfile + docker-compose.yml).
Update the plan's Assumptions table with findings.

Task 2: Create 8 session blueprint documents in .claude/plans/blueprints/:
- session-1-foundation.md (Steps 1-5)
- session-2-skills-templates.md (Steps 6-7)
- session-3-openclaw.md (Steps 9-12)
- session-4-paperclip.md (Steps 14-18)
- session-5-selflearning.md (Steps 20-21)
- session-6-integration.md (Steps 22-24)
- session-7-skill-updates.md (Steps 13, 19, 25)
- session-8-finalization.md (Steps 8, 26, 27)

Each blueprint MUST contain:
1. EXACT file contents for every new file (copy-paste ready)
2. Precise diff descriptions for files being modified
3. Verify commands that check CONTENT, not just file existence
4. Quality criteria specific to the session
5. Scope fence (what this session may/may not touch)

For exact template content, use:
- Research brief for OpenClaw/Paperclip patterns (3-tier memory, WAL, heartbeat, etc.)
- Source code analysis for implementation details (heartbeat format, budget schema, etc.)
- Existing codebase files for conventions (frontmatter format, placeholder syntax, hook patterns)

All bash scripts must be bash 3.2 compatible. All templates use {{PLACEHOLDER}} syntax.
Python3 for JSON/YAML/date parsing in scripts.

Commit after all blueprints are created:
  git commit -m "docs(plans): create session blueprints for Agent Factory execution"
  git push origin main
```

---

## Wave 1 — Parallel Execution (6 sessions)

Run these 6 sessions in parallel. Each is independent.

### Session 1: Foundation — Rename and Commands

```
Agent Factory Session 1: Foundation — Rename and Commands.

Read these files FIRST:
- Blueprint: .claude/plans/blueprints/session-1-foundation.md
- Plan steps 1-5: .claude/plans/ultraplan-2026-04-11-agent-factory.md

Execute steps 1-5 from the blueprint:
1. Rename plugin from agent-builder to agent-factory (plugin.json, CLAUDE.md, README, commands, skills)
2. Create /agent-factory:deploy command (commands/deploy.md)
3. Create deployment-advisor agent (agents/deployment-advisor.md)
4. Create /agent-factory:evaluate command (commands/evaluate.md)
5. Create /agent-factory:status command (commands/status.md)

SCOPE FENCE:
- Touch: .claude-plugin/plugin.json, CLAUDE.md, README.md, commands/*, agents/deployment-advisor.md
- Touch: skills/agent-system-design/SKILL.md (rename references ONLY)
- NEVER touch: scripts/templates/*, skills/managed-agents/

Implement EXACTLY what the blueprint specifies. Commit after each step.
Run all verify commands. Push when all 5 steps pass.
```

### Session 2: Skills and Initial Templates

```
Agent Factory Session 2: Skills and Domain Templates.

Read these files FIRST:
- Blueprint: .claude/plans/blueprints/session-2-skills-templates.md
- Plan steps 6-7: .claude/plans/ultraplan-2026-04-11-agent-factory.md

Execute steps 6-7:
6. Create managed-agents skill (skills/managed-agents/SKILL.md + references)
7. Create 5 domain templates (content-pipeline, code-review, monitoring, research-synthesis, data-processing)

SCOPE FENCE:
- Touch: skills/managed-agents/*, scripts/templates/domains/*
- NEVER touch: commands/, agents/, .claude-plugin/, scripts/templates/memory/, scripts/templates/heartbeat/

Implement EXACTLY what the blueprint specifies. Commit after each step.
Run all verify commands. Push when done.
```

### Session 3: OpenClaw Memory and Autonomy

```
Agent Factory Session 3: OpenClaw Memory and Autonomy Patterns.

Read these files FIRST:
- Blueprint: .claude/plans/blueprints/session-3-openclaw.md
- Plan steps 9-12: .claude/plans/ultraplan-2026-04-11-agent-factory.md
- Source analysis: .claude/research/source-code-analysis-2026-04-11.md (OpenClaw section)

Execute steps 9-12:
9. Create 3-tier memory templates (SESSION-STATE.md, DAILY-LOG.md, MEMORY.md)
10. Create heartbeat + cron templates (HEARTBEAT.md, heartbeat-runner.sh) with emptiness detection
11. Create proactive agent template with ADL/VFM guardrails
12. Create isolated agentTurn and systemEvent cron templates

SCOPE FENCE:
- Touch: scripts/templates/memory/, scripts/templates/heartbeat/HEARTBEAT.md,
  scripts/templates/heartbeat/heartbeat-runner.sh, scripts/templates/heartbeat/README.md,
  scripts/templates/proactive/, scripts/templates/cron/
- NEVER touch: commands/, agents/, skills/,
  scripts/templates/heartbeat/context-packet.md, scripts/templates/heartbeat/wake-prompt.md

All bash scripts MUST pass `bash -n`. Use python3 for JSON/YAML/date parsing.
Implement EXACTLY what the blueprint specifies. Commit after each step. Push when done.
```

### Session 4: Paperclip Orchestration

```
Agent Factory Session 4: Paperclip Orchestration Patterns.

Read these files FIRST:
- Blueprint: .claude/plans/blueprints/session-4-paperclip.md
- Plan steps 14-18: .claude/plans/ultraplan-2026-04-11-agent-factory.md
- Source analysis: .claude/research/source-code-analysis-2026-04-11.md (Paperclip section)

Execute steps 14-18:
14. Create heartbeat context injection templates (context-packet.md, wake-prompt.md)
15. Create goal hierarchy templates (GOALS.md, goal-tracker.sh)
16. Create budget tracking templates (budget-hook.sh, BUDGET.md, budget-report.sh)
17. Create governance and approval gate templates (GOVERNANCE.md, approval-gate.sh)
18. Create org-chart template (ORG-CHART.md, org-manager.sh)

SCOPE FENCE:
- Touch: scripts/templates/heartbeat/context-packet.md, scripts/templates/heartbeat/wake-prompt.md,
  scripts/templates/goals/, scripts/templates/budget/, scripts/templates/governance/,
  scripts/templates/org-chart/
- NEVER touch: commands/, agents/, skills/,
  scripts/templates/heartbeat/HEARTBEAT.md, scripts/templates/heartbeat/heartbeat-runner.sh

All bash scripts MUST pass `bash -n`. Use python3 for JSON/YAML/date parsing.
Implement EXACTLY what the blueprint specifies. Commit after each step. Push when done.
```

### Session 5: Self-Learning Systems

```
Agent Factory Session 5: Self-Learning Systems.

Read these files FIRST:
- Blueprint: .claude/plans/blueprints/session-5-selflearning.md
- Plan steps 20-21: .claude/plans/ultraplan-2026-04-11-agent-factory.md

Execute steps 20-21:
20. Create feedback loop templates (FEEDBACK.md, feedback-collector.sh, performance-scorer.sh)
21. Create pipeline optimization and self-healing templates (pipeline-optimizer.sh, self-healing.sh)

SCOPE FENCE:
- Touch: scripts/templates/feedback/, scripts/templates/optimization/
- NEVER touch: commands/, agents/, skills/, all other template dirs

All bash scripts MUST pass `bash -n`. Use python3 for JSON/YAML/date parsing.
Implement EXACTLY what the blueprint specifies. Commit after each step. Push when done.
```

### Session 6: Integration — Docker, Transfer, Templates

```
Agent Factory Session 6: Integration — Docker, Transfer, Additional Templates.

Read these files FIRST:
- Blueprint: .claude/plans/blueprints/session-6-integration.md
- Plan steps 22-24: .claude/plans/ultraplan-2026-04-11-agent-factory.md

Execute steps 22-24:
22. Create Docker deployment templates (Dockerfile, docker-compose.yml, docker-entrypoint.sh)
23. Create import/export system (export-system.sh, import-system.sh, MANIFEST.md)
24. Create 5 additional domain templates (customer-support, devops, legal, sales, security)

SCOPE FENCE:
- Touch: scripts/templates/docker/, scripts/templates/transfer/,
  scripts/templates/domains/customer-support.md, devops-automation.md,
  legal-review.md, sales-intelligence.md, security-audit.md,
  scripts/templates/domains/README.md (update only)
- NEVER touch: commands/, agents/, skills/,
  scripts/templates/domains/content-pipeline.md, code-review.md,
  monitoring.md, research-synthesis.md, data-processing.md

All bash scripts MUST pass `bash -n`.
Implement EXACTLY what the blueprint specifies. Commit after each step. Push when done.
```

---

## Wave 2 — Skill Updates (after Wave 1 Sessions 3+4)

### Session 7: Skill References

```
Agent Factory Session 7: Skill Updates and References.

PREREQUISITE: Wave 1 Sessions 3 and 4 must be complete. Verify:
  ls scripts/templates/memory/ && ls scripts/templates/heartbeat/ &&
  ls scripts/templates/goals/ && ls scripts/templates/budget/

Read these files FIRST:
- Blueprint: .claude/plans/blueprints/session-7-skill-updates.md
- Plan steps 13, 19, 25: .claude/plans/ultraplan-2026-04-11-agent-factory.md
- The templates created in Sessions 3+4 (to reference accurately)

Execute steps 13, 19, 25:
13. Update agent-system-design skill with OpenClaw pattern references (memory-patterns.md, autonomy-patterns.md)
19. Update agent-system-design skill with Paperclip pattern references (orchestration-patterns.md, governance-patterns.md)
25. Create MCP integration reference (mcp-integrations.md)

SCOPE FENCE:
- Touch: skills/agent-system-design/SKILL.md, skills/agent-system-design/references/*
- NEVER touch: commands/, agents/, scripts/templates/

Steps 13 and 19 both modify SKILL.md — execute them SEQUENTIALLY.
Implement EXACTLY what the blueprint specifies. Commit after each step. Push when done.
```

---

## Wave 3 — Finalization (after Wave 1 Sessions 1+2 + Wave 2)

### Session 8: Build Command Integration

```
Agent Factory Session 8: Build Command Integration and Finalization.

PREREQUISITE: All Wave 1 + Wave 2 sessions must be complete. Verify:
  ls commands/deploy.md commands/evaluate.md commands/status.md &&
  ls skills/managed-agents/SKILL.md &&
  ls scripts/templates/domains/ | wc -l # should be 11 (10 templates + README)
  ls skills/agent-system-design/references/memory-patterns.md

Read these files FIRST:
- Blueprint: .claude/plans/blueprints/session-8-finalization.md
- Plan steps 8, 26, 27: .claude/plans/ultraplan-2026-04-11-agent-factory.md
- Current state of commands/build.md (to understand what to modify)
- Current state of .claude-plugin/plugin.json

Execute steps 8, 26, 27:
8. Update build command for domain templates and new features (Phase 0 template selection)
26. Update build command to integrate ALL Phase 2-5 features (memory, proactive, governance, goals, org-chart, budget, heartbeat, Docker)
27. Update plugin.json to v1.0.0, rewrite CLAUDE.md and README.md for full Agent Factory

Steps 8 and 26 both modify commands/build.md — execute them SEQUENTIALLY.
Step 27 is the final commit: "feat: Agent Factory v1.0.0"

SCOPE FENCE:
- Touch: commands/build.md, .claude-plugin/plugin.json, CLAUDE.md, README.md
- NEVER touch: scripts/templates/, skills/, agents/

After step 27, run ALL verification commands from the plan's Verification section.
Commit and push. Tag: git tag v1.0.0
```

---

## Post-Execution Verification

After all waves complete, run the full verification suite:

```bash
# All renamed
grep -r "agent-builder" . --include="*.md" --include="*.json" | grep -v ".git/" | grep -v "research/" | grep -v "ultraplan" | wc -l # → 0

# Plugin manifest
python3 -c "import json; d=json.load(open('.claude-plugin/plugin.json')); print(d['name'], d['version'])" # → agent-factory 1.0.0

# All commands
ls commands/ | sort # → build.md deploy.md evaluate.md status.md

# All agents
ls agents/ | sort # → builder.md deployment-advisor.md

# All skills
ls skills/ | sort # → agent-system-design managed-agents

# Template directories
ls scripts/templates/ | sort # → budget cron docker domains feedback goals governance heartbeat memory optimization org-chart proactive transfer + existing files

# Domain templates
ls scripts/templates/domains/*.md | wc -l # → 11

# All bash scripts pass syntax check
find scripts/templates -name "*.sh" -exec bash -n {} \; # → no errors

# All agents have valid frontmatter
find agents -name "*.md" -exec python3 -c "import yaml,sys; yaml.safe_load(open(sys.argv[1]).read().split('---')[1])" {} \; # → no errors
```
1138
.claude/plans/ultraplan-2026-04-11-agent-factory.md
Normal file
File diff suppressed because it is too large
489
.claude/research/source-code-analysis-2026-04-11.md
Normal file
---
type: source-code-analysis
created: 2026-04-11
repos_analyzed: [paperclipai/paperclip, openclaw/openclaw]
purpose: "Implementation-level details for replicating best patterns in Agent Factory"
---

# Source Code Analysis: OpenClaw & Paperclip

Repos were cloned and analyzed at code level on 2026-04-11. This document
captures implementation details NOT available in docs or articles — the actual
patterns, interfaces, and mechanisms worth replicating.

## Critical Corrections (vs. docs/articles)

These are things docs described differently than the code implements:

1. **Canvas/A2UI (OpenClaw) is NOT generative rendering.** It's a static file
   server. Agents write files to a workspace directory, the canvas-host serves
   them over HTTP. No server-side rendering, no UI generation. This is NOT a
   meaningful capability gap for Claude Code.

2. **Goal hierarchy (Paperclip) is a simple adjacency list.** Just a `parent_id`
   FK on the `goals` table. No recursive traversal at runtime — only the directly
   referenced goal is passed to agents in `context_snapshot`. Docs said "full
   ancestry" but that's aspirational, not implemented.

3. **Budget enforcement (Paperclip) is post-hoc, not atomic.** Checked AFTER each
   run via `evaluateCostEvent()`: reads `SUM(cost_cents)`, compares with policy,
   pauses agent if exceeded. No pre-run budget reservation. Robust enough in practice.

4. **OpenClaw has real vector memory.** Not just MEMORY.md files. Uses `sqlite-vec`
   extension for vector search with embedding providers (Gemini, Mistral, Ollama,
   OpenAI, Voyage, Bedrock, local llama). This is significantly more sophisticated
   than file-based memory.

---

## Paperclip Implementation Details

### Heartbeat Scheduler

**File:** `server/src/services/heartbeat.ts` (4534 lines)

Poll-based, not event-driven. `tickTimers()` iterates all agents on each tick:

```typescript
tickTimers: async (now = new Date()) => {
  const allAgents = await db.select().from(agents);
  for (const agent of allAgents) {
    // Skip agents that must not be woken.
    if (["paused", "terminated", "pending_approval"].includes(agent.status)) continue;
    const policy = parseHeartbeatPolicy(agent);
    if (!policy.enabled || policy.intervalSec <= 0) continue;
    const elapsed = now.getTime() - new Date(agent.lastHeartbeatAt ?? agent.createdAt).getTime();
    if (elapsed < policy.intervalSec * 1000) continue;
    await enqueueWakeup(agent.id, { source: "timer" });
  }
}
```

Heartbeat policy from `agent.runtimeConfig.heartbeat`:
- `enabled: boolean`
- `intervalSec: number`
- `wakeOnDemand: boolean`
- `maxConcurrentRuns: 1-10`
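
For a file-based replica of this scheduler, the due check reduces to a small pure function. The sketch below is an illustration under assumptions, not Paperclip's code: the `Agent` dataclass and its field names are hypothetical stand-ins for the `agents` row and the heartbeat policy described above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

# Statuses the scheduler skips, as listed in the tickTimers() excerpt.
SKIPPED_STATUSES = {"paused", "terminated", "pending_approval"}


@dataclass
class Agent:
    # Hypothetical flattened view of an agent row plus its heartbeat policy.
    status: str
    created_at: datetime
    last_heartbeat_at: Optional[datetime]
    heartbeat_enabled: bool
    interval_sec: int


def is_heartbeat_due(agent: Agent, now: datetime) -> bool:
    """Return True when a timer wakeup should be enqueued for this agent."""
    if agent.status in SKIPPED_STATUSES:
        return False
    if not agent.heartbeat_enabled or agent.interval_sec <= 0:
        return False
    # Fall back to the creation time when the agent has never run.
    last = agent.last_heartbeat_at or agent.created_at
    return now - last >= timedelta(seconds=agent.interval_sec)


created = datetime(2026, 4, 11, 12, 0, tzinfo=timezone.utc)
agent = Agent("idle", created, None, True, 300)
print(is_heartbeat_due(agent, created + timedelta(seconds=300)))
```

Keeping the check free of I/O makes it easy to call from a polling loop or a cron-driven shell wrapper.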

4 wakeup triggers: `timer`, `assignment`, `on_demand`, `automation`.

Concurrency control: in-process promise chain per agent (`startLocksByAgent` Map).
Not distributed — single server process only.

### Run Lifecycle

1. `enqueueWakeup()` → insert `heartbeat_runs` (status=queued) + `agent_wakeup_requests`
2. `startNextQueuedRunForAgent()` → check running count vs maxConcurrentRuns
3. `claimQueuedRun()` → `UPDATE heartbeat_runs SET status='running' WHERE status='queued'`
4. `executeRun()` → call `adapter.execute()`, stream output via `onLog`
5. On completion → update runs, runtime state, task sessions, create cost events
6. Orphan detection: `reapOrphanedRuns()` checks PIDs, auto-retries once

### Adapter Interface

**File:** `packages/adapter-utils/src/types.ts`

```typescript
interface ServerAdapterModule {
  type: string;
  execute(ctx: AdapterExecutionContext): Promise<AdapterExecutionResult>;
  testEnvironment(ctx: AdapterEnvironmentTestContext): Promise<AdapterEnvironmentTestResult>;
  listSkills?: (ctx) => Promise<AdapterSkillSnapshot>;
  syncSkills?: (ctx, desiredSkills) => Promise<AdapterSkillSnapshot>;
  sessionCodec?: AdapterSessionCodec;
  sessionManagement?: AdapterSessionManagement;
}
```

10 built-in adapters: `claude_local`, `codex_local`, `cursor_local`, `gemini_local`,
`openclaw_gateway`, `opencode_local`, `pi_local`, `hermes_local`, `process`, `http`.

### Claude Adapter Execution

**File:** `packages/adapters/claude-local/src/server/execute.ts`

Invokes CLI as:
```
claude --print - --output-format stream-json --verbose \
  [--resume <sessionId>] \
  [--dangerously-skip-permissions] \
  [--model <model>] \
  [--max-turns N] \
  [--append-system-prompt-file <file>] \
  [--add-dir <skillsDir>]
```

Prompt composed from: `bootstrapPromptTemplate` (fresh sessions only) + wake payload
+ session handoff note + main `promptTemplate`. Template variables: `{{agent.id}}`,
`{{agent.name}}`, `{{context.wakeReason}}`, etc.
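
The dotted `{{...}}` variables can be substituted with a few lines of standard-library code. This is a minimal sketch of one possible renderer, not Paperclip's implementation; the `render` helper is hypothetical, and leaving unknown placeholders untouched is an assumption, since the source does not say how missing variables are handled.

```python
import re

# Matches {{ dotted.name }} with optional inner whitespace.
_PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][\w.]*)\s*\}\}")


def render(template: str, values: dict) -> str:
    """Substitute dotted placeholders from a nested dict."""
    def lookup(match):
        node = values
        for part in match.group(1).split("."):
            if not isinstance(node, dict) or part not in node:
                return match.group(0)  # assumption: keep unknown placeholders as-is
            node = node[part]
        return str(node)

    return _PLACEHOLDER.sub(lookup, template)


prompt = render(
    "You are agent {{agent.name}}. Wake reason: {{context.wakeReason}}.",
    {"agent": {"id": "a1", "name": "scout"}, "context": {"wakeReason": "timer"}},
)
print(prompt)
```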

### Task Checkout (Atomic Locking)

**File:** `server/src/services/heartbeat.ts` (lines 3756-4010)

Issues have `execution_run_id` column as soft lock. Uses PostgreSQL row-level locking:

```sql
SELECT id FROM issues WHERE id = $1 AND company_id = $2 FOR UPDATE
```

Then conditional update:
```sql
UPDATE issues SET execution_run_id = $claimed_id
WHERE id = $issue_id AND (execution_run_id IS NULL OR execution_run_id = $claimed_id)
```

When same agent has running run → coalesce (merge context).
When different agent → defer (status `deferred_issue_execution`), promoted when original completes.
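
The conditional update is what makes the soft lock safe: a claim succeeds only if the row is unowned or already owned by the same run. The sketch below reproduces that compare-and-set with Python's built-in `sqlite3` so it can run anywhere. It is an illustration under assumptions: SQLite has no `FOR UPDATE`, so the row-lock step is omitted, and the table is reduced to the two columns involved. The `try_checkout` helper is a hypothetical name.

```python
import sqlite3


def try_checkout(conn: sqlite3.Connection, issue_id: str, run_id: str) -> bool:
    """Claim an issue for a run; True if this run now holds the soft lock."""
    cur = conn.execute(
        "UPDATE issues SET execution_run_id = ? "
        "WHERE id = ? AND (execution_run_id IS NULL OR execution_run_id = ?)",
        (run_id, issue_id, run_id),
    )
    conn.commit()
    # One matched row means the claim succeeded (or was already ours).
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE issues (id TEXT PRIMARY KEY, execution_run_id TEXT)")
conn.execute("INSERT INTO issues (id) VALUES ('issue-1')")

print(try_checkout(conn, "issue-1", "run-a"))  # first claim
print(try_checkout(conn, "issue-1", "run-b"))  # competing claim
```

A caller that gets `False` back would take the defer path described above rather than retrying in a loop.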

### Budget Enforcement

**File:** `server/src/services/budgets.ts`

Schema:
```
budget_policies: scope_type (company|agent|project), scope_id, metric (billed_cents),
  window_kind (calendar_month_utc|lifetime), amount (cents), warn_percent (80),
  hard_stop_enabled, notify_enabled
```

Flow after each run:
1. Load active policies for company/agent/project
2. `SELECT SUM(cost_cents) FROM cost_events` filtered by window
3. If >= soft threshold → create `budget_incidents` (type soft)
4. If >= amount AND hard_stop → `pauseScopeForBudget()` → `UPDATE agents SET status='paused'`
   → `cancelBudgetScopeWork()` → SIGTERM → SIGKILL (with graceSec)

Pre-run check: `getInvocationBlock()` only checks `paused` flag, not live budget sum.
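
The post-run decision in steps 3 and 4 can be expressed as a pure function, which is convenient for a `budget-hook.sh` style template that shells out to python3. This is a sketch under assumptions: it treats the soft threshold as `amount * warn_percent / 100`, which the schema suggests but the source does not state outright, and the `BudgetPolicy` and `evaluate_budget` names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BudgetPolicy:
    amount_cents: int
    warn_percent: int = 80          # default taken from the schema summary
    hard_stop_enabled: bool = True


def evaluate_budget(spent_cents: int, policy: BudgetPolicy) -> str:
    """Classify spend after a run as 'ok', 'soft' (warn), or 'hard' (pause)."""
    if policy.hard_stop_enabled and spent_cents >= policy.amount_cents:
        return "hard"
    # Assumption: the soft threshold is a percentage of the budget amount.
    soft_threshold = policy.amount_cents * policy.warn_percent / 100
    if spent_cents >= soft_threshold:
        return "soft"
    return "ok"


policy = BudgetPolicy(amount_cents=10_000)
for spent in (7_999, 8_000, 10_000):
    print(spent, evaluate_budget(spent, policy))
```

Because the check runs after the fact, a single expensive run can still overshoot the budget before the agent is paused, matching the post-hoc behavior described above.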

### Skills System

Skills injected as symlinked tmpdir per run:
```typescript
async function buildSkillsDir(config) {
  const tmp = await fs.mkdtemp(path.join(os.tmpdir(), "paperclip-skills-"));
  const target = path.join(tmp, ".claude", "skills");
  await fs.mkdir(target, { recursive: true });
  for (const entry of availableEntries) {
    if (!desiredNames.has(entry.key)) continue;
    await fs.symlink(entry.source, path.join(target, entry.runtimeName));
  }
  return tmp; // Passed as: claude --add-dir <skillsDir>
}
```

Company skills stored in DB: `company_skills` table with `markdown` content,
`source_type` (github|url|local_path|skills_sh), `file_inventory`, `trust_level`.

### Session Persistence

`agent_task_sessions` table: unique on `(companyId, agentId, adapterType, taskKey)`.
- taskKey = issueId (for issue-scoped) or `"__heartbeat__"` (timer-only)
- sessionParamsJson = adapter-specific (Claude stores `{ sessionId, cwd }`)
- Upserted after each run completion

Session compaction: rotate after 200 runs, 2M raw input tokens, or 72h age.
Claude adapter: `nativeContextManagement: "confirmed"` → compaction disabled
(Claude manages its own context window).
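
The rotation rule is simple enough to state as one predicate. The following sketch is illustrative only: the function and parameter names are hypothetical, and treating each limit as inclusive (`>=`) is an assumption, since the source gives the thresholds but not the comparison operator.

```python
def should_rotate_session(
    runs: int,
    raw_input_tokens: int,
    age_hours: float,
    native_context_management: str = "",
) -> bool:
    """Decide whether a task session should be rotated to a fresh one."""
    # Adapters that manage their own context window opt out of compaction.
    if native_context_management == "confirmed":
        return False
    return runs >= 200 or raw_input_tokens >= 2_000_000 or age_hours >= 72


print(should_rotate_session(runs=10, raw_input_tokens=2_500_000, age_hours=1))
print(should_rotate_session(200, 0, 0, native_context_management="confirmed"))
```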

### Org Chart

Just `agents.reportsTo` self-referential FK. `agents.role` text field.
Rendered as SVG server-side (5 visual styles). No separate table.

### Database Schema

PostgreSQL via Drizzle ORM. 55 migrations. Key tables:
- `companies` — tenant root, status, budget
- `agents` — adapter_type, adapter_config (jsonb), runtime_config (jsonb), reports_to, status, budget
- `goals` — self-referencing parent_id, level (company/project/task), owner_agent_id
- `issues` — FK to goals/projects/agents, execution_run_id (soft lock), parent_id
- `heartbeat_runs` — status, context_snapshot (jsonb), session_id, process_pid, usage_json
- `agent_wakeup_requests` — wake queue with status enum
- `agent_task_sessions` — per-(agent, adapter, taskKey) session state
- `budget_policies` / `budget_incidents` / `cost_events` — cost control
- `company_skills` — skill definitions with markdown content
- `approvals` — human approval requests
- `routines` — scheduled workflows with cron expressions

### Agent Configuration Format

```json
{
  "adapterConfig": {
    "command": "claude",
    "model": "claude-opus-4-5",
    "cwd": "/path/to/project",
    "promptTemplate": "You are agent {{agent.name}}...",
    "instructionsFilePath": "/path/to/AGENTS.md",
    "dangerouslySkipPermissions": true,
    "maxTurnsPerRun": 0,
    "timeoutSec": 0,
    "graceSec": 20,
    "skills": ["paperclipai/paperclip/mcp-server"]
  },
  "runtimeConfig": {
    "heartbeat": {
      "enabled": true,
      "intervalSec": 300,
      "wakeOnDemand": true,
      "maxConcurrentRuns": 1
    }
  }
}
```

---

## OpenClaw Implementation Details

### Gateway

**File:** `src/gateway/server.impl.ts`

WebSocket server on port 18789. Flat dispatch table:
```typescript
const coreGatewayHandlers: Record<string, GatewayRequestHandler> = {
  ...connectHandlers, ...chatHandlers, ...cronHandlers,
  ...skillsHandlers, ...sessionsHandlers, ...agentHandlers,
  ...channelsHandlers, ...modelsHandlers, // 28 handler groups
}
```

Auth: roles (`operator` | `node`), operator scopes
(`admin`, `read`, `write`, `approvals`, `pairing`).

### Skills System

**Files:** `src/agents/skills/workspace.ts`, `skill-contract.ts`

Skill = directory with SKILL.md. Frontmatter parsed for metadata.

Loading limits:
- Max 300 candidates per root
- Max 200 loaded per source
- Max 150 in prompt
- Max 30,000 chars in prompt
- Max 256 KB per skill file

Prompt format (XML):
```xml
<available_skills>
  <skill>
    <name>github</name>
    <description>...</description>
    <location>~/.openclaw/workspace/skills/github/SKILL.md</location>
  </skill>
</available_skills>
```

Path compaction: home dir → `~` (saves 5-6 tokens per path).
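
Path compaction is a prefix rewrite. The sketch below shows one way to do it for POSIX-style paths; the `compact_home` helper is a hypothetical name, and the explicit separator check (so that `/home/ana` does not match `/home/anabel`) is a design choice of this sketch rather than something taken from the OpenClaw source.

```python
def compact_home(path: str, home: str) -> str:
    """Replace a leading home directory with '~' in a POSIX-style path."""
    home = home.rstrip("/")
    if path == home:
        return "~"
    # Require a separator after the prefix so sibling directories do not match.
    if path.startswith(home + "/"):
        return "~" + path[len(home):]
    return path


print(compact_home("/home/ana/.openclaw/workspace/skills/github/SKILL.md", "/home/ana"))
# → ~/.openclaw/workspace/skills/github/SKILL.md
```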

Skill metadata fields: `always`, `skillKey`, `emoji`, `homepage`, `os`,
`requires` (bins, anyBins, env, config), `install` specs (brew, node, go, uv, download).

ClawHub integration for remote skill registry (search, install, update).

### Memory System

**Files:** `packages/memory-host-sdk/`

Two backends:
- `builtin` — SQLite + sqlite-vec extension for vector search
- `qmd` — External QuickMemory Daemon process

Embedding providers: Gemini, Mistral, Ollama, OpenAI, Voyage, Bedrock, local (node-llama).

Interface:
```typescript
interface MemorySearchManager {
  search(query, opts?: { maxResults?, minScore?, sessionKey? }): Promise<MemorySearchResult[]>
  readFile(params): Promise<{ text, path }>
  status(): MemoryProviderStatus
  sync?(params?): Promise<void>
}
```

Session transcripts indexable into memory backend. MEMORY.md / memory.md as default
memory file convention.

### HEARTBEAT Mechanism

**File:** `src/auto-reply/heartbeat.ts`

Default prompt:
```
"Read HEARTBEAT.md if it exists. Follow it strictly. Do not infer or repeat
old tasks from prior chats. If nothing needs attention, reply HEARTBEAT_OK."
```

HEARTBEAT.md task format:
```yaml
tasks:
  - name: email-check
    interval: 30m
    prompt: "Check for urgent unread emails"
```

Key functions:
- `isHeartbeatContentEffectivelyEmpty()` — skips API calls when file has only
  headers/empty items. Saves significant cost.
- `parseHeartbeatTasks()` — parses YAML tasks block
- `isTaskDue()` — checks intervals against last-run timestamps
- `stripHeartbeatToken()` — strips HEARTBEAT_OK from responses; responses
  under `ackMaxChars` (300) suppressed from chat
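
Two of these checks are easy to approximate in a `heartbeat-runner.sh` template that delegates parsing to python3. The sketch below is an approximation under assumptions, not a port of the OpenClaw functions: the exact definition of "effectively empty" (here: blank lines, `#` lines, and bare bullets or checkboxes) and the interval grammar (`<number><s|m|h|d>`) are guesses based on the descriptions and the `30m` example above. All function names are hypothetical.

```python
import re
from typing import Optional

_INTERVAL = re.compile(r"^(\d+)([smhd])$")
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def is_effectively_empty(text: str) -> bool:
    """True when the file holds only headers, blank lines, or empty list items."""
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if re.fullmatch(r"[-*+](\s*\[[ xX]?\])?", line):
            continue  # bare bullet or checkbox with no text
        return False
    return True


def parse_interval(value: str) -> int:
    """Convert an interval such as '30m' into seconds."""
    match = _INTERVAL.match(value.strip())
    if not match:
        raise ValueError(f"unsupported interval: {value!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def is_task_due(interval: str, last_run_ts: Optional[float], now_ts: float) -> bool:
    """A task is due if it never ran or its interval has fully elapsed."""
    if last_run_ts is None:
        return True
    return now_ts - last_run_ts >= parse_interval(interval)


print(is_effectively_empty("# Heartbeat\n\n- \n- [ ]\n"))
print(is_task_due("30m", last_run_ts=0, now_ts=1800))
```

Running the emptiness check before invoking the model is what avoids paying for heartbeats that have nothing to do.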

**HeartbeatRunner** (`infra/heartbeat-runner.ts`):
- Per-agent intervals (default 30m)
- `HeartbeatAgentState` tracks lastRunMs, nextDueMs, intervalMs
- On fire: reads HEARTBEAT.md, builds prompt, dispatches inbound message

### Cron Service

**File:** `src/cron/service.ts`

Three schedule types:
- `{ kind: "at"; at: string }` — one-shot
- `{ kind: "every"; everyMs: number }` — interval
- `{ kind: "cron"; expr: string; tz?: string; staggerMs?: number }` — cron expression

Two job payload types:
- `systemEvent` — injects text into existing session (needs attention available)
- `agentTurn` — fires full agent turn (true background autonomy)

Session targets: `"main" | "isolated" | "current" | "session:<id>"`.
Isolated gets own session key with freshness/rollover logic.

Startup catchup: runs up to 5 missed jobs immediately, staggers rest (5s gap).
Failure alerts after N consecutive errors, 1h cooldown.
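
The startup catchup rule can be sketched as a small planner that assigns each missed job a start delay. This is an illustration only: `plan_catchup` is a hypothetical helper, and the assumption that the first staggered job waits one full gap (rather than zero) is not confirmed by the source.

```python
def plan_catchup(missed_job_ids, immediate=5, gap_sec=5.0):
    """Return (job_id, delay_sec) pairs for jobs missed while the service was down."""
    plan = []
    for index, job_id in enumerate(missed_job_ids):
        if index < immediate:
            plan.append((job_id, 0.0))  # run right away
        else:
            # Assumption: staggered jobs start one gap apart, after the first batch.
            plan.append((job_id, (index - immediate + 1) * gap_sec))
    return plan


for job_id, delay in plan_catchup(["a", "b", "c", "d", "e", "f", "g"]):
    print(job_id, delay)
```

Staggering the remainder keeps a long outage from turning into a burst of simultaneous agent turns on restart.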

### Multi-Agent Routing

Session key format: `agent:<agentId>:<key>`
Type detection via: `isCronSessionKey()`, `isSubagentSessionKey()`, `isAcpSessionKey()`

Per-agent isolation: own workspace, session store, skill set, heartbeat config, model config.

Subagent spawning: ACP-based, session depth tracked in keys, reactivation support.

### Channel Adapter Interface

**File:** `src/channels/plugins/types.plugin.ts`

```typescript
type ChannelPlugin<ResolvedAccount> = {
  id: ChannelId;
  meta: ChannelMeta;
  capabilities: ChannelCapabilities;
  outbound?: ChannelOutboundAdapter;
  messaging?: ChannelMessagingAdapter;
  lifecycle?: ChannelLifecycleAdapter;
  heartbeat?: ChannelHeartbeatAdapter;
  security?: ChannelSecurityAdapter;
  agentTools?: ChannelAgentToolFactory;
  streaming?: ChannelStreamingAdapter;
  threading?: ChannelThreadingAdapter;
  // ~15 optional adapter slots total
}
```

Restart policy: exponential backoff (5s initial, 5min max, factor 2, jitter 0.1,
max 10 attempts).
|
||||
|
||||
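The restart policy above can be expressed as a small delay calculator. The numeric defaults come from the analysis; the `RestartPolicy` shape, the function name, and the reading of "jitter 0.1" as a symmetric spread of up to 10% around the base delay are assumptions.

```typescript
// Hypothetical sketch of the restart backoff; only the numbers come from the source.

interface RestartPolicy {
  initialMs: number;
  maxMs: number;
  factor: number;
  jitter: number;
  maxAttempts: number;
}

const DEFAULT_RESTART_POLICY: RestartPolicy = {
  initialMs: 5000, // 5s initial delay
  maxMs: 300000, // 5min cap
  factor: 2,
  jitter: 0.1,
  maxAttempts: 10,
};

/**
 * Delay before restart attempt `attempt` (1-based), or null once attempts
 * are exhausted. `random` is injectable so the result can be made deterministic.
 */
function restartDelayMs(
  attempt: number,
  policy: RestartPolicy = DEFAULT_RESTART_POLICY,
  random: () => number = Math.random
): number | null {
  if (attempt < 1 || attempt > policy.maxAttempts) return null;
  const base = Math.min(
    policy.initialMs * Math.pow(policy.factor, attempt - 1),
    policy.maxMs
  );
  const spread = base * policy.jitter * (random() * 2 - 1); // assumed symmetric jitter
  return Math.round(base + spread);
}
```

Under these assumptions the base delay doubles from 5 s and reaches the 5 min cap on the seventh attempt, and an eleventh attempt is refused.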
### Security

- Exec approval: `ExecApprovalManager` with promise-based flow, `allow-once` vs
  `allow-always`, 15s grace timeout
- Tool policy: `pickSandboxToolPolicy()` per sandbox config
- Security audit: comprehensive checks (gateway auth, channel config, plugin trust,
  exec surfaces, filesystem ACLs)
- Auth rate limiting with browser-specific stricter limits
- External content guard: tracks provenance, `allowUnsafeExternalContent` flag

### Agent Configuration

```yaml
agents:
  defaults:
    model:
      primary: "anthropic/claude-opus-4-5"
      fallbacks: ["anthropic/claude-sonnet-4-5"]
    heartbeat:
      enabled: true
      every: "30m"
      prompt: "Check HEARTBEAT.md"
      ackMaxChars: 300
    skills:
      limits:
        maxSkillsInPrompt: 150
        maxSkillsPromptChars: 30000
  list:
    - id: "myagent"
      workspace: "~/workspace"
      model:
        primary: "anthropic/claude-sonnet-4-5"
      heartbeat:
        every: "1h"
      skills:
        filter: ["github", "slack"]
```

Workspace files: AGENTS.md, SOUL.md, TOOLS.md, IDENTITY.md, USER.md,
HEARTBEAT.md, BOOTSTRAP.md, MEMORY.md.

### Plugin Hooks

29 lifecycle hook points, including:
`before_model_resolve`, `before_prompt_build`, `before_agent_start`,
`before_agent_reply`, `llm_input`, `llm_output`, `agent_end`,
`inbound_claim`, `message_received`, `message_sending`, `message_sent`,
`before_tool_call`, `after_tool_call`, `session_start`, `session_end`,
`subagent_spawning`, `subagent_delivery_target`, `gateway_start`,
`gateway_stop`, `before_dispatch`, `reply_dispatch`, `before_install`, etc.

---

## Patterns Worth Replicating in Agent Factory

### From Paperclip

1. **Heartbeat as context injection** — Each beat starts clean, loads curated context
   packet. Maps to: `/schedule` trigger + CLAUDE.md + memory files loaded per session.

2. **Adapter interface** — Clean `execute(ctx)` pattern. Maps to: our agent files
   are already adapter-like (model, tools, prompt per agent).

3. **Budget as governance primitive** — Post-hoc cost tracking with pause thresholds.
   Maps to: hook that reads `/usage` after each run, logs to cost-events file,
   alerts when threshold crossed.

4. **Task checkout via file locking** — Paperclip uses PostgreSQL. We can use
   file-based locking (write `task.lock` with agent name, check before claiming).

5. **Session persistence via taskKey** — Different tasks get different sessions.
   Maps to: `--resume` with task-specific session IDs.

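The post-hoc budget check from item 3 can be sketched as a pure function over logged cost events. The `CostEvent` shape, the three status values, and the 80% alert ratio are all assumptions for illustration; neither Paperclip's schema nor a cost-events file format is specified in the analysis.

```typescript
// Hypothetical sketch of post-hoc budget governance; all names are illustrative.

interface CostEvent {
  agent: string;
  costUsd: number;
}

type BudgetStatus = "ok" | "alert" | "pause";

/**
 * Sums an agent's logged costs and compares them with its limit.
 * "alert" fires at `alertRatio` of the limit (assumed 80%), "pause" at the limit.
 */
function budgetStatus(
  events: CostEvent[],
  agent: string,
  limitUsd: number,
  alertRatio: number = 0.8
): BudgetStatus {
  const spent = events
    .filter((e) => e.agent === agent)
    .reduce((sum, e) => sum + e.costUsd, 0);
  if (spent >= limitUsd) return "pause";
  if (spent >= limitUsd * alertRatio) return "alert";
  return "ok";
}
```

Because the check runs after each session rather than during it, a single expensive run can still overshoot the limit; the pause only prevents the next run.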
### From OpenClaw

6. **HEARTBEAT.md with task parsing** — YAML tasks block with intervals and
   due-time checking. Maps directly to our generated HEARTBEAT.md files.

7. **Emptiness detection** — Skip API calls when heartbeat file is effectively empty.
   Critical cost saver. Include in generated heartbeat scripts.

8. **Skill prompt XML format** — Standardized skill discovery in system prompt.
   Our skills already use this via Claude Code's built-in mechanism.

9. **3-tier memory** — SESSION-STATE.md (hot) + daily logs (warm) + MEMORY.md (cold).
   Maps to: templates we generate in the user's project.

10. **Startup catchup with stagger** — Run missed jobs on restart, but don't
    thundering-herd. Include in generated automation scripts.

### Unique to Agent Factory

11. **Guided construction** — Neither tool helps you BUILD the system. We do.
12. **Progressive complexity** — Start with 1 agent, grow to full org.
13. **Domain templates** — Not just researcher→writer→reviewer. Monitoring,
    code review, data processing, research synthesis.
14. **Claude Code-native** — No PostgreSQL, no Node.js server, no Docker required.
    Just agents, skills, hooks, settings.json, /schedule.

@ -0,0 +1,215 @@
---
type: ultraresearch-brief
created: 2026-04-11
question: "Research OpenClaw and Paperclip agent frameworks to find inspiration and concrete value proposition for agent-builder plugin"
confidence: 0.92
dimensions: 7
mcp_servers_used: []
local_agents_used: [Explore]
external_agents_used: [WebFetch, WebSearch]
source_code_analyzed: [paperclip, openclaw]
target_audience: "Claude Code users who know the primitives but need help composing agent systems"
---

# OpenClaw & Paperclip Agent Framework Research

> Generated by ultraresearch-local on 2026-04-11

## Research Question

What features, architecture patterns, and capabilities do OpenClaw and Paperclip offer, and what can we learn from them to create a Claude Code plugin that makes it easy for anyone to build genuinely useful, self-running agent systems?

## Executive Summary

OpenClaw (354k stars) excels at individual agent capability — 23+ messaging channels, 5400+ skills, proactive agent patterns with self-improvement guardrails, and 3-tier memory systems. Paperclip (51k stars) excels at organizational coordination — heartbeat scheduling, goal hierarchies, budget enforcement, and governance. Neither offers guided, agent-assisted construction of complete agent systems, which is the unique gap our plugin fills. Confidence is high for OpenClaw (verified via docs, GitHub, and existing codebase references) and medium for Paperclip (verified via docs site, GitHub, and multiple third-party articles).

## Dimensions

### 1. Core Capabilities -- Confidence: high

**OpenClaw:**
- Personal AI assistant running on your own devices
- 23+ messaging channels (WhatsApp, Telegram, Slack, Discord, Signal, iMessage, IRC, Teams, Matrix, and more)
- 100+ preconfigured AgentSkills for shell, file, and web automation
- Canvas/A2UI — agent-driven visual workspace (unique capability, no Claude Code equivalent)
- Browser control via dedicated Chrome/Chromium with CDP
- Voice capabilities with wake words (macOS/iOS) and continuous voice (Android)
- Device node system (camera, screen recording, location, notifications)
- Model-agnostic: Claude, GPT, Gemini, Ollama all supported
- Source: [GitHub](https://github.com/openclaw/openclaw), [DigitalOcean guide](https://www.digitalocean.com/resources/articles/what-is-openclaw)

**Paperclip:**
- Orchestration platform for teams of AI agents ("If OpenClaw is an employee, Paperclip is the company")
- Agent-agnostic: supports OpenClaw, Claude Code, Codex, Cursor, bash, HTTP webhooks
- Explicitly NOT an agent framework — doesn't build agents, organizes them
- Explicitly NOT a chatbot, workflow builder, or prompt manager
- Ticket-based task management with threaded conversations
- Multi-company support with complete data isolation
- Source: [GitHub](https://github.com/paperclipai/paperclip), [paperclip.ing](https://paperclip.ing/docs)

### 2. Architecture & Patterns -- Confidence: high

**OpenClaw architecture:**
- Gateway control plane on ws://127.0.0.1:18789
- Channel Adapters transform protocol-specific input into unified message objects
- Multi-agent routing: isolated sessions per agent, workspace, or sender
- Pi agent runtime in RPC mode with tool/block streaming
- Node.js + TypeScript, pnpm, WebSocket protocol
- Source: [GitHub README](https://github.com/openclaw/openclaw), [Medium architecture article](https://bibek-poudel.medium.com/how-openclaw-works-understanding-ai-agents-through-a-real-architecture-5d59cc7a4764)

**Paperclip architecture:**
- Node.js backend + React UI + PostgreSQL
- Company-as-runtime model: agents modeled as employees
- Heartbeat scheduler fires agent execution at defined intervals
- Each beat is stateless — state lives in external storage (Postgres)
- Atomic operations for task checkout and budget enforcement
- Source: [GitHub](https://github.com/paperclipai/paperclip), [Towards AI article](https://pub.towardsai.net/paperclip-the-open-source-operating-system-for-zero-human-companies-2c16f3f22182)

### 3. Self-Learning & Autonomy -- Confidence: medium

**OpenClaw — Proactive Agent Skill (most sophisticated pattern found):**
- 3-tier memory: SESSION-STATE.md (working memory), memory/YYYY-MM-DD.md (daily capture), MEMORY.md (curated long-term)
- WAL Protocol (Write-Ahead Logging): write important details BEFORE responding
- Working Buffer Protocol: captures exchanges in "danger zone" (60%+ context) before compaction
- Compaction Recovery: reads buffer, session state, daily notes, then searches all sources
- Self-improvement guardrails:
  - ADL (Anti-Drift Limits): no fake intelligence, no unverifiable mods, no novelty over stability
  - VFM (Value-First Modification): score changes on frequency, failure reduction, burden reduction, cost savings. Only implement if score > 50
  - Priority: Stability > Explainability > Reusability > Scalability > Novelty
- Two cron types: `systemEvent` (needs attention) vs `isolated agentTurn` (true background autonomy)
- Self-healing: try 5-10 approaches before asking for help
- Source: [Proactive Agent Skill on GitHub](https://github.com/openclaw/skills/blob/main/skills/halthelobster/proactive-agent/SKILL.md)

**Paperclip:**
- Heartbeat model with context injection (Memento Man mental model)
- Memory doesn't live in agent session — external storage maintains continuity
- Context packets: curated payloads with memory state, task queue, recent events, agent config
- No explicit self-learning mechanism documented, but rich audit trail enables pattern detection
- Skills as markdown instruction files, installable via GitHub URLs
- Source: [MindStudio heartbeat article](https://www.mindstudio.ai/blog/heartbeat-pattern-paperclip-ai-agents-24-7)

### 4. User Experience & Onboarding -- Confidence: high

**OpenClaw:**
- `npm install -g openclaw@latest && openclaw onboard --install-daemon`
- Requires Node 24 (recommended) or 22.16+
- Has "Cowork" variant specifically because core is too hard for non-developers
- Doctor CLI for troubleshooting and migrations
- Pairing mode for security (unknown senders get pairing codes)
- 3 release channels: stable, beta, dev
- Source: [GitHub README](https://github.com/openclaw/openclaw)

**Paperclip:**
- `npx paperclipai onboard --yes` — quick start
- React dashboard for agent management
- Mobile-friendly interface
- Requires Node 20+ and pnpm 9.15+
- No guided construction — you configure agents manually
- Source: [GitHub README](https://github.com/paperclipai/paperclip)

### 5. Multi-Agent Orchestration -- Confidence: high

**OpenClaw:**
- Session tools for agent-to-agent communication: sessions_list, sessions_history, sessions_send
- Reply-back mechanism for async coordination
- Route channels/accounts/peers to isolated agents with dedicated workspaces
- No organizational structure (flat, peer-to-peer)
- Source: [GitHub README](https://github.com/openclaw/openclaw)

**Paperclip:**
- Org chart with hierarchies, roles, and reporting lines
- Cascading delegation — work flows up and down org chart automatically
- Goal-aware task execution with full ancestry
- Atomic task checkout prevents double-work
- Cross-team requests delegate to best agent
- Human as "board of directors" with override authority
- Source: [paperclip.ing/docs](https://paperclip.ing/docs), [Medium article](https://medium.com/@creativeaininja/paperclip-the-open-source-platform-turning-ai-agents-into-an-actual-company-7348015c5bf7)

### 6. Extensibility & Integrations -- Confidence: medium

**OpenClaw:**
- Skills marketplace with 5400+ community skills (26% flagged with vulnerabilities)
- Skills installed via URL with auto-updating
- Plugin system and channel adapter architecture
- Bundled/managed/workspace skill tiers
- Source: [VoltAgent awesome-openclaw-skills](https://github.com/VoltAgent/awesome-openclaw-skills)

**Paperclip:**
- Plugin ecosystem (awesome-paperclip curated list)
- Runtime skill injection without retraining
- Import/export of company templates
- Skills as markdown files
- Source: [GitHub README](https://github.com/paperclipai/paperclip)

### 7. Deployment & Operations -- Confidence: high

**OpenClaw:**
- Docker-based containment (agent runs inside container — blast radius limited)
- Tailscale Serve/Funnel for remote access
- SSH tunnels with token/password auth
- Nix declarative configuration
- Always-on via daemon install
- Source: [GitHub README](https://github.com/openclaw/openclaw)

**Paperclip:**
- Self-hosted, MIT, no mandatory accounts
- Local-first: embedded Node.js + Postgres
- Multi-company isolation on single infrastructure
- Per-agent monthly budgets with automatic throttling
- Immutable audit logs with full tool-call tracing
- Config versioning with rollback
- Source: [paperclip.ing/docs](https://paperclip.ing/docs)

## Synthesis

The critical insight is that OpenClaw and Paperclip operate at **different layers of the same stack**:

- **OpenClaw** = the agent runtime layer (what an individual agent can do)
- **Paperclip** = the orchestration layer (how agents coordinate as a team)
- **Agent Factory** = the construction layer (how you build and configure both)

Neither tool offers what our plugin does: a guided, interview-driven, AI-assisted workflow that generates a complete agent system from scratch. OpenClaw's "Cowork" variant exists precisely because the core tool is too hard for non-developers — this validates that there's demand for lower-barrier agent creation. Paperclip's manual configuration model means every agent needs hand-crafting before it can be "hired."

The most powerful patterns to incorporate:

1. **From OpenClaw:** 3-tier memory with WAL protocol, proactive agent pattern with self-improvement guardrails (ADL/VFM), isolated agentTurn for background autonomy
2. **From Paperclip:** Heartbeat with context injection, goal hierarchy, budget enforcement, governance model ("autonomy is a privilege you grant")
3. **Unique to us:** Progressive complexity (1 agent → full org), agent-guided construction, domain-specific templates, Claude Code-native (no external infrastructure)

The security philosophies are complementary, not conflicting: OpenClaw uses containment (Docker — limit blast radius), our plugin uses prevention (hooks/deny — stop before it happens). Both should be available.

## Open Questions

- **Canvas/A2UI details** — What does OpenClaw's visual workspace actually generate? HTML? Native UI? Understanding this clarifies whether it's worth pursuing for Claude Code.
- **Paperclip self-learning implementation** — The heartbeat + audit trail creates rich data, but no explicit feedback loop is documented. Is this a planned feature or deliberately excluded?
- **OpenClaw skill security** — 26% of community skills flagged with vulnerabilities. What vetting process exists, and should we build one?
- **Cowork UX** — What does OpenClaw's simplified non-developer experience look like? This directly informs our target audience.

## Recommendation

Build Agent Factory as a 5-phase evolution:

1. **v0.2:** Fix existing gaps (3 missing commands, deployment-advisor, managed-agents skill, domain templates)
2. **v0.3:** Incorporate OpenClaw patterns (3-tier memory, WAL, proactive agent, isolated cron)
3. **v0.4:** Incorporate Paperclip patterns (heartbeat, goal hierarchy, budgets, governance, org-chart)
4. **v0.5:** Self-learning systems (feedback loops, performance scoring, pipeline optimization)
5. **v1.0:** Full integration (MCP integrations, Docker deployment, templates marketplace, import/export)

The key differentiator throughout: every feature is accessible through guided, AI-assisted construction with progressive complexity — start simple, grow as needed.

## Sources

| # | Source | Type | Quality | Used in |
|---|--------|------|---------|---------|
| 1 | [OpenClaw GitHub](https://github.com/openclaw/openclaw) | official | high | 1,2,4,5,7 |
| 2 | [Paperclip GitHub](https://github.com/paperclipai/paperclip) | official | high | 1,2,4,5,6,7 |
| 3 | [OpenClaw Docs](https://docs.openclaw.ai) | official | high | 2,5 |
| 4 | [Paperclip Docs](https://paperclip.ing/docs) | official | high | 2,3,5,7 |
| 5 | [Proactive Agent Skill](https://github.com/openclaw/skills/blob/main/skills/halthelobster/proactive-agent/SKILL.md) | official | high | 3 |
| 6 | [MindStudio Heartbeat Article](https://www.mindstudio.ai/blog/heartbeat-pattern-paperclip-ai-agents-24-7) | community | medium | 3 |
| 7 | [DigitalOcean: What is OpenClaw](https://www.digitalocean.com/resources/articles/what-is-openclaw) | community | medium | 1 |
| 8 | [Medium: How OpenClaw Works](https://bibek-poudel.medium.com/how-openclaw-works-understanding-ai-agents-through-a-real-architecture-5d59cc7a4764) | community | medium | 2 |
| 9 | [Medium: Paperclip as Company](https://medium.com/@creativeaininja/paperclip-the-open-source-platform-turning-ai-agents-into-an-actual-company-7348015c5bf7) | community | medium | 5 |
| 10 | [awesome-openclaw-skills](https://github.com/VoltAgent/awesome-openclaw-skills) | community | medium | 6 |
| 11 | `skills/agent-system-design/references/feature-map.md` | codebase | high | 1 |
| 12 | `skills/agent-system-design/references/security-patterns.md` | codebase | high | 7 |

106
.claude/ultraplan-spec-2026-04-11-agent-factory.md
Normal file
@ -0,0 +1,106 @@
# Task: Agent Factory — Full Vision Realization

## Goal

Transform the existing agent-builder plugin into Agent Factory: a comprehensive,
guided system for building autonomous agent systems using Claude Code. The plugin
should take users from zero to a fully operational multi-agent system through a
7-phase guided workflow, incorporating best patterns from OpenClaw (individual
agent capability) and Paperclip (organizational coordination).

Success = all 5 development phases completed, delivering: foundational commands
and agents, OpenClaw-inspired memory/autonomy patterns, Paperclip-inspired
orchestration/governance patterns, self-learning systems, and full integration
with import/export and bundled templates.

## Non-Goals

- Building a central registry or marketplace with community features
- Replacing OpenClaw or Paperclip — Agent Factory is the construction layer
- Supporting non-Claude-Code agent runtimes
- Building a web UI or dashboard
- Vector/embedding-based memory (OpenClaw uses sqlite-vec — we stay file-based)
- Canvas/A2UI equivalent (confirmed as just a static file server in OpenClaw)

## Constraints

- macOS Intel (bash 3.2 compatibility for all generated scripts)
- No external infrastructure required for core functionality (no PostgreSQL, no Node.js server)
- Claude Code native: agents, skills, hooks, settings.json, /schedule
- Plugin structure must follow Claude Code plugin conventions
- All templates are plain files with `{{PLACEHOLDER}}` variables, replaced via
  string operations (no template engine dependency)
- Generated hook scripts must be bash 3.2 compatible
- Agent YAML frontmatter must be valid
- Never write files outside the user's project directory
- `${CLAUDE_PLUGIN_ROOT}` for all intra-plugin paths

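The `{{PLACEHOLDER}}` constraint above can be met with plain string operations. The following is a minimal sketch, assuming placeholder names are upper-case with digits and underscores; `renderTemplate` is a hypothetical helper name, not an existing plugin function.

```typescript
// Hypothetical sketch: placeholder substitution with no template engine.

/**
 * Replaces every `{{NAME}}` in `template` with `values[NAME]` and fails loudly
 * if any placeholder is left unresolved.
 */
function renderTemplate(
  template: string,
  values: Record<string, string>
): string {
  let out = template;
  for (const name of Object.keys(values)) {
    const value = values[name];
    if (value === undefined) continue;
    // split/join replaces all occurrences without regex escaping concerns
    out = out.split(`{{${name}}}`).join(value);
  }
  const leftover = out.match(/\{\{[A-Z0-9_]+\}\}/);
  if (leftover !== null) {
    throw new Error(`Unresolved placeholder: ${leftover[0]}`);
  }
  return out;
}
```

Throwing on a leftover placeholder keeps a half-rendered template from being written into the user's project, at the cost of rejecting values that themselves contain `{{NAME}}`-style text.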
## Preferences

- TypeScript for any scripting within the plugin itself
- Plain .md/.sh templates with placeholder comments
- Conventional Commits for all checkpoint commits
- Progressive complexity: 1 agent → full org
- Domain-specific pipeline templates (not just generic)
- Multi-target deployment from start: /schedule, Docker, systemd

## Non-Functional Requirements

- Budget tracking via Anthropic API integration (not /usage parsing)
- Import/export of complete agent systems (tarball format)
- 5-10 bundled domain templates as starting points
- All generated agents must include verification commands
- Governance patterns must include human oversight gates
- Self-improvement must have guardrails (ADL/VFM-inspired)

## Success Criteria

- Plugin installs and `/agent-factory build` runs the guided workflow
- All 4 commands work: `/agent-factory build`, `/agent-factory deploy`,
  `/agent-factory evaluate`, `/agent-factory status`
- `deployment-advisor` agent provides deployment recommendations
- `managed-agents` skill triggers on agent-related questions
- Generated agent systems include 3-tier memory templates
- Generated heartbeat files parse correctly with interval tracking
- Budget hooks log costs and alert on threshold
- Import/export round-trips: export → import in new project → system works
- At least 5 bundled domain templates available
- All generated bash scripts pass `bash -n` syntax check on bash 3.2

## Prior Attempts

- v0.1.0 (current): Initial plugin with `/agent-factory build` command,
  builder agent, 2 skills, basic templates. Missing: deploy, evaluate,
  status commands. deployment-advisor agent stubbed but not implemented.
  managed-agents skill empty.

## Open Questions

- Anthropic billing API: exact endpoint and auth mechanism needs verification
  before implementation. [ASSUMPTION: API exists and is accessible with API key]
- /schedule API stability: is the trigger interface stable enough to build on?
  [ASSUMPTION: yes, based on current Claude Code docs]
- Docker deployment: should we generate Dockerfile or docker-compose.yml or both?
  [ASSUMPTION: docker-compose.yml with Dockerfile, matching Paperclip's approach]

## Research Context

Two research briefs inform this plan:
1. **ultraresearch-2026-04-11-openclaw-paperclip-agent-frameworks.md** (confidence: 0.92)
   — Feature comparison, architecture, patterns, synthesis
2. **source-code-analysis-2026-04-11.md** — Implementation-level details from
   actual source code of both projects

Key patterns to replicate (from research):
- OpenClaw: 3-tier memory, WAL protocol, Working Buffer Protocol, proactive agent
  with ADL/VFM guardrails, isolated agentTurn cron, emptiness detection
- Paperclip: Heartbeat with context injection, goal hierarchy (simple parent_id),
  budget enforcement (post-hoc), task checkout via file locking, adapter interface,
  org chart (reportsTo FK)

## Metadata

- **Created:** 2026-04-11
- **Mode:** interview
- **Source:** ultraplan interview
- **Research:** 2 briefs (openclaw-paperclip frameworks + source code analysis)