agent-builder/.claude/research/ultraresearch-2026-04-11-openclaw-paperclip-agent-frameworks.md
Kjell Tore Guttormsen 7419d4283d docs(plans): Agent Factory ultraplan + execution guide
27-step plan across 8 sessions in 3 waves for transforming
agent-builder into Agent Factory v1.0.0. Includes research briefs,
spec, and wave-by-wave execution prompts with scope fences.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-11 07:35:29 +02:00

type created question confidence dimensions mcp_servers_used local_agents_used external_agents_used source_code_analyzed target_audience
ultraresearch-brief 2026-04-11 Research OpenClaw and Paperclip agent frameworks to find inspiration and concrete value proposition for agent-builder plugin 0.92 7
Explore
WebFetch
WebSearch
paperclip
openclaw
Claude Code users who know the primitives but need help composing agent systems

OpenClaw & Paperclip Agent Framework Research

Generated by ultraresearch-local on 2026-04-11

Research Question

What features, architecture patterns, and capabilities do OpenClaw and Paperclip offer, and what can we learn from them to create a Claude Code plugin that makes it easy for anyone to build genuinely useful, self-running agent systems?

Executive Summary

OpenClaw (354k stars) excels at individual agent capability: 23+ messaging channels, 5400+ community skills, proactive agent patterns with self-improvement guardrails, and a 3-tier memory system. Paperclip (51k stars) excels at organizational coordination: heartbeat scheduling, goal hierarchies, budget enforcement, and governance. Neither offers guided, agent-assisted construction of complete agent systems, which is the gap our plugin fills. Confidence is high for OpenClaw (verified via docs, GitHub, and existing codebase references) and medium for Paperclip (verified via docs site, GitHub, and multiple third-party articles).

Dimensions

1. Core Capabilities -- Confidence: high

OpenClaw:

  • Personal AI assistant running on your own devices
  • 23+ messaging channels (WhatsApp, Telegram, Slack, Discord, Signal, iMessage, IRC, Teams, Matrix, and more)
  • 100+ preconfigured AgentSkills for shell, file, and web automation
  • Canvas/A2UI — agent-driven visual workspace (unique capability, no Claude Code equivalent)
  • Browser control via dedicated Chrome/Chromium with CDP
  • Voice capabilities with wake words (macOS/iOS) and continuous voice (Android)
  • Device node system (camera, screen recording, location, notifications)
  • Model-agnostic: Claude, GPT, Gemini, Ollama all supported
  • Source: GitHub, DigitalOcean guide

Paperclip:

  • Orchestration platform for teams of AI agents ("If OpenClaw is an employee, Paperclip is the company")
  • Agent-agnostic: supports OpenClaw, Claude Code, Codex, Cursor, bash, HTTP webhooks
  • Explicitly NOT an agent framework: it organizes agents rather than building them
  • Explicitly NOT a chatbot, workflow builder, or prompt manager
  • Ticket-based task management with threaded conversations
  • Multi-company support with complete data isolation
  • Source: GitHub, paperclip.ing

2. Architecture & Patterns -- Confidence: high

OpenClaw architecture:

  • Gateway control plane on ws://127.0.0.1:18789
  • Channel Adapters transform protocol-specific input into unified message objects
  • Multi-agent routing: isolated sessions per agent, workspace, or sender
  • Pi agent runtime in RPC mode with tool/block streaming
  • Node.js + TypeScript, pnpm, WebSocket protocol
  • Source: GitHub README, Medium architecture article
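
The channel-adapter idea above can be sketched in a few lines. This is a minimal illustration, not OpenClaw's actual code: the `UnifiedMessage` fields, the payload shapes, and the function names are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnifiedMessage:
    """Channel-neutral message shape (field names are illustrative)."""
    channel: str
    sender: str
    text: str


def from_telegram(update: dict) -> UnifiedMessage:
    # Assumed Telegram-style payload: {"message": {"from": {"id": ...}, "text": ...}}
    msg = update["message"]
    return UnifiedMessage("telegram", str(msg["from"]["id"]), msg["text"])


def from_slack(event: dict) -> UnifiedMessage:
    # Assumed Slack-style payload: {"user": ..., "text": ...}
    return UnifiedMessage("slack", event["user"], event["text"])


# One adapter per channel; adding a channel means adding one entry here.
ADAPTERS = {"telegram": from_telegram, "slack": from_slack}


def normalize(channel: str, payload: dict) -> UnifiedMessage:
    """Dispatch a raw payload to the adapter registered for its channel."""
    return ADAPTERS[channel](payload)


def session_key(message: UnifiedMessage, agent: str) -> str:
    """Build a key that isolates sessions per agent and sender."""
    return f"{agent}:{message.channel}:{message.sender}"
```

Everything downstream of `normalize` sees one message type, which is what lets routing rules such as `session_key` stay independent of the channel.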

Paperclip architecture:

  • Node.js backend + React UI + PostgreSQL
  • Company-as-runtime model: agents modeled as employees
  • Heartbeat scheduler fires agent execution at defined intervals
  • Each beat is stateless — state lives in external storage (Postgres)
  • Atomic operations for task checkout and budget enforcement
  • Source: GitHub, Towards AI article
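
A stateless heartbeat with external state might look roughly like the sketch below. It is not Paperclip's implementation: in-memory SQLite stands in for Postgres, and the table layout and function names are assumptions chosen for illustration.

```python
import json
import sqlite3


def open_store() -> sqlite3.Connection:
    """In-memory SQLite stands in for Postgres in this sketch."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE agent_state (agent TEXT PRIMARY KEY, state TEXT NOT NULL)")
    return db


def load_state(db: sqlite3.Connection, agent: str) -> dict:
    """Read the agent's persisted state, or a fresh default if none exists."""
    row = db.execute("SELECT state FROM agent_state WHERE agent = ?", (agent,)).fetchone()
    return json.loads(row[0]) if row else {"beats": 0, "notes": []}


def save_state(db: sqlite3.Connection, agent: str, state: dict) -> None:
    """Overwrite the agent's persisted state."""
    db.execute(
        "INSERT OR REPLACE INTO agent_state (agent, state) VALUES (?, ?)",
        (agent, json.dumps(state)),
    )
    db.commit()


def beat(db: sqlite3.Connection, agent: str, work) -> dict:
    """One heartbeat: load state, run the agent once, persist the result.

    Nothing is kept in process memory between beats.
    """
    state = work(load_state(db, agent))
    save_state(db, agent, state)
    return state


def count_beat(state: dict) -> dict:
    # Placeholder for a real agent turn.
    return {"beats": state["beats"] + 1, "notes": state["notes"] + ["checked queue"]}
```

A scheduler would call `beat` at each interval. Because every call starts from the store, a crashed or restarted process resumes from the last saved state.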

3. Self-Learning & Autonomy -- Confidence: medium

OpenClaw — Proactive Agent Skill (most sophisticated pattern found):

  • 3-tier memory: SESSION-STATE.md (working memory), memory/YYYY-MM-DD.md (daily capture), MEMORY.md (curated long-term)
  • WAL Protocol (Write-Ahead Logging): write important details BEFORE responding
  • Working Buffer Protocol: captures exchanges once context usage enters the "danger zone" (60%+), before compaction
  • Compaction Recovery: reads buffer, session state, daily notes, then searches all sources
  • Self-improvement guardrails:
    • ADL (Anti-Drift Limits): no fake intelligence, no unverifiable mods, no novelty over stability
    • VFM (Value-First Modification): score changes on frequency, failure reduction, burden reduction, cost savings. Only implement if score > 50
    • Priority: Stability > Explainability > Reusability > Scalability > Novelty
  • Two cron types: systemEvent (needs attention) vs isolated agentTurn (true background autonomy)
  • Self-healing: try 5-10 approaches before asking for help
  • Source: Proactive Agent Skill on GitHub
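
The 3-tier memory and the write-before-respond rule can be sketched as follows. The three file names come from the list above; the `Memory` class, its method names, and the bullet-line format are assumptions for illustration, and the working buffer is left out for brevity.

```python
from datetime import date
from pathlib import Path


class Memory:
    """Three tiers: session state, daily notes, curated long-term memory."""

    def __init__(self, root: Path):
        self.root = root
        self.session = root / "SESSION-STATE.md"
        self.long_term = root / "MEMORY.md"
        (root / "memory").mkdir(parents=True, exist_ok=True)

    def daily(self, day: date) -> Path:
        """Path of the daily capture file, memory/YYYY-MM-DD.md."""
        return self.root / "memory" / f"{day.isoformat()}.md"

    @staticmethod
    def _append(path: Path, line: str) -> None:
        with path.open("a", encoding="utf-8") as f:
            f.write(f"- {line}\n")

    def respond(self, detail: str, reply: str, day: date) -> str:
        """WAL step: persist the detail first, only then hand back the reply."""
        self._append(self.session, detail)
        self._append(self.daily(day), detail)
        return reply

    def promote(self, detail: str) -> None:
        """Copy a detail worth keeping into long-term memory."""
        self._append(self.long_term, detail)

    def recover(self, day: date) -> list[str]:
        """After compaction: reread session state, daily notes, then long-term memory."""
        lines: list[str] = []
        for path in (self.session, self.daily(day), self.long_term):
            if path.exists():
                lines += path.read_text(encoding="utf-8").splitlines()
        return lines
```

The ordering inside `respond` is the whole technique: if the context is compacted or the process dies after the write, the detail is still on disk.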

Paperclip:

  • Heartbeat model with context injection (Memento Man mental model)
  • Memory doesn't live in agent session — external storage maintains continuity
  • Context packets: curated payloads with memory state, task queue, recent events, agent config
  • No explicit self-learning mechanism documented, but rich audit trail enables pattern detection
  • Skills as markdown instruction files, installable via GitHub URLs
  • Source: MindStudio heartbeat article
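
A context packet, as described above, is essentially a bounded bundle assembled fresh for each beat. The sketch below assumes the four parts named in the list; the class name, field names, and size limits are hypothetical.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ContextPacket:
    """Curated payload handed to an agent at the start of a beat."""
    agent_config: dict
    memory_state: dict
    task_queue: list
    recent_events: list


def build_packet(config: dict, memory: dict, tasks: list, events: list,
                 max_tasks: int = 3, max_events: int = 5) -> ContextPacket:
    """Keep the packet small: the first few tasks and only the newest events."""
    return ContextPacket(
        agent_config=config,
        memory_state=memory,
        task_queue=tasks[:max_tasks],
        recent_events=events[max(0, len(events) - max_events):],
    )


def render(packet: ContextPacket) -> str:
    """Serialize the packet for injection into the agent's prompt."""
    return json.dumps(asdict(packet), indent=2, sort_keys=True)
```

Capping the queue and the event list is a design choice of this sketch: it keeps each beat's prompt at a predictable size no matter how much history accumulates in storage.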

4. User Experience & Onboarding -- Confidence: high

OpenClaw:

  • npm install -g openclaw@latest && openclaw onboard --install-daemon
  • Requires Node 24 (recommended) or 22.16+
  • Has a "Cowork" variant specifically because the core is too hard for non-developers
  • Doctor CLI for troubleshooting and migrations
  • Pairing mode for security (unknown senders get pairing codes)
  • 3 release channels: stable, beta, dev
  • Source: GitHub README

Paperclip:

  • npx paperclipai onboard --yes — quick start
  • React dashboard for agent management
  • Mobile-friendly interface
  • Requires Node 20+ and pnpm 9.15+
  • No guided construction — you configure agents manually
  • Source: GitHub README

5. Multi-Agent Orchestration -- Confidence: high

OpenClaw:

  • Session tools for agent-to-agent communication: sessions_list, sessions_history, sessions_send
  • Reply-back mechanism for async coordination
  • Route channels/accounts/peers to isolated agents with dedicated workspaces
  • No organizational structure (flat, peer-to-peer)
  • Source: GitHub README

Paperclip:

  • Org chart with hierarchies, roles, and reporting lines
  • Cascading delegation: work flows up and down the org chart automatically
  • Goal-aware task execution with full ancestry
  • Atomic task checkout prevents double-work
  • Cross-team requests are delegated to the best-suited agent
  • Human as "board of directors" with override authority
  • Source: paperclip.ing/docs, Medium article
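
Atomic task checkout and goal ancestry can both be expressed against a single task table. This sketch is an assumption about how such a board could work, not Paperclip's schema: SQLite stands in for Postgres, and the column and function names are invented for the example.

```python
import sqlite3


def open_board() -> sqlite3.Connection:
    """Create an in-memory task board (stand-in for the real database)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE tasks (id INTEGER PRIMARY KEY, title TEXT, "
        "parent_id INTEGER, owner TEXT)"
    )
    return db


def add_task(db: sqlite3.Connection, title: str, parent_id=None) -> int:
    """Insert a task, optionally under a parent goal, and return its id."""
    cur = db.execute("INSERT INTO tasks (title, parent_id) VALUES (?, ?)", (title, parent_id))
    db.commit()
    return cur.lastrowid


def checkout(db: sqlite3.Connection, task_id: int, agent: str) -> bool:
    """Claim a task only if nobody owns it yet.

    The single conditional UPDATE is the atomic step: of two competing
    agents, exactly one sees a row count of 1.
    """
    cur = db.execute(
        "UPDATE tasks SET owner = ? WHERE id = ? AND owner IS NULL", (agent, task_id)
    )
    db.commit()
    return cur.rowcount == 1


def ancestry(db: sqlite3.Connection, task_id: int) -> list[str]:
    """Walk parent links so a task carries its full goal chain, root first."""
    chain = []
    while task_id is not None:
        title, task_id = db.execute(
            "SELECT title, parent_id FROM tasks WHERE id = ?", (task_id,)
        ).fetchone()
        chain.append(title)
    return chain[::-1]
```

For example, a task "Draft issue 1" under "Launch newsletter" under "Grow revenue" can be claimed by one agent only, and `ancestry` returns the three titles from goal down to task.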

6. Extensibility & Integrations -- Confidence: medium

OpenClaw:

  • Skills marketplace with 5400+ community skills (26% flagged with vulnerabilities)
  • Skills installed via URL with auto-updating
  • Plugin system and channel adapter architecture
  • Bundled/managed/workspace skill tiers
  • Source: VoltAgent awesome-openclaw-skills

Paperclip:

  • Plugin ecosystem (awesome-paperclip curated list)
  • Runtime skill injection without retraining
  • Import/export of company templates
  • Skills as markdown files
  • Source: GitHub README

7. Deployment & Operations -- Confidence: high

OpenClaw:

  • Docker-based containment (agent runs inside container — blast radius limited)
  • Tailscale Serve/Funnel for remote access
  • SSH tunnels with token/password auth
  • Nix declarative configuration
  • Always-on via daemon install
  • Source: GitHub README

Paperclip:

  • Self-hosted, MIT-licensed, no mandatory accounts
  • Local-first: embedded Node.js + Postgres
  • Multi-company isolation on single infrastructure
  • Per-agent monthly budgets with automatic throttling
  • Immutable audit logs with full tool-call tracing
  • Config versioning with rollback
  • Source: paperclip.ing/docs
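
Per-agent monthly budgets with automatic throttling could be modeled as below. The sources do not describe Paperclip's thresholds, so the 80% soft limit, the three status names, and the use of integer cents are assumptions of this sketch.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    """Monthly budget for one agent, tracked in integer cents."""
    monthly_limit_cents: int
    spent_cents: int = 0
    throttle_pct: int = 80  # assumed soft threshold, not a documented value

    def status(self) -> str:
        """Return "ok", "throttled" (soft limit reached), or "paused" (limit reached)."""
        if self.spent_cents >= self.monthly_limit_cents:
            return "paused"
        if self.spent_cents * 100 >= self.monthly_limit_cents * self.throttle_pct:
            return "throttled"
        return "ok"

    def charge(self, cost_cents: int) -> bool:
        """Record a cost if it fits within the limit; refuse it otherwise."""
        if self.spent_cents + cost_cents > self.monthly_limit_cents:
            return False
        self.spent_cents += cost_cents
        return True
```

A scheduler could check `status()` before each beat, running throttled agents less often and skipping paused ones until the month rolls over.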

Synthesis

The critical insight is that OpenClaw and Paperclip operate at different layers of the same stack:

  • OpenClaw = the agent runtime layer (what an individual agent can do)
  • Paperclip = the orchestration layer (how agents coordinate as a team)
  • Agent Factory = the construction layer (how you build and configure both)

Neither tool offers what our plugin does: a guided, interview-driven, AI-assisted workflow that generates a complete agent system from scratch. OpenClaw's "Cowork" variant exists precisely because the core tool is too hard for non-developers — this validates that there's demand for lower-barrier agent creation. Paperclip's manual configuration model means every agent needs hand-crafting before it can be "hired."

The most powerful patterns to incorporate:

  1. From OpenClaw: 3-tier memory with WAL protocol, proactive agent pattern with self-improvement guardrails (ADL/VFM), isolated agentTurn for background autonomy
  2. From Paperclip: Heartbeat with context injection, goal hierarchy, budget enforcement, governance model ("autonomy is a privilege you grant")
  3. Unique to us: progressive complexity (1 agent → full org), agent-guided construction, domain-specific templates, Claude Code-native operation (no external infrastructure)
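
The VFM guardrail listed under Self-Learning & Autonomy lends itself to a small gate function. The four criteria and the "score > 50" rule come from the brief; the 0-25 rating scale per criterion and the equal weighting are assumptions, since the source does not state how the score is computed.

```python
from dataclasses import dataclass

VFM_THRESHOLD = 50  # from the brief: implement only if the score exceeds 50


@dataclass(frozen=True)
class ProposedChange:
    """A self-modification the agent is considering."""
    name: str
    frequency: int          # each criterion rated 0-25 (assumed scale)
    failure_reduction: int
    burden_reduction: int
    cost_savings: int


def vfm_score(change: ProposedChange) -> int:
    """Sum the four criteria into a 0-100 score (equal weights assumed)."""
    parts = (change.frequency, change.failure_reduction,
             change.burden_reduction, change.cost_savings)
    if any(not 0 <= p <= 25 for p in parts):
        raise ValueError("each criterion must be rated 0-25")
    return sum(parts)


def should_implement(change: ProposedChange) -> bool:
    """Apply the Value-First Modification gate."""
    return vfm_score(change) > VFM_THRESHOLD
```

Under these assumptions a change rated 20/15/10/10 scores 55 and passes, while one rated 20/10/10/10 scores exactly 50 and is rejected, because the rule is strictly greater than 50.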

The security philosophies are complementary, not conflicting. OpenClaw uses containment (Docker, to limit the blast radius); our plugin uses prevention (hooks and deny rules, to stop an action before it happens). Both should be available.
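
The prevention side can be illustrated with a small deny-rule check that runs before a command executes. This is a sketch only: the patterns, the function names, and the glob-based matching are hypothetical and do not describe the plugin's actual hook configuration.

```python
from fnmatch import fnmatchcase

# Hypothetical deny patterns; a real plugin would load these from its own config.
DENY_PATTERNS = ("rm -rf *", "git push --force*", "curl * | sh")


def is_denied(command: str, patterns=DENY_PATTERNS) -> bool:
    """Return True if the command matches any deny pattern."""
    normalized = " ".join(command.split())  # collapse repeated whitespace
    return any(fnmatchcase(normalized, p) for p in patterns)


def guard(command: str, run):
    """Prevention: refuse a denied command before it ever runs."""
    if is_denied(command):
        return "blocked"
    return run(command)
```

Glob matching is easy to evade (for example through shell aliases or quoting), which is one reason to keep containment as a second layer rather than relying on either approach alone.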

Open Questions

  • Canvas/A2UI details — What does OpenClaw's visual workspace actually generate? HTML? Native UI? Understanding this clarifies whether it's worth pursuing for Claude Code.
  • Paperclip self-learning implementation — The heartbeat + audit trail creates rich data, but no explicit feedback loop is documented. Is this a planned feature or deliberately excluded?
  • OpenClaw skill security — 26% of community skills flagged with vulnerabilities. What vetting process exists, and should we build one?
  • Cowork UX — What does OpenClaw's simplified non-developer experience look like? This directly informs our target audience.

Recommendation

Build Agent Factory as a 5-phase evolution:

  1. v0.2: Fix existing gaps (3 missing commands, deployment-advisor, managed-agents skill, domain templates)
  2. v0.3: Incorporate OpenClaw patterns (3-tier memory, WAL, proactive agent, isolated cron)
  3. v0.4: Incorporate Paperclip patterns (heartbeat, goal hierarchy, budgets, governance, org-chart)
  4. v0.5: Self-learning systems (feedback loops, performance scoring, pipeline optimization)
  5. v1.0: Full integration (MCP integrations, Docker deployment, templates marketplace, import/export)

The key differentiator throughout: every feature is accessible through guided, AI-assisted construction with progressive complexity — start simple, grow as needed.

Sources

| # | Source | Type | Quality | Used in |
|---|--------|------|---------|---------|
| 1 | OpenClaw GitHub | official | high | 1, 2, 4, 5, 7 |
| 2 | Paperclip GitHub | official | high | 1, 2, 4, 5, 6, 7 |
| 3 | OpenClaw Docs | official | high | 2, 5 |
| 4 | Paperclip Docs | official | high | 2, 3, 5, 7 |
| 5 | Proactive Agent Skill | official | high | 3 |
| 6 | MindStudio Heartbeat Article | community | medium | 3 |
| 7 | DigitalOcean: What is OpenClaw | community | medium | 1 |
| 8 | Medium: How OpenClaw Works | community | medium | 2 |
| 9 | Medium: Paperclip as Company | community | medium | 5 |
| 10 | awesome-openclaw-skills | community | medium | 6 |
| 11 | skills/agent-system-design/references/feature-map.md | codebase | high | 1 |
| 12 | skills/agent-system-design/references/security-patterns.md | codebase | high | 7 |