feat: initial agent-builder plugin (v0.1.0)

Build complete autonomous agent systems with Claude Code.
7-phase guided workflow: map work, CLAUDE.md, agent team,
pipeline, security, deployment, test.

Components:
- commands/build.md: main guided workflow
- agents/builder.md: scaffolding agent
- skills/agent-system-design: architecture knowledge + 4 references
- scripts/templates: hooks, automation, launchd, systemd

Covers 22 OpenClaw capabilities across 4 deployment targets
(local, Mac Mini, VPS, Managed Agents).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Author: Kjell Tore Guttormsen
Date: 2026-04-10 19:10:54 +02:00
Commit: 075383990f (17 changed files, 1895 additions, 0 deletions)

---
name: agent-system-design
description: |
This skill should be used when the user asks about "building an agent",
"autonomous agent system", "agent architecture", "OpenClaw alternative",
"always-on agent", "personal AI agent", "complete agent system",
"agent that runs itself", "agent pipeline design", "multi-agent system",
"how to build an agent with Claude Code"
version: 0.1.0
---
## What is an autonomous agent system
An autonomous agent system is a set of Claude Code components that work together to execute multi-step workflows with minimal human intervention. The system runs on a schedule, responds to triggers, processes inputs, and produces outputs — all orchestrated through Claude Code's native primitives.
You do not need a separate orchestration framework. Claude Code provides everything required: subagents, skills, hooks, and automation scripts.
## Core architecture pattern
The foundational pattern is a three-agent pipeline with a skill that chains them:
```
Trigger (launchd/cron/manual)
→ Pipeline skill
→ Agent 1: Researcher (gather and structure inputs)
→ Agent 2: Writer (produce primary output)
→ Agent 3: Reviewer (evaluate and approve or revise)
→ Save outputs
→ Update memory
```
The pipeline skill is a `.claude/skills/<name>/SKILL.md` file with step-by-step instructions. Each step invokes an agent using the `Agent` tool. Agents are defined in `.claude/agents/*.md`.
This pattern scales from simple (one agent, one step) to complex (six agents, branching logic, parallel execution).
## System components
A complete agent system has eight components:
| Component | Location | Purpose |
|-----------|----------|---------|
| CLAUDE.md | project root | Context, rules, and memory pointers for the project |
| Agents | `.claude/agents/*.md` | Specialized subagents with defined roles and tool access |
| Pipeline skills | `.claude/skills/*/SKILL.md` | Orchestration sequences that chain agents |
| Knowledge skills | `.claude/skills/*/SKILL.md` | Reference knowledge auto-injected by topic |
| Hooks | `hooks/*.sh` | Pre/post tool use guards for automated safety |
| Settings | `.claude/settings.json` | Permissions, tool allowlists, and hook wiring |
| Automation | `scripts/` + `launchd/` | Scheduled execution (Mac: launchd, Linux: systemd/cron) |
| Memory | `memory/` or `data/` | Persistent state files updated each run |
Not all components are required for every system. Start with agents + one pipeline skill. Add hooks and automation when you move to scheduled/unattended operation.
## Design principles
**Start with 2-3 agents.** A researcher and a writer cover most content workflows. A reviewer adds quality gates. Resist the urge to create more agents than you need — each agent adds latency and cost.
**Pipeline skills chain agents.** The skill file is the orchestrator. It contains the sequencing logic, error handling instructions, and output routing. Keep agents dumb and focused; put the workflow intelligence in the skill.
**Hooks protect automated runs.** When Claude runs unattended, hooks are your circuit breakers. Use `PreToolUse` hooks to block writes to sensitive paths, enforce naming conventions, or validate inputs before destructive operations. Without hooks, an unattended run has no guardrails.
**Memory persists state between runs.** Agents cannot remember previous sessions by default. A memory file (e.g., `memory/MEMORY.md` or `data/run-state.json`) gives the system continuity. Update it at the end of every pipeline run.
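A minimal end-of-run update, sketched in shell (the memory path and fields follow the examples above and are not a fixed schema):

```shell
# End-of-run memory update; file layout and values are illustrative
MEMORY_FILE="memory/MEMORY.md"
mkdir -p "$(dirname "$MEMORY_FILE")"
{
  echo "## Run $(date -u +%Y-%m-%dT%H:%M:%SZ)"
  echo "- output: drafts/weekly-report.md"
  echo "- review score: 82"
  echo "- pending: none"
  echo
} >> "$MEMORY_FILE"
echo "memory updated"
```

Appending rather than overwriting keeps a run history the next session can scan.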
**One concern per agent.** Agents that do too many things are hard to tune and debug. A researcher should research. A writer should write. Mixing concerns makes prompt engineering harder and output quality lower.
## Deployment options
| Platform | Best for | Notes |
|----------|----------|-------|
| Local workstation | Development, on-demand runs | Use launchd (Mac) or cron (Linux) for scheduling |
| Mac Mini (always-on) | Personal production pipelines | Runs overnight; launchd + wake schedule |
| VPS (Hetzner, DO) | Server-side automation, webhooks | systemd service; pair with SSH access |
| Managed Agents | High-volume, API-driven workflows | Claude API `/v1/agents` endpoint; no local shell needed |
For most personal or small-team use cases, a Mac Mini or VPS running launchd/systemd is sufficient and much cheaper than Managed Agents at scale.
## OpenClaw capability coverage
This architecture covers 22 OpenClaw capabilities. 13 are a full match via native Claude Code primitives. 8 use a different approach but achieve the same outcome. 1 gap remains.
For the detailed mapping, see:
`${CLAUDE_PLUGIN_ROOT}/skills/agent-system-design/references/feature-map.md`
## Agent frontmatter fields
```yaml
name: <slug> # required, used for routing
description: | # required, must include <example> blocks
...
model: sonnet|opus # required
tools: [...] # required, explicit allowlist
color: <color> # optional, UI hint
```
The `description` field is what triggers agent selection. Write it with concrete user phrases, not abstract capability descriptions.
## Common mistakes
- **No examples in agent description** — Claude cannot reliably select the agent without seeing what user messages should trigger it
- **Hooks skipped in automation** — unattended runs need guards; add hooks before scheduling
- **Memory not updated** — next run starts blind; always write state at the end of a pipeline
- **One giant agent** — harder to tune, harder to debug; split by role
- **Wrong model for the job** — opus for synthesis and narrative, sonnet for retrieval and transformation
## Getting started
Run `/agent-builder:build` for the guided 7-phase workflow. It interviews you about your use case, selects the right pattern, and generates all files.
For pipeline design patterns and agent role templates, see:
`${CLAUDE_PLUGIN_ROOT}/skills/agent-system-design/references/pipeline-patterns.md`
For scheduling and deployment configuration, see:
`${CLAUDE_PLUGIN_ROOT}/skills/agent-system-design/references/deployment-config.md`
For the OpenClaw feature map, see:
`${CLAUDE_PLUGIN_ROOT}/skills/agent-system-design/references/feature-map.md`

# Deployment Targets
Reference for the agent-system-design skill. Covers the four deployment platforms
available for Claude Code agents. Use this to guide target selection and scaffold
the correct infrastructure files.
---
## 1. Local (cron/launchd)
**How it works:** Claude Code CLI invoked by the host scheduler. No persistent process;
each run starts a fresh Claude Code session and exits.
**Setup files to scaffold:**
- `automation.sh` -- wrapper script that sets env vars and invokes `claude`
- Cron entry comment block in README, or `launchd.plist` for macOS
**Cron pattern:**
```
0 5 * * * /path/to/project/automation.sh >> /var/log/agent.log 2>&1
```
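The `automation.sh` wrapper referenced above is not shown in this listing; a minimal sketch of what it might contain (the prompt text, log layout, and fail-soft behavior are assumptions):

```shell
#!/bin/sh
# automation.sh -- cron/launchd entrypoint (sketch)
set -eu
PROJECT_DIR="$(cd "$(dirname "$0")" && pwd)"
cd "$PROJECT_DIR"
mkdir -p logs

# Cron runs with a minimal PATH; fail soft if the CLI is missing
if ! command -v claude >/dev/null 2>&1; then
  echo "claude CLI not found on PATH" >&2
  exit 0   # in production, exit 1 and alert instead
fi

claude -p "Run the weekly report pipeline" \
  >> "logs/agent-$(date +%Y-%m-%d).log" 2>&1
echo "run complete"
```

Make it executable (`chmod +x automation.sh`) before pointing cron or launchd at it.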
**launchd pattern (macOS):**
```xml
<key>StartCalendarInterval</key>
<dict>
<key>Hour</key><integer>5</integer>
<key>Minute</key><integer>0</integer>
</dict>
```
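Embedded in a complete `launchd.plist`, that fragment might look like this (the label and paths are placeholders):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key><string>com.example.agent</string>
    <key>ProgramArguments</key>
    <array>
        <string>/path/to/project/automation.sh</string>
    </array>
    <key>StartCalendarInterval</key>
    <dict>
        <key>Hour</key><integer>5</integer>
        <key>Minute</key><integer>0</integer>
    </dict>
    <key>StandardOutPath</key><string>/tmp/agent.log</string>
    <key>StandardErrPath</key><string>/tmp/agent.err</string>
</dict>
</plist>
```

Install it under `~/Library/LaunchAgents/` and load it with `launchctl load`.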
**Pros:**
- Simple setup, no infrastructure cost
- Full tool access (filesystem, shell, MCP)
- Easy to test locally before any deployment
**Cons:**
- Not always-on between scheduled runs
- No mobile access without a channel plugin
- Log rotation must be managed manually
**Best for:** Development, testing, personal daily pipelines (nightly batch, ecosystem pulse).
---
## 2. Mac Mini (launchd + channels)
**How it works:** A dedicated Mac runs Claude Code inside a persistent `tmux` session.
`launchd` ensures the session restarts on reboot. Channel plugins (iMessage, Telegram)
provide mobile access.
**Setup files to scaffold:**
- `launchd.plist` -- keeps the tmux session alive across reboots
- `tmux-start.sh` -- creates/attaches the named session
- `.mcp.json` -- channel MCP server (Telegram or iMessage)
- `channels-guide.md` -- instructions for mobile interaction
**tmux session pattern:**
```bash
tmux new-session -d -s agent -x 220 -y 50
tmux send-keys -t agent "claude --model claude-sonnet-4-6" Enter
```
**Pros:**
- Always-on between scheduled runs
- Native iMessage support (macOS-only feature)
- Full GUI access for Computer Use (Desktop app)
- Local hardware, no cloud dependency
**Cons:**
- Requires dedicated physical hardware
- iMessage and Computer Use are macOS-only
- Single machine; no horizontal scaling
**Best for:** Personal always-on agent with phone access, Computer Use workflows,
home automation pipelines.
---
## 3. VPS (systemd + cron)
**How it works:** Linux server runs Claude Code headless. `systemd` manages the service
lifecycle. Cron handles scheduled tasks. Telegram or Slack provides the channel layer.
**Setup files to scaffold:**
- `systemd/agent.service` -- service unit for the agent process
- `systemd/agent.timer` -- timer unit for scheduled runs (alternative to cron)
- `automation.sh` -- invocation wrapper
- `.mcp.json` -- Telegram or Slack MCP server
- `channels-guide.md` -- instructions for team interaction
**systemd service pattern:**
```ini
[Service]
ExecStart=/path/to/automation.sh
Restart=on-failure
User=agent
Environment=HOME=/home/agent
```
**systemd timer pattern:**
```ini
[Timer]
OnCalendar=*-*-* 05:00:00
Persistent=true
```
**Pros:**
- True always-on with automatic restart on failure
- Scalable: multiple agents on one server or across servers
- Team access via shared Telegram/Slack channel
- No hardware to manage locally
**Cons:**
- No iMessage (Linux)
- No Computer Use (no display server in headless mode)
- Requires server management (security patches, disk, logs)
**Best for:** Server-side agents, team pipelines, production workloads, nightly batch
that must run reliably even when the developer's laptop is off.
---
## 4. Managed Agents (Anthropic API)
**How it works:** Anthropic hosts the agent runtime. The builder deploys via the
`/v1/agents` and `/v1/sessions` REST API, or uses `@anthropic-ai/sdk` in TypeScript/Python.
No local Claude Code installation required.
**Setup files to scaffold:**
- `agent.ts` or `agent.py` -- SDK code defining the agent
- `sessions.ts` -- session management helpers
- `.env.template` -- API key template
- `README.md` -- deployment and configuration instructions
**TypeScript pattern:**
```typescript
import Anthropic from "@anthropic-ai/sdk";
const client = new Anthropic();
const session = await client.sessions.create({ agent_id: "ag_..." });
```
**Pros:**
- Cloud-native, zero infrastructure to manage
- Scales automatically across sessions
- Persistent sessions via the API
- Integrates directly into SaaS products or APIs
**Cons:**
- Different architecture from Claude Code CLI (no local filesystem by default)
- API costs per token, no flat rate
- Less direct filesystem access than CLI agents
- Feature set tied to what the Managed Agents API exposes
**Best for:** Production deployment at scale, SaaS product integration, agents that
must be accessible to end users without CLI access.
---
## Comparison Matrix
| Dimension | Local (cron) | Mac Mini | VPS | Managed API |
|-----------|-------------|----------|-----|-------------|
| Always-on | No | Yes | Yes | Yes |
| Computer Use | No | Yes (Desktop) | No | No |
| iMessage | No | Yes | No | No |
| Telegram/Slack | Via MCP | Via MCP | Via MCP | Via integration |
| Infrastructure cost | Zero | Hardware | VPS fee | API token cost |
| Setup complexity | Low | Medium | Medium | Low (SDK) |
| Team access | No | Limited | Yes | Yes |
| Filesystem access | Full | Full | Full | Limited |
| Horizontal scaling | No | No | Manual | Automatic |
| Best environment | Dev/test | Personal | Team/prod | SaaS/prod |
## Scaffold Decision Guide
1. **Is this a personal agent for one developer?** Start with Local. Upgrade to Mac Mini
if always-on or iMessage is required.
2. **Does the agent need Computer Use?** Mac Mini with Claude Code Desktop is the only
option today. Document this constraint if the user is on Linux or Windows.
3. **Is this a team or production workload?** VPS with systemd. Add Telegram/Slack channel.
4. **Is this going into a product or SaaS?** Managed API. Scaffold TypeScript SDK code,
not a CLAUDE.md agent.
5. **Do not mix targets in one scaffold.** Pick one. Document the alternatives in the
`## Deployment` section of the generated README.

# OpenClaw vs Claude Code Feature Map
Builder reference for the agent-system-design skill. For each capability, the table shows
what to scaffold when Claude Code is the target runtime.
## Capability Coverage
| # | Capability | Status | What to scaffold | Min CC version |
|---|-----------|--------|-----------------|---------------|
| 1 | Agent Runtime | OK | `CLAUDE.md` + `settings.json` | v2.1.84 |
| 2 | Shell Execution | OK | `hooks/pre-tool-use.sh` + deny list in `settings.json` | v2.1.78 |
| 3 | File I/O | OK | `settings.json` allow list under `permissions.allow` | baseline |
| 4 | Web Search | OK | `settings.json` allow list (WebSearch tool) | baseline |
| 5 | Browser | OK | `.mcp.json` with Playwright server entry | external |
| 6 | Computer Use | Docs | README note: Desktop app required, not available headless | v2.1.86 |
| 7 | Memory | Partial | `memory/MEMORY.md` + memory block in `CLAUDE.md` | v2.1.32 |
| 8 | Multi-Agent | OK | `.claude/agents/*.md` subagent definitions | v2.1.32 |
| 9 | Messaging | Partial | `.mcp.json` Slack server + channels guide in README | v2.1.80 |
| 10 | Model Providers | Partial | `model:` frontmatter in agent `.md` files | baseline |
| 11 | Cron/Automation | OK | `automation.sh` wrapper + `launchd.plist` or `crontab` entry | v2.1.71 |
| 12 | Always-On | Partial | `launchd`/`systemd` service + `tmux` session guide | infra |
| 13 | Plugin System | OK | Plugin manifest (`plugin.json` + `CLAUDE.md`) | v2.1.84 |
| 14 | Skills | OK | `.claude/skills/*/SKILL.md` skill definitions | baseline |
| 15 | Security | OK | Hooks + permissions deny list + audit log | v2.1.78 |
| 16 | Voice/TTS | Docs | README note: MCP-based approach, no native support | N/A |
| 17 | Companion Apps | Docs | README refs to Desktop app and Dispatch channel | v2.1.85 |
| 18 | Gateway | Partial | `/schedule` skill + HTTP webhook hooks | v2.1.63 |
| 19 | Canvas/A2UI | Gap | Playwright workaround only, no native equivalent | N/A |
| 20 | Configuration | OK | `settings.json` + `CLAUDE.md` hierarchy | v2.1.84 |
| 21 | Chat Commands | OK | `.claude/skills/*/SKILL.md` (slash commands) | baseline |
| 22 | CLI | OK | Wrapper scripts (`automation.sh`, `run-agent.sh`) | baseline |
**Score: 13 full OK (59%) | 8 different approach/Partial/Docs (36%) | 1 gap (5%)**
**Minimum version for full coverage: v2.1.86** (Computer Use requires Desktop app)
## Status Key
| Status | Meaning |
|--------|---------|
| OK | Native Claude Code equivalent, scaffold directly |
| Partial | Functional but requires workaround or external integration |
| Docs | No runtime equivalent; document the limitation and alternative |
| Gap | No practical equivalent in Claude Code today |
## Scaffold Actions by Status
**OK** -- Generate the file(s) listed in "What to scaffold". Standard templates apply.
**Partial** -- Scaffold what exists, add a `## Limitations` section to the README noting
the gap and the workaround. Do not promise feature parity.
**Docs** -- Add a `## Notes` section to the README only. Do not scaffold non-existent
infrastructure. Link to the relevant Anthropic documentation or issue.
**Gap** -- Add a `## Known Gaps` section. Acknowledge the gap, document the workaround
(Playwright for Canvas/A2UI), and note if it is on the roadmap.
## Claude Code Ecosystem Map
How Claude Code components map to the OpenClaw product family:
| Claude Code component | OpenClaw equivalent | Notes |
|----------------------|--------------------|----|
| Claude Code CLI | OpenClaw core agent | Headless, full tool access |
| Claude Code Desktop | OpenClaw + macOS app | Adds Computer Use, GUI |
| Cowork | OpenClaw for non-developers | Simplified UX, no CLI |
| Dispatch | Telegram/WhatsApp channels | Mobile access layer |
| `/schedule` skill | HEARTBEAT.md cron | Scheduled agent triggers |
| Anthropic Agent SDK | OpenClaw API | Managed agents via `/v1/agents` |
## Version Compatibility Notes
- **baseline**: Available since first public Claude Code release; no version gate.
- **external**: Depends on MCP server availability, not Claude Code version.
- **infra**: Depends on the deployment host (macOS/Linux), not Claude Code version.
- **N/A**: Not applicable to Claude Code; alternative approach required.
When scaffolding for a specific Claude Code version, check that all required capabilities
meet the min version. If the user's version is below v2.1.86, exclude Computer Use from
the feature set and document it under Known Gaps.
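The version gate can be sketched with `sort -V`; the `version_ok` helper and the parsed version string are assumptions, not a documented interface:

```shell
# Return 0 if installed version >= required minimum (dotted versions, sort -V)
version_ok() {
  required="$1"; installed="$2"
  [ "$(printf '%s\n%s\n' "$required" "$installed" | sort -V | head -n1)" = "$required" ]
}

INSTALLED="2.1.84"   # e.g. parsed from the CLI's version output
if version_ok "2.1.86" "$INSTALLED"; then
  echo "Computer Use available"
else
  echo "below v2.1.86: exclude Computer Use, note it under Known Gaps"
fi
```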

# Pipeline Patterns Reference
Detailed patterns for designing multi-agent pipelines in Claude Code.
---
## The 3-agent pattern
The foundational pattern for autonomous content and analysis workflows.
**Roles:**
- **Researcher** — gathers inputs, structures knowledge, produces a brief
- **Writer** — produces primary output from the brief
- **Reviewer** — evaluates output against criteria, approves or requests revision
**When to use:** Any workflow where you need sourced input, generated output, and a quality gate. Content production, report generation, code review pipelines, competitive analysis.
**How to customize:**
| Domain | Researcher focus | Writer focus | Reviewer criteria |
|--------|-----------------|--------------|-------------------|
| Content | Web sources, existing articles, reader questions | Article draft matching voice and format | Accuracy, engagement, brand voice |
| Engineering | Codebase patterns, issue context, API docs | Implementation or PR description | Correctness, style, test coverage |
| Consulting | Client data, market research, precedents | Recommendation or slide content | Evidence quality, actionability |
| Operations | Logs, metrics, incident history | Incident report or runbook update | Completeness, clarity, ownership |
**Scaling the pattern:**
- Add a **Finalizer** agent between Reviewer and output for polish steps (SEO, formatting, compliance checks)
- Add a **Distributor** agent after output for routing (email, Slack, WordPress, Linear)
- Replace the single Researcher with a **parallel research team** (two agents gathering different source types simultaneously)
- Add a **Memory Manager** agent that reads and writes state files, keeping other agents focused on their domain
---
## The 9-step pipeline template
This is the canonical sequence for a full pipeline skill. Adapt as needed — not all steps are required for every workflow.
```
Step 1: Read project context
- Read CLAUDE.md and any project-specific config
- Establish constraints before any agent runs
Step 2: Read memory / previous state
- Load memory/MEMORY.md or data/run-state.json
- Pass relevant state to downstream agents as context
Step 3: Agent 1 — Researcher
- Invoke with: topic, constraints, memory context
- Output: structured research brief (markdown or JSON)
Step 4: Agent 2 — Writer
- Invoke with: research brief, output format spec, voice guidelines
- Output: primary draft
Step 5: Agent 3 — Reviewer
- Invoke with: draft, scoring rubric, acceptance criteria
- Output: score + pass/fail + specific revision requests
Step 6: Revision loop (conditional)
- If reviewer score < threshold: invoke Writer again with feedback
- Max 2 revision passes before escalating to human
- If max passes exceeded: save draft with NEEDS_REVIEW flag
Step 7: Save outputs
- Write final output to designated location
- Publish if automated publishing is configured
Step 8: Update memory
- Append run summary to memory file
- Update counters, timestamps, last-processed markers
Step 9: Confirm and report
- Print summary of what was produced
- List any items that need human attention
```
**Revision loop implementation note:** The loop should be explicit in the skill file. Do not rely on the agent to decide whether to loop — tell it exactly: "If the reviewer score is below 70, invoke the writer agent again with the reviewer's feedback. Do this at most twice."
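The bounded loop can be sketched directly; here `review_score` and `revise` are stubs standing in for the reviewer and writer agent invocations:

```shell
# Stubbed agent calls -- replace with real Agent tool invocations in the skill
review_score() { echo "$1"; }              # pretend the reviewer returns a score
revise() { echo "revising with feedback..." >&2; }

THRESHOLD=70
MAX_PASSES=2
passes=0
score=$(review_score 55)                   # first review of the draft

while [ "$score" -lt "$THRESHOLD" ] && [ "$passes" -lt "$MAX_PASSES" ]; do
  revise
  passes=$((passes + 1))
  score=$(review_score $((score + 20)))    # pretend each revision improves the draft
done

if [ "$score" -ge "$THRESHOLD" ]; then
  echo "PASS after $passes revision(s), score=$score"
else
  echo "NEEDS_REVIEW: score=$score after $MAX_PASSES passes"
fi
```

The hard cap on `passes` is what prevents an unattended run from looping forever.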
---
## Agent role templates
Copy these as starting points. Replace bracketed values.
### Researcher
````markdown
---
name: researcher
description: |
Use this agent to gather and structure information before writing or analysis.
<example>
Context: Pipeline needs sourced input before writing
user: "Research [topic] for this week's report"
assistant: "I'll use the researcher agent to gather sources and produce a brief."
<commentary>
Research request before production triggers the researcher.
</commentary>
</example>
model: sonnet
tools: ["Read", "Glob", "Grep", "WebSearch", "Bash"]
---
## How you work
You produce research briefs, not finished content. Your output is always structured
for a downstream writer to consume.
1. Read any existing memory or prior research on this topic
2. Gather sources using available tools (web search, local files, MCP servers)
3. Extract the 5-7 most relevant facts, quotes, or data points
4. Note source reliability and any gaps in coverage
5. Produce a brief with sections: Background, Key Points, Sources, Gaps
## Rules
- Never fabricate sources or quotes
- Mark unverified claims explicitly
- Keep briefs under 800 words unless the topic demands more
- List every source URL or file path used
## Output format
```
## Research Brief: [Topic]
Date: [date]
### Background
[2-3 sentences of context]
### Key Points
- [point 1] (source: [url/file])
- [point 2] (source: [url/file])
...
### Sources
[list of all sources consulted]
### Gaps
[what could not be verified or found]
```
````
### Writer
```markdown
---
name: writer
description: |
Use this agent to produce primary written output from a research brief or spec.
<example>
Context: Research brief is ready, article needs to be written
user: "Write the article from this brief"
assistant: "I'll use the writer agent to draft from the research brief."
<commentary>
Production request with existing brief triggers the writer.
</commentary>
</example>
model: opus
tools: ["Read", "Write", "Glob"]
---
## How you work
You produce first drafts from structured inputs. You do not research — you write.
1. Read the research brief and any style/voice guidelines
2. Read examples of approved past output for voice calibration
3. Draft the primary output following the specified format
4. Do not add information not present in the brief
5. Flag any gaps where the brief was insufficient
## Rules
- Follow the voice and format guidelines exactly
- Never add claims not supported by the brief
- Keep within specified word count ±10%
- End with a concrete takeaway or call to action
## Output format
[Specify the exact output format for your domain]
```
### Reviewer
````markdown
---
name: reviewer
description: |
Use this agent to evaluate output quality and approve or request revisions.
<example>
Context: Draft is ready for quality check
user: "Review this draft before publishing"
assistant: "I'll use the reviewer agent to score and evaluate the draft."
<commentary>
Quality evaluation request triggers the reviewer.
</commentary>
</example>
model: opus
tools: ["Read"]
---
## How you work
You evaluate drafts against defined criteria and produce a scored assessment.
1. Read the draft and the original brief or requirements
2. Score against each dimension in the rubric (see Output format)
3. Note specific issues with line references where possible
4. Produce a pass/fail decision with justification
## Rules
- Score honestly — do not inflate to avoid revision cycles
- Be specific: "paragraph 3 is vague" not "needs more detail"
- Pass threshold is 70/100 overall with no dimension below 50
## Output format
```
## Review: [Draft title]
### Scores
- Accuracy: [0-25] — [one sentence justification]
- Clarity: [0-25] — [one sentence justification]
- Completeness: [0-25] — [one sentence justification]
- Format/Voice: [0-25] — [one sentence justification]
### Overall: [total]/100
### Decision: PASS | REVISE | REJECT
### Revision requests (if REVISE)
1. [specific request]
2. [specific request]
```
````
---
## Quality gates: 4-level scoring rubric
Use this rubric in reviewer agents and pipeline acceptance criteria.
| Dimension | 0-12 (Poor) | 13-18 (Acceptable) | 19-22 (Good) | 23-25 (Excellent) |
|-----------|-------------|-------------------|--------------|-------------------|
| **Accuracy** | Multiple errors or unsupported claims | Minor errors, mostly supported | All claims verifiable | Fully sourced, no errors |
| **Clarity** | Hard to follow, jargon-heavy | Mostly clear, some confusion | Clear throughout | Immediately clear, no ambiguity |
| **Completeness** | Major gaps, incomplete | Covers main points, some gaps | Thorough coverage | Nothing missing |
| **Format/Voice** | Wrong format or tone | Mostly correct, minor deviations | Correct format and tone | Perfect fit for context |
**Thresholds:**
- 90-100: Publish immediately
- 70-89: Publish with minor edits
- 50-69: Revise and re-review
- Below 50: Reject, start over or escalate to human
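The threshold bands can be encoded as a small decision helper (the function name and labels are illustrative):

```shell
# Map an overall score to the action defined by the thresholds above
decide() {
  score="$1"
  if   [ "$score" -ge 90 ]; then echo "publish"
  elif [ "$score" -ge 70 ]; then echo "publish-with-edits"
  elif [ "$score" -ge 50 ]; then echo "revise"
  else echo "reject"
  fi
}

decide 95   # publish
decide 72   # publish-with-edits
decide 55   # revise
decide 40   # reject
```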
---
## Pipeline skill format
Pipeline skills live in `.claude/skills/<name>/SKILL.md`. They are invoked as `/plugin:skill-name` or triggered by the agent system automatically.
```markdown
---
name: weekly-report
description: |
Run the weekly report pipeline. Triggers on: "run weekly report",
"generate this week's report", "weekly pipeline"
version: 0.1.0
---
## Weekly Report Pipeline
Run these steps in order. Do not skip steps. If a step fails, stop and report the error.
### Step 1: Load context
Read `CLAUDE.md` and `memory/MEMORY.md`. Note the last run date and any pending items.
### Step 2: Research
Use the Agent tool to invoke the `researcher` agent with this prompt:
"Research [topic] for the period [date range]. Focus on [specific angle]."
Save the research brief to `data/research-brief-[date].md`.
### Step 3: Write
Use the Agent tool to invoke the `writer` agent with this prompt:
"Write the weekly report from [path to brief]. Follow the format in [style guide path]."
Save the draft to `drafts/weekly-[date].md`.
### Step 4: Review
Use the Agent tool to invoke the `reviewer` agent with this prompt:
"Review the draft at [path]. Use the standard 4-dimension rubric."
### Step 5: Handle review result
- If score >= 70: proceed to Step 6
- If score < 70 and revision count < 2: invoke writer again with reviewer feedback, then re-review
- If score < 70 after 2 revisions: save draft with NEEDS_REVIEW flag, skip to Step 7
### Step 6: Finalize
[Publishing or distribution steps]
### Step 7: Update memory
Append to `memory/MEMORY.md`:
- Date of run
- Output file path
- Review score
- Any items needing human attention
```
---
## Agent frontmatter: all valid fields
```yaml
name: <string> # required — slug, used for routing and invocation
description: | # required — trigger text + examples
<string>
model: sonnet|opus # required — model for this agent's runs
tools: [<string>, ...] # required — explicit tool allowlist
color: <string> # optional — UI color hint (green, blue, red, yellow, purple)
```
Tools available for agents: `Read`, `Write`, `Edit`, `Glob`, `Grep`, `Bash`, `WebSearch`, `WebFetch`, `Agent`, `AskUserQuestion`, and any MCP tool by its full name (e.g., `mcp__tavily__tavily_search`).
---
## How agents communicate
**Agent tool (sequential):** The orchestrating skill or parent agent uses the `Agent` tool to invoke a subagent. The subagent runs to completion and returns its output. This is the standard pattern for pipeline steps.
```
Agent tool call:
agent: researcher
prompt: "Research X and produce a brief in the format..."
→ researcher runs, returns brief text
→ parent continues with Step 2
```
**SendMessage (async / worktree):** For parallel execution, agents can be spawned in separate worktrees. Each worktree runs independently; results are assembled by the orchestrator after all complete. Use this when steps have no dependencies on each other (e.g., researching two topics simultaneously).
**Worktree isolation:** When an agent runs in a worktree, it has its own working copy of the repository. It cannot see changes made by other agents running simultaneously. Use a shared output directory (outside the worktrees) or a coordination file to merge results.
**File-based handoff (simple and reliable):** The most robust communication pattern is file-based. Each agent writes its output to a designated path; the next agent reads from that path. This works in any execution mode and produces an audit trail of intermediate outputs.
```
researcher → data/brief-2026-04-10.md
writer → reads data/brief-2026-04-10.md → drafts/article-2026-04-10.md
reviewer → reads drafts/article-2026-04-10.md → data/review-2026-04-10.md
```
For most personal and small-team pipelines, sequential execution with file-based handoff is the right choice. It is simpler to debug, easier to resume after failure, and produces a clear audit trail.
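Per run, the handoff reduces to agreeing on dated paths; a sketch using the same layout as the example above (directory names are an assumption, not a required structure):

```shell
# Create the dated handoff paths for one pipeline run
RUN_DATE="$(date +%Y-%m-%d)"
BRIEF="data/brief-$RUN_DATE.md"
DRAFT="drafts/article-$RUN_DATE.md"
REVIEW="data/review-$RUN_DATE.md"
mkdir -p data drafts

printf '## Research Brief\n' > "$BRIEF"    # researcher writes here
printf '## Draft\n' > "$DRAFT"             # writer reads $BRIEF, writes here
printf '## Review\n' > "$REVIEW"           # reviewer reads $DRAFT, writes here
ls data drafts
```

The dated filenames double as the audit trail: every intermediate output survives the run.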

# Security Patterns for Autonomous Agents
Reference for the agent-system-design skill. Covers permission modes, hook-based
guardrails, settings.json configuration, and a checklist for hardening autonomous agents.
---
## 1. Permission Modes
Four levels of approval, from most restrictive to least:
### Default (interactive approval)
Claude asks the user before every tool call. Suitable for exploratory sessions.
No configuration required.
### Auto-edit (AcceptEdits mode)
File edits are auto-approved. Bash commands still require approval.
Enables faster iteration on code without allowing arbitrary shell execution.
```json
{
"autoApprove": ["Edit", "Write"]
}
```
### Auto Mode (AI classifier)
An AI classifier evaluates each tool call and approves or blocks it based on
predicted risk. Reported metrics: 0.4% false positive rate (safe calls blocked),
5.7% false negative rate (unsafe calls approved).
Suitable for: attended automation where the developer is nearby and can intervene.
Not suitable for: fully unattended production agents.
```json
{
"autoApprove": ["auto"]
}
```
### Bypass (--dangerously-skip-permissions)
All tool calls are auto-approved without review. Must only be used inside a
sandbox (Docker, VM, or a throwaway environment with no access to production systems).
Never use on the host machine with access to real credentials or filesystems.
```bash
claude --dangerously-skip-permissions -p "run the pipeline"
```
**Decision rule:** Match the permission mode to the blast radius. The more access
the agent has, the more restrictive the approval mode must be.
---
## 2. Hook-Based Guardrails
Hooks run synchronously before and after tool calls. A non-zero exit from
`PreToolUse` blocks the tool call entirely. Five standard patterns:
### Pattern 1: Destructive Command Blocking
Block commands that cannot be undone.
```bash
# hooks/pre-tool-use.sh
BLOCKED_PATTERNS=(
"rm -rf"
"mkfs"
"dd if="
":(){ :|:& };:" # fork bomb
"> /dev/"
)
COMMAND="$CLAUDE_TOOL_INPUT_COMMAND"
for pattern in "${BLOCKED_PATTERNS[@]}"; do
if echo "$COMMAND" | grep -qF "$pattern"; then
echo "BLOCKED: destructive pattern detected: $pattern" >&2
exit 1
fi
done
```
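Before wiring the hook into `settings.json`, the matching logic can be smoke-tested in isolation; this standalone sketch mirrors the loop above with `case` globs instead of grep:

```shell
# Self-contained re-implementation of the destructive-command check
check_command() {
  cmd="$1"
  for pattern in "rm -rf" "mkfs" "dd if=" "> /dev/"; do
    case "$cmd" in
      *"$pattern"*) echo "BLOCKED: $pattern"; return 1 ;;
    esac
  done
  echo "allowed"
}

check_command "rm -rf /tmp/x" || true   # BLOCKED: rm -rf
check_command "git status"              # allowed
```

Running a handful of known-bad and known-good commands through the check catches pattern mistakes before an unattended run depends on them.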
### Pattern 2: Piped Script Execution Blocking
Block patterns that download and execute code in a single pipeline.
```bash
PIPED_EXEC_PATTERNS=(
"curl.*\|.*bash" # \| matches a literal pipe; an unescaped | would be ERE alternation
"curl.*\|.*sh"
"wget.*\|.*bash"
"wget.*\|.*sh"
)
for pattern in "${PIPED_EXEC_PATTERNS[@]}"; do
if echo "$COMMAND" | grep -qE "$pattern"; then
echo "BLOCKED: piped script execution detected" >&2
exit 1
fi
done
```
### Pattern 3: Privilege Escalation Blocking
Block commands that elevate privileges or weaken file permissions.
```bash
PRIV_PATTERNS=(
  "sudo"
  "chmod 777"
  "chmod a+x /etc"
  "shutdown"
  "reboot"
  "init 0"
)
for pattern in "${PRIV_PATTERNS[@]}"; do
  if grep -qF -- "$pattern" <<< "$COMMAND"; then
    echo "BLOCKED: privilege escalation detected" >&2
    exit 2
  fi
done
```
### Pattern 4: Path Restriction
Prevent writes outside the project directory.
```bash
INPUT="${INPUT:-$(cat)}"
TOOL_PATH="$(jq -r '.tool_input.file_path // empty' <<< "$INPUT")"
PROJECT_DIR="${CLAUDE_PROJECT_DIR:-$(cd "$(dirname "$0")/.." && pwd)}"
if [ -n "$TOOL_PATH" ]; then
  REAL_PATH="$(realpath "$TOOL_PATH" 2>/dev/null || echo "$TOOL_PATH")"
  # Require an exact match or a path under "$PROJECT_DIR/" so a sibling
  # such as /home/user/project-evil does not slip through.
  if [[ "$REAL_PATH" != "$PROJECT_DIR" && "$REAL_PATH" != "$PROJECT_DIR/"* ]]; then
    echo "BLOCKED: write outside project dir: $REAL_PATH" >&2
    exit 2
  fi
fi
```
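String-prefix checks on paths have a classic pitfall: a sibling directory such as `project-evil` shares the bare prefix `project`, so the comparison must anchor on the trailing slash. A quick demonstration:

```shell
#!/usr/bin/env bash
PROJECT_DIR="/home/agent/project"
inside() {
  # Exact match, or anything under "$PROJECT_DIR/".
  if [[ "$1" == "$PROJECT_DIR" || "$1" == "$PROJECT_DIR/"* ]]; then
    echo inside
  else
    echo outside
  fi
}
inside "/home/agent/project/src/main.py"   # inside
inside "/home/agent/project-evil/x.py"     # outside -- a bare-prefix check would wrongly allow this
```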
### Pattern 5: Audit Logging
Log every tool call with timestamp for post-hoc review.
```bash
# hooks/post-tool-use.sh
# PostToolUse input is JSON on stdin; append one line per tool call.
INPUT="$(cat)"
LOG_FILE="${CLAUDE_PROJECT_DIR:-.}/logs/audit.log"
mkdir -p "$(dirname "$LOG_FILE")"
printf '%s TOOL=%s INPUT=%s\n' \
  "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
  "$(jq -r '.tool_name' <<< "$INPUT")" \
  "$(jq -c '.tool_input' <<< "$INPUT")" >> "$LOG_FILE"
```
Combine all five patterns into a single `pre-tool-use.sh` and a separate
`post-tool-use.sh`. Keep them under 80 lines each so they are auditable at a glance.
---
## 3. settings.json Security Configuration
The full security configuration surface in `settings.json`:
```json
{
"permissions": {
"allow": [
"Bash(git:*)",
"Bash(npm run *)",
"Read(**)",
"Write(src/**)",
"Edit(src/**)"
],
"deny": [
"Bash(rm -rf *)",
"Bash(sudo *)",
"Bash(curl * | *)",
"Bash(wget * | *)",
"Write(/etc/*)",
"Write(~/.ssh/*)",
"Write(~/.zshenv)"
]
},
"hooks": {
"PreToolUse": [
{
"matcher": "Bash",
"hooks": [
{
"type": "command",
"command": "./hooks/pre-tool-use.sh"
}
]
}
],
"PostToolUse": [
{
"matcher": "*",
"hooks": [
{
"type": "command",
"command": "./hooks/post-tool-use.sh"
}
]
}
]
}
}
```
**Allow list principle:** Prefer an explicit allow list over a deny list alone.
The deny list catches known-bad patterns; the allow list enforces least privilege.
Both together provide defense in depth.
**Glob patterns in deny list:** Use `*` conservatively. `Bash(sudo *)` blocks
all sudo invocations. `Write(/etc/*)` blocks all writes to system config.
---
## 4. Security Checklist for Autonomous Agents
Verify each item before declaring an agent production-ready:
- [ ] `PreToolUse` hook is present and blocks destructive commands (Pattern 1-3 above)
- [ ] `PostToolUse` audit log is enabled and written to a persistent location
- [ ] `permissions.deny` list covers: `rm -rf *`, `sudo *`, `curl * | *`, `wget * | *`
- [ ] `permissions.allow` list is as narrow as the agent's task requires
- [ ] MEMORY.md does not contain API keys, tokens, or passwords
- [ ] `.env` is in `.gitignore`; secrets are loaded from environment, not from files tracked by git
- [ ] Deployment target matches the blast radius (see `deployment-targets.md`)
- [ ] If always-on: phone approval via permission relay is configured (v2.1.81+)
- [ ] Audit log is rotated (logrotate or launchd-managed)
- [ ] Hook scripts are executable (`chmod +x hooks/*.sh`) and checked into version control
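For the rotation item, a small logrotate stanza is enough; a sketch (the path and retention values are assumptions to adapt):

```
/path/to/project/logs/audit.log {
    weekly
    rotate 8
    compress
    missingok
    notifempty
}
```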
### Phone Approval (v2.1.81+)
For unattended agents that must occasionally escalate to a human:
```json
{
"hooks": {
"PreToolUse": [
{
"matcher": "Bash(sudo *)",
"hooks": [
{
"type": "prompt",
"prompt": "This command requires elevated privileges. Approve? (yes/no)",
"channel": "telegram"
}
]
}
]
}
}
```
The agent pauses and sends the approval request to the configured channel.
The operator approves or rejects from their phone. If no response within the
timeout, the hook exits non-zero and the command is blocked.
---
## 5. OpenClaw vs Claude Code: Security Philosophy
| Dimension | OpenClaw | Claude Code |
|-----------|---------|-------------|
| Primary mechanism | Docker sandbox (containment) | Hooks + deny list (prevention) |
| Approach | Contain the damage after the fact | Prevent the action before it runs |
| Escape risk | Container escape (low, not zero) | Hook bypass if hook is misconfigured |
| Auditability | Container logs | Audit log hook (Pattern 5) |
| Operator control | Docker network/volume flags | settings.json permissions |
| Mobile escalation | Native | Permission relay via channel MCP (v2.1.81+) |
**Containment vs Prevention:** OpenClaw runs the agent inside a Docker container with
limited network and volume mounts. If the agent does something harmful, the damage is
contained to the container. Claude Code instead prevents harmful actions from running
at all, via hooks and the permissions deny list.
**Tradeoff:** Containment is harder to escape but allows the harmful action to attempt
execution. Prevention stops it earlier but requires the hook to be correctly configured
and maintained. For production agents, combine both: run Claude Code inside Docker AND
configure hooks and deny lists.
**Combined approach (recommended for production):**
1. Run `claude` inside a Docker container with minimal volume mounts
2. Configure `PreToolUse` hooks with Patterns 1-4
3. Enable `PostToolUse` audit logging (Pattern 5)
4. Use an explicit `permissions.deny` list
5. Do not use `--dangerously-skip-permissions` outside the container
This gives containment as a last resort and prevention as the primary defense.
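Step 1 of the combined approach can be sketched as a minimal image. The npm package name below is the published CLI package; the base image, user name, and mount layout are assumptions to adapt:

```dockerfile
FROM node:22-slim
RUN npm install -g @anthropic-ai/claude-code
# A non-root user limits damage even inside the container.
RUN useradd -m agent
USER agent
WORKDIR /work
```

Run it with only the project mounted and only the key it needs, e.g. `docker run --rm -e ANTHROPIC_API_KEY -v "$PWD:/work" agent-img claude -p "run the pipeline"` — no home-directory or SSH mounts.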