agent-builder/.claude/plans/blueprints/session-8-finalization.md
Kjell Tore Guttormsen 1a776bdeb2 docs(plans): create session blueprints for Agent Factory execution
8 session blueprints covering all 27 steps across 3 waves:
- Session 1: Foundation (rename + commands, Steps 1-5)
- Session 2: Skills and templates (Steps 6-7)
- Session 3: OpenClaw patterns (memory/heartbeat/proactive/cron, Steps 9-12)
- Session 4: Paperclip patterns (context/goals/budget/governance/org-chart, Steps 14-18)
- Session 5: Self-learning (feedback/optimization, Steps 20-21)
- Session 6: Integration (Docker/transfer/5 more domains, Steps 22-24)
- Session 7: Skill updates (memory/autonomy/orchestration/governance/MCP refs, Steps 13,19,25)
- Session 8: Finalization (build command integration + v1.0, Steps 8,26,27)

Also updates plan assumptions table with verified findings.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-11 11:21:17 +02:00

Session 8: Build Command Integration and Finalization

Steps 8, 26, 27 | Wave 3 | Depends on: Sessions 1, 2, 7

Dependencies

Entry condition: Session 1 complete (commands renamed to /agent-factory:*), Session 2 complete (5 domain templates exist in scripts/templates/domains/), Session 7 complete (skill references updated in skills/agent-system-design/)

Scope Fence

Touch:

  • commands/build.md
  • .claude-plugin/plugin.json
  • CLAUDE.md
  • README.md

Never touch:

  • scripts/templates/ (read-only — already created in Sessions 2-6)
  • skills/ (read-only — already updated in Session 7)
  • agents/ (no changes needed)

Step 8: Update build command to use domain templates and new features

Files to modify

commands/build.md — Apply the following changes:

Change 1 — Rename command reference in line 7:

Old:

You are running `/agent-builder:build` — a guided 7-phase workflow for building a complete autonomous agent system with Claude Code.

New:

You are running `/agent-factory:build` — a guided 7-phase workflow for building a complete autonomous agent system with Claude Code.

Change 2 — Add Phase 0 before Phase 1:

Insert the following section immediately before the ## Phase 1: Map Your Work heading:

---

## Phase 0: Choose a Starting Point

Goal: Offer domain templates to accelerate the design process.

Ask the user using AskUserQuestion:
"Would you like to start from a domain template? Templates pre-populate the agent roles and pipeline structure for common use cases.

Available templates:
1. **content-pipeline** — Articles, newsletters, reports. Agents: researcher, writer, reviewer.
2. **code-review** — Automated PR review. Agents: analyzer, review-writer, standards-checker.
3. **monitoring** — System/service monitoring. Agents: monitor-checker, incident-reporter, remediation-advisor.
4. **research-synthesis** — Research and analysis. Agents: source-gatherer, synthesizer, fact-checker.
5. **data-processing** — Data transformation. Agents: data-validator, transformer, quality-checker.
6. **custom** — Blank start. Answer the Phase 1 questions yourself.

Enter a template name or 'custom'."

If a template is chosen (not 'custom'):
- Read `${CLAUDE_PLUGIN_ROOT}/scripts/templates/domains/{{TEMPLATE}}.md` where `{{TEMPLATE}}` is the chosen name
- Extract the agent definitions and pipeline skill template from the file
- Pre-populate the Phase 1 design sketch using the template's roles and pipeline steps
- Tell the user: "I've loaded the [template] template. In Phase 1, I'll show you the pre-built design and you can customize it."
- Skip Phase 1's interview questions and go directly to showing the pre-populated design sketch
- Ask the user: "Does this design sketch match your use case? What would you change?"
- Incorporate any changes before proceeding to Phase 2.

If 'custom' is chosen:
- Proceed normally to Phase 1.
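The template lookup in Phase 0 can be sketched as follows. This is a minimal illustration, assuming the five template names listed above and the plain string replacement that the plan uses for `{{PLACEHOLDER}}` markers. The helper names `resolve_template` and `fill_placeholders` are hypothetical and not part of the plugin.

```python
from pathlib import Path

# Template names offered in Phase 0; 'custom' means no template file is loaded.
DOMAIN_TEMPLATES = (
    "content-pipeline",
    "code-review",
    "monitoring",
    "research-synthesis",
    "data-processing",
)


def resolve_template(plugin_root, choice):
    """Return the template path for a Phase 0 choice, or None for 'custom'."""
    name = choice.strip().lower()
    if name == "custom":
        return None
    if name not in DOMAIN_TEMPLATES:
        raise ValueError(f"unknown template: {choice!r}")
    return Path(plugin_root) / "scripts" / "templates" / "domains" / f"{name}.md"


def fill_placeholders(text, values):
    """Replace {{KEY}} markers by plain string substitution (no template engine)."""
    for key, value in values.items():
        text = text.replace("{{" + key + "}}", value)
    return text
```

Rejecting unknown names before building the path keeps a mistyped answer from turning into a read of an arbitrary file under the plugin root.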

Change 3 — Replace Phase 6 (Deployment) entirely:

Replace the entire ## Phase 6: Deployment section with:

## Phase 6: Deployment

Goal: Configure automated scheduling for the pipeline.

Use the `/agent-factory:deploy` command to handle deployment configuration. Tell the user:
"I'll hand off to the deploy command to configure your deployment target."

Then invoke `/agent-factory:deploy` passing the pipeline name and agent list as context.

If the user wants to configure deployment inline without switching commands, ask using AskUserQuestion:

"Where will this pipeline run? Choose a deployment target:

1. **/schedule (cloud)** — Anthropic's cloud scheduler. Needs GitHub repo. No local file access. 1-hour minimum interval. Simplest option if your project is on GitHub.
2. **Desktop scheduled tasks** — Runs on your machine via Claude Code Desktop. Local file access. 1-minute minimum interval. Best for local automation.
3. **Local (cron/launchd)** — Traditional scheduler. Runs headless via `claude -p`. Full local access. Best for personal daily pipelines.
4. **VPS (systemd)** — Linux server with systemd service + timer. Always-on. Best for team pipelines and production workloads.
5. **Docker** — Containerized agent. Portable, isolated, reproducible. Best for consistent environments and security isolation."

For each target, follow the deployment steps from `${CLAUDE_PLUGIN_ROOT}/commands/deploy.md`.

Tell the user what files were created and the exact commands needed to activate the schedule.

Change 4 — Update Summary section:

Replace the summary block at the end of the file with:

## Summary

After all phases are complete (or the user stops early), print a summary:

AGENT SYSTEM BUILT

Agents: .claude/agents/[list]
Pipeline: .claude/skills/[name].md
Hooks: .claude/hooks/pre-tool-use.sh, post-tool-use.sh
Settings: .claude/settings.json
Schedule: automation/[script or config]

Run your pipeline: /[pipeline-name] [your topic]
Review logs: .claude/hooks/audit.log
Evaluate system: /agent-factory:evaluate
Check status: /agent-factory:status


If the user stopped early, tell them which phase to continue from and remind them that this command can be re-invoked at any time.

Next steps:
- Run `/agent-factory:evaluate` to score your system against the 22 agent capabilities
- Run `/agent-factory:status` for a quick health check of all components
- Run `/agent-factory:deploy` to configure or change your deployment target

Verify

grep -c "agent-factory" /Users/ktg/repos/agent-builder/commands/build.md

Expected: >= 3

On failure

revert — build command is critical path

Checkpoint

git commit -m "feat(commands): integrate domain templates and new commands into build workflow"

Step 26: Update build command to integrate all Phase 2-5 features

Files to modify

commands/build.md — Build on Step 8's changes. Apply the following additions:

Change 1 — Add Phase 2.5 after Phase 2:

Insert the following section immediately after the ## Phase 2: Operating Manual section (after the line "Tell the user: "Created CLAUDE.md. Review it and edit freely — this is your operating manual, not mine.""):

---

## Phase 2.5: Memory Setup

Goal: Optionally configure 3-tier memory for agents that need to remember state across runs.

Ask the user using AskUserQuestion:
"Would you like to set up 3-tier memory for your agents? This lets agents remember context across sessions.

Memory tiers:
- **Session state** (hot) — Working memory, updated every turn. Prevents data loss on crash. Agents write here before responding.
- **Daily logs** (warm) — One file per day. Captures decisions, files modified, carry-forward items.
- **Long-term memory** (cold) — Curated knowledge that persists indefinitely. Agents read this on startup.

This is recommended for pipelines that run daily and need to track progress over time. Skip if this is a one-shot or stateless pipeline."

If yes:
1. Create the `memory/` directory in the user's project
2. Read `${CLAUDE_PLUGIN_ROOT}/scripts/templates/memory/SESSION-STATE.md` and copy to `memory/SESSION-STATE.md`, replacing `{{AGENT_NAME}}` with the main pipeline agent name
3. Read `${CLAUDE_PLUGIN_ROOT}/scripts/templates/memory/MEMORY.md` and copy to `memory/MEMORY.md`
4. Read `${CLAUDE_PLUGIN_ROOT}/scripts/templates/memory/DAILY-LOG.md` — explain to the user that daily log files are created automatically with the pattern `memory/YYYY-MM-DD.md`
5. Tell each agent to read `memory/SESSION-STATE.md` first on startup by adding this instruction to each agent's "How you work" section:
   "1. Read memory/SESSION-STATE.md for current task context (write your intent here before responding)"
6. Tell the user: "Memory configured. Agents will now persist state across runs. The WAL protocol ensures no data loss on crash."

If no: skip and continue to Phase 3.
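The file operations in the memory setup can be sketched like this. It is an illustration only: the function names are hypothetical, and the refusal to overwrite an existing file is an assumption taken from the plugin rule "Always ask before overwriting existing files".

```python
from datetime import date
from pathlib import Path


def daily_log_path(project_root, day):
    """Path of the warm-tier log for one day: memory/YYYY-MM-DD.md."""
    return Path(project_root) / "memory" / f"{day.isoformat()}.md"


def install_session_state(template_text, project_root, agent_name):
    """Write memory/SESSION-STATE.md with {{AGENT_NAME}} filled in."""
    memory_dir = Path(project_root) / "memory"
    memory_dir.mkdir(parents=True, exist_ok=True)
    target = memory_dir / "SESSION-STATE.md"
    if target.exists():
        # Assumption: the caller asks the user before replacing an existing file.
        raise FileExistsError(str(target))
    target.write_text(
        template_text.replace("{{AGENT_NAME}}", agent_name), encoding="utf-8"
    )
    return target
```

For example, `daily_log_path("my-project", date(2026, 4, 11))` gives the path `my-project/memory/2026-04-11.md`.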

Change 2 — Add Phase 3.5 after Phase 3:

Insert the following section immediately after the ## Phase 3: Agent Team section (after the "Make adjustments before moving to Phase 4" line):

---

## Phase 3.5: Skills and Custom Components

Goal: Wire up specialized skills and configure the pause/resume mechanism for missing dependencies.

Ask the user using AskUserQuestion:
"Do your agents need any specialized knowledge or behaviors beyond what's been generated? For example: writing style guides, domain-specific rules, external API patterns, or custom tool workflows."

If yes, for each described need:
1. Check if a matching skill already exists in the project: Glob for `.claude/skills/*.md` and `.claude/skills/*/SKILL.md`
2. If a matching skill exists: note it and confirm it will be auto-loaded by Claude Code
3. If no matching skill exists, ask: "For [described need], I can:
   (a) Generate a skill skeleton now that you fill in
   (b) Pause and let you build it first, then resume the build
   Which do you prefer?"

If user chooses (a): generate `.claude/skills/[name].md` with basic frontmatter and placeholder content

If user chooses (b) — pause and resume:
1. Write `build-state.json` to the project root with the current state:
```json
{
  "phase": "3.5",
  "completed": ["0", "1", "2", "2.5", "3"],
  "choices": {
    "template": "[chosen template or 'custom']",
    "agents": ["[agent names]"],
    "memory": true,
    "pipeline_name": "[name]"
  },
  "paused_reason": "User building custom skill: [skill name]",
  "resume_instructions": "Run /agent-factory:build --resume when the skill is ready"
}
```
2. Tell the user: "Build paused. Create your skill at .claude/skills/[name].md, then run /agent-factory:build --resume to continue from Phase 3.5."
3. Stop — do not proceed further until resumed.

On --resume (when $ARGUMENTS contains "--resume"):
1. Read build-state.json from the project root
2. Print a summary: "Resuming build. Completed phases: [list]. Paused at: [phase]. Reason: [reason]."
3. Continue from the saved phase using the saved choices.

Change 3 — Add Phase 3.7 after Phase 3.5:

Insert the following section immediately after the Phase 3.5 section:

---

## Phase 3.7: Proactive Agent (Optional)

Goal: Enable self-improving behavior for agents that should operate autonomously over time.

Ask the user using AskUserQuestion:
"Should any of your agents be proactive — able to identify improvements to their own behavior and implement them within guardrails?

Proactive agents use two protection mechanisms:
- **ADL (Anti-Drift Limits)** — Prevents agents from faking capabilities, making unverifiable changes, or expanding scope without approval
- **VFM (Value-First Modification)** — Requires each proposed change to score >50/100 on: frequency (0-25), failure reduction (0-25), burden reduction (0-25), cost savings (0-25)

This is recommended for long-running pipelines where you want the agent to improve itself over time. Skip if you want full manual control."

If yes:
1. Ask: "Which agent(s) should be proactive?" and list the agents from Phase 3
2. For each chosen agent:
   a. Read `${CLAUDE_PLUGIN_ROOT}/scripts/templates/proactive/ADL-RULES.md` and append its contents to the agent's `.claude/agents/[name].md` as a new "## Anti-Drift Limits" section
   b. Read `${CLAUDE_PLUGIN_ROOT}/scripts/templates/proactive/VFM-SCORING.md` and append as a new "## Value-First Modification" section
3. Show the user an example VFM scoring calculation: "Before adding a new step to my pipeline, I score it: Frequency (how often this gap causes issues): 20/25. Failure reduction: 18/25. Burden reduction: 15/25. Cost savings: 5/25. Total: 58/100 → implement."
4. Tell the user: "Proactive guardrails added. Agents will self-improve within ADL/VFM boundaries."

If no: skip and continue to Phase 4.
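The VFM gate described in Phase 3.7 reduces to a sum and a threshold. The sketch below assumes only what the text states: four criteria scored 0 to 25 each, and a change is implemented when the total is strictly greater than 50. The names `vfm_total` and `vfm_approves` are illustrative.

```python
VFM_CRITERIA = ("frequency", "failure_reduction", "burden_reduction", "cost_savings")
VFM_MAX_PER_CRITERION = 25
VFM_THRESHOLD = 50  # a change must score strictly above this value


def vfm_total(scores):
    """Sum the four VFM criteria after checking each is within 0..25."""
    for name in VFM_CRITERIA:
        value = scores[name]
        if not 0 <= value <= VFM_MAX_PER_CRITERION:
            raise ValueError(f"{name} must be between 0 and {VFM_MAX_PER_CRITERION}")
    return sum(scores[name] for name in VFM_CRITERIA)


def vfm_approves(scores):
    """True when the proposed change clears the >50/100 bar."""
    return vfm_total(scores) > VFM_THRESHOLD


# The worked example from the phase text: 20 + 18 + 15 + 5 = 58, so implement.
example = {
    "frequency": 20,
    "failure_reduction": 18,
    "burden_reduction": 15,
    "cost_savings": 5,
}
```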

Change 4 — Add Phase 4.5 after Phase 4:

Insert the following section immediately after the ## Phase 4: Pipeline section (after "Tell the user: "Created .claude/skills/[pipeline-name].md. Run it with /[pipeline-name] [your topic].""):

---

## Phase 4.5: Integrations and MCP Servers

Goal: Connect agents to external services they need to do real work.

Ask the user using AskUserQuestion:
"What external services do your agents need to interact with? For example: Slack, GitHub, Linear, databases, REST APIs, file systems, browsers."

If none: skip and continue to Phase 5.

For each named service:
1. Check if a known MCP server exists for it:
   - Slack → `@anthropic-ai/mcp-server-slack`
   - GitHub → available as MCP server
   - Playwright/browser → `@anthropic-ai/mcp-server-playwright`
   - PostgreSQL/SQLite → community MCP servers available
   - Linear, Jira, Notion → community MCP servers available
   - Read `${CLAUDE_PLUGIN_ROOT}/skills/agent-system-design/references/mcp-integrations.md` for the full list

2. If MCP server exists:
   a. Generate a `.mcp.json` entry for it (create or append to `.mcp.json` in project root):
```json
{
  "mcpServers": {
    "[service-name]": {
      "command": "npx",
      "args": ["-y", "@anthropic-ai/mcp-server-[name]"],
      "env": {
        "[SERVICE]_API_KEY": "${[SERVICE]_API_KEY}"
      }
    }
  }
}
```
   b. Tell the user: "Add [SERVICE]_API_KEY to your environment. For local runs: export it in ~/.zshenv. For Docker: add it to .env."

3. If no MCP server exists for the service:
   a. Explain: "There's no pre-built MCP server for [service]. You have three options:
      (a) Use the Bash tool directly — simpler but less structured. Good for REST APIs with curl.
      (b) Find a community MCP server — search: https://github.com/modelcontextprotocol/servers
      (c) Build a custom MCP server — I can generate a skeleton using the /mcp-builder skill."
   b. Ask which option they prefer
   c. If option (a): add Bash tool to the agent's tools list and note the API pattern in the agent's system prompt
   d. If option (b): pause and provide search instructions
   e. If option (c) — pause and resume:
      - Write build-state.json with phase "4.5", add current MCP choices to "choices"
      - Tell the user: "Build paused. Build your MCP server, then run /agent-factory:build --resume."
      - Stop.

After all integrations:
1. Validate .mcp.json syntax: `python3 -c "import json; json.load(open('.mcp.json'))"` — tell user if invalid
2. Tell the user: "MCP integrations configured. Note: /schedule cloud tasks cannot use MCP servers. For cloud deployment, agents must use Bash with direct API calls instead."

Change 5 — Add Phase 5.5 after Phase 5:

Insert the following section immediately after the ## Phase 5: Security section (after "Tell the user: "Created hooks and settings.json. The audit log will be written to .claude/hooks/audit.log after each tool call.""):

---

## Phase 5.5: Governance

Goal: Define how much autonomy the agent system has and where humans stay in the loop.

Ask the user using AskUserQuestion:
"What autonomy level do you want for your agent system?

- **Level 0** — Full manual approval. Every tool call requires your OK.
- **Level 1** — Auto-approve reads only. (Read, Glob, Grep run freely. All writes need approval.)
- **Level 2** — Auto-approve file operations within the project. (Writes to project dir run freely.)
- **Level 3** — Auto-approve all except destructive operations. (Bash with non-destructive commands runs freely.)
- **Level 4** — Full autonomy with hooks as guardrails. (Everything runs unless blocked by a hook rule.)

For most personal pipelines: Level 3 or 4. For shared or production systems: Level 1 or 2."

Based on the chosen level:
1. Read `${CLAUDE_PLUGIN_ROOT}/scripts/templates/governance/GOVERNANCE.md`
2. Copy to `GOVERNANCE.md` in the project root
3. Replace `{{AUTONOMY_LEVEL}}` with the chosen level
4. Replace `{{PROJECT_NAME}}` with the project name from Phase 2
5. Ask: "Are there any specific approval gates — operations that must always pause for your review? For example: sending emails, deleting files, committing to main branch."
6. For each gate: add an entry to the Approval Gates section in GOVERNANCE.md

Ask using AskUserQuestion:
"Do you want to track agent spending? Budget tracking records estimated API costs per run and can pause agents that exceed a monthly limit."

If yes:
1. Read `${CLAUDE_PLUGIN_ROOT}/scripts/templates/budget/BUDGET.md` and copy to `budget/BUDGET.md`
2. Read `${CLAUDE_PLUGIN_ROOT}/scripts/templates/budget/budget-hook.sh` and copy to `.claude/hooks/budget-hook.sh`
3. Make budget-hook.sh executable: `chmod +x .claude/hooks/budget-hook.sh`
4. Ask: "What monthly budget limit in USD?" and set `{{BUDGET_LIMIT_CENTS}}` to (USD * 100)
5. Add budget-hook.sh to PostToolUse hooks in `.claude/settings.json`
6. Tell the user: "Budget tracking configured. Costs are estimated from token counts. A warning triggers at 80% and agents pause at 100% of the monthly limit."
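The budget arithmetic in Phase 5.5 can be illustrated with a short sketch. It assumes only the rules stated above: the limit is stored in cents as USD * 100, a warning triggers at 80% of the limit, and agents pause at 100%. The function names and the `"ok"`, `"warn"`, and `"pause"` labels are hypothetical, and this is not the logic of the actual `budget-hook.sh`.

```python
def usd_to_cents(limit_usd):
    """Convert a monthly limit in USD to the integer value for {{BUDGET_LIMIT_CENTS}}."""
    return int(round(limit_usd * 100))


def budget_status(spent_cents, limit_cents):
    """Classify spending against the limit: 'ok', 'warn' (>= 80%), or 'pause' (>= 100%)."""
    if limit_cents <= 0:
        raise ValueError("limit_cents must be positive")
    if spent_cents >= limit_cents:
        return "pause"
    # Integer comparison avoids floating-point error at the 80% boundary.
    if spent_cents * 100 >= limit_cents * 80:
        return "warn"
    return "ok"
```

Storing the limit in cents keeps the threshold checks in integer arithmetic, which matters because the costs themselves are only estimates from token counts.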

Change 6 — Add Phase 5.7 after Phase 5.5:

Insert the following section immediately after the Phase 5.5 section:

---

## Phase 5.7: Goals and Org Chart (Multi-Agent Systems)

Goal: For systems with 3 or more agents, define the hierarchy and shared goals.

Count the agents created in Phase 3. If fewer than 3: skip this phase entirely.

If 3 or more agents:

Ask the user using AskUserQuestion:
"Your system has [N] agents. Would you like to define:
(a) A **goal hierarchy** — what this system is trying to achieve at the company, project, and task levels
(b) An **org chart** — which agents report to which, and who has what authority
(c) Both
(d) Skip — I'll manage coordination manually"

If goals (a or c):
1. Read `${CLAUDE_PLUGIN_ROOT}/scripts/templates/goals/GOALS.md` and copy to `GOALS.md` in the project root
2. Ask: "What is the top-level goal this agent system serves? (e.g., 'Publish 3 high-quality articles per week')"
3. For each agent, ask: "What does [agent name] contribute toward that goal?"
4. Populate the Goals file with the hierarchy using dot-notation IDs: G1 (company goal) → G1.1 (project goal) → G1.1.1 (this agent's task goal)
5. Tell the user: "GOALS.md created. Agents can reference their goal ID to stay aligned with the overall mission."

If org chart (b or c):
1. Read `${CLAUDE_PLUGIN_ROOT}/scripts/templates/org-chart/ORG-CHART.md` and copy to `ORG-CHART.md` in the project root
2. Ask: "Which agent is the top-level orchestrator? (The one that coordinates others.)"
3. For remaining agents: ask "Does [agent] report to [orchestrator] or another agent?"
4. Populate the ORG-CHART.md table with `reportsTo` references
5. Tell the user: "ORG-CHART.md created. You (the human operator) are always the 'board' with override authority on all agents."
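The dot-notation goal IDs from Phase 5.7 can be generated mechanically. The sketch below assumes a single company goal (G1) and a single project goal (G1.1), with one task goal per agent numbered in the order given. `build_goal_ids` and `parent_id` are illustrative names, not functions from the GOALS.md template or `goal-tracker.sh`.

```python
def build_goal_ids(company_goal, project_goal, agent_contributions):
    """Map dot-notation IDs to goal text: G1 -> G1.1 -> G1.1.N (one per agent)."""
    goals = {"G1": company_goal, "G1.1": project_goal}
    for index, (agent, contribution) in enumerate(agent_contributions, start=1):
        goals[f"G1.1.{index}"] = f"{agent}: {contribution}"
    return goals


def parent_id(goal_id):
    """Return the parent goal ID, or None for the top-level goal."""
    head, separator, _ = goal_id.rpartition(".")
    return head if separator else None


goals = build_goal_ids(
    "Publish 3 high-quality articles per week",
    "Weekly article pipeline",
    [("researcher", "gathers sources"), ("writer", "drafts the article")],
)
```

Because the parent is recoverable from the ID itself, an agent prompt only needs to carry its own goal ID to locate its place in the hierarchy.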

Change 7 — Update Phase 6 deployment options (replace the Phase 6 written in Step 8 with this expanded version):

Replace the ## Phase 6: Deployment section written in Step 8 with:

## Phase 6: Deployment

Goal: Configure automated scheduling for the pipeline. Present ALL deployment options with clear trade-offs.

Use the `/agent-factory:deploy` command to handle deployment configuration. Tell the user:
"I'll hand off to the deploy command to configure your deployment target."

Then invoke `/agent-factory:deploy` passing the pipeline name, agent list, and any MCP integrations from Phase 4.5 as context.

If the user wants to configure deployment inline, ask using AskUserQuestion:

"Choose a deployment target. Here are the trade-offs:

| Target | Needs | Minimum interval | Local files | Best for |
|--------|-------|-----------------|-------------|----------|
| /schedule (cloud) | GitHub repo | 1 hour | No | Repo maintenance, CI triage, PR review |
| Desktop tasks | Desktop app running | 1 minute | Yes | Local automation while you work |
| Local cron/launchd | Machine on | Any | Yes | Personal daily pipelines |
| VPS systemd | Linux server | Any | Yes | Team pipelines, production |
| Docker | Docker installed | Any | Via volumes | Isolated, portable, reproducible |

Which target? (You can configure multiple.)"

For `/schedule (cloud)`:
1. Check GitHub connection. If not connected: explain the GitHub repo requirement.
2. Warn: "Cloud tasks cannot access local files or MCP servers. Only GitHub repo contents and allowed MCP connectors are accessible."
3. Generate a task prompt from the pipeline skill. Guide the user through `/schedule` to create the task.

For Desktop scheduled tasks:
1. Explain: "Desktop tasks run on your machine with full local file access and MCP servers available."
2. Guide through the Desktop app Schedule page. Recommend `bypassPermissionsModeAllowed: true` for unattended runs.

For Local (cron/launchd):
1. Read `${CLAUDE_PLUGIN_ROOT}/scripts/templates/automation.sh`. Copy to `automation/run-pipeline.sh` and customize.
2. Ask: "What schedule? (e.g., daily at 07:00)"
3. For macOS: read `${CLAUDE_PLUGIN_ROOT}/scripts/templates/launchd.plist`, customize, save to `automation/`
4. Provide activation: `launchctl load ~/Library/LaunchAgents/[label].plist`

For VPS (systemd):
1. Read `${CLAUDE_PLUGIN_ROOT}/scripts/templates/systemd-service.unit`. Copy and customize to `automation/claude-pipeline.service`.
2. Ask: "What Linux user runs the agent?" and "Absolute path to project?"
3. Provide activation: `systemctl enable --now claude-pipeline.timer`

For Docker:
1. Read `${CLAUDE_PLUGIN_ROOT}/scripts/templates/docker/Dockerfile` and `docker-compose.yml`. Copy and customize to project root.
2. Warn: "Never bake ANTHROPIC_API_KEY into the image. Use .env file or environment injection."
3. Provide activation: `docker compose up -d`

Tell the user what files were created and the exact commands needed to activate the schedule.
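The trade-off table in Phase 6 can also be read as data. The sketch below encodes the table's rows and filters them by two of its columns. It is an assumption-laden illustration: `0` stands in for the table's "Any" interval, Docker's "Via volumes" is treated as having local file access, and the names `Target`, `TARGETS`, and `candidate_targets` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    name: str
    needs: str
    min_interval_minutes: int  # 0 stands in for "Any" in the table
    local_files: bool


TARGETS = (
    Target("/schedule (cloud)", "GitHub repo", 60, False),
    Target("Desktop tasks", "Desktop app running", 1, True),
    Target("Local cron/launchd", "Machine on", 0, True),
    Target("VPS systemd", "Linux server", 0, True),
    Target("Docker", "Docker installed", 0, True),  # local files via volumes
)


def candidate_targets(interval_minutes, needs_local_files):
    """Names of targets that support the interval and the file-access need."""
    return [
        target.name
        for target in TARGETS
        if interval_minutes >= target.min_interval_minutes
        and (target.local_files or not needs_local_files)
    ]
```

A pipeline that runs every 30 minutes and reads local files, for instance, rules out the cloud scheduler on both counts and leaves the other four targets.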

Change 8 — Update Phase 7:

Append the following to the end of the ## Phase 7: Test and Iterate section (after "Implement that one change before closing the session."):


After the test run, ask using AskUserQuestion:
"Would you like to set up a feedback loop? This records reviewer scores and recurring issues so you can track pipeline quality over time and identify which agents need tuning."

If yes:
1. Read `${CLAUDE_PLUGIN_ROOT}/scripts/templates/feedback/FEEDBACK.md` and copy to `feedback/FEEDBACK.md`
2. Tell the user: "Log pipeline results to feedback/FEEDBACK.md after each run. After your first week, run `scripts/templates/optimization/pipeline-optimizer.sh` to get specific recommendations for improving your agents."

Change 9 — Update the Summary section (replace the summary from Step 8 with the full version):

Replace the ## Summary section written in Step 8 with:

## Summary

After all phases are complete (or the user stops early), print a summary:

AGENT SYSTEM BUILT

Agents: .claude/agents/[list]
Pipeline: .claude/skills/[name].md
Hooks: .claude/hooks/pre-tool-use.sh, post-tool-use.sh
Settings: .claude/settings.json
Schedule: automation/[script or config]
Memory: memory/ (3-tier: SESSION-STATE, daily logs, MEMORY.md)
Governance: GOVERNANCE.md (Level [N])
Budget: budget/BUDGET.md + .claude/hooks/budget-hook.sh
Goals: GOALS.md
Org chart: ORG-CHART.md
MCP config: .mcp.json
Feedback: feedback/FEEDBACK.md

Run your pipeline: /[pipeline-name] [your topic]
Review logs: .claude/hooks/audit.log
Evaluate system: /agent-factory:evaluate
Check status: /agent-factory:status
Deploy/reschedule: /agent-factory:deploy


If build was paused (build-state.json exists):
- Tell the user: "Build paused at Phase [phase]. Run `/agent-factory:build --resume` to continue."
- List completed phases and remaining phases.

If the user stopped early without pausing:
- Tell them which phase to continue from and remind them that this command can be re-invoked at any time.

Next steps after first week:
- Run `scripts/templates/optimization/pipeline-optimizer.sh` after collecting feedback
- Run `/agent-factory:evaluate` to score your system against all 22 capabilities

Change 10 — Add build-state.json resume handling at top of file:

Add the following paragraph immediately after the opening description paragraph (after "Work through phases sequentially..."):

**Resume mode:** If $ARGUMENTS contains `--resume`, read `build-state.json` from the current directory. Print: "Resuming build — completed phases: [list]. Paused at Phase [phase]. Reason: [reason]. Continuing..." Then skip all completed phases and continue from the saved phase using the saved choices. If build-state.json does not exist, tell the user: "No saved build state found. Starting fresh." and proceed normally.

**build-state.json schema:**
```json
{
  "phase": "3.5",
  "completed": ["0", "1", "2", "2.5", "3"],
  "choices": {
    "template": "monitoring",
    "agents": ["monitor-checker", "incident-reporter", "remediation-advisor"],
    "pipeline_name": "my-monitoring",
    "memory": true,
    "proactive": false,
    "mcp_servers": ["slack"],
    "autonomy_level": 3,
    "budget": true,
    "goals": false,
    "org_chart": false,
    "deployment_target": "local"
  },
  "paused_reason": "User creating custom MCP server for internal ticketing system",
  "resume_instructions": "Run /agent-factory:build --resume when MCP server is ready"
}
```

Verify

wc -l /Users/ktg/repos/agent-builder/commands/build.md

Expected: >= 600 (was 390 before Step 8, Step 26 should bring it to 600+)

Also verify the resume mechanism is present:

grep -c "build-state.json" /Users/ktg/repos/agent-builder/commands/build.md

Expected: >= 3
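For reference, the resume behavior that Change 10 asks the build command to follow can be sketched as below. It reads `build-state.json`, falls back to a fresh start when the file is missing, and lists the phases still to run. `PHASE_ORDER` is an assumption reconstructed from the phases named in Steps 8 and 26, and the function names are hypothetical.

```python
import json
from pathlib import Path

# Assumed phase sequence after Steps 8 and 26 have been applied.
PHASE_ORDER = ["0", "1", "2", "2.5", "3", "3.5", "3.7",
               "4", "4.5", "5", "5.5", "5.7", "6", "7"]


def load_build_state(project_root):
    """Return the saved state as a dict, or None when no build-state.json exists."""
    path = Path(project_root) / "build-state.json"
    if not path.exists():
        return None
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)


def resume_plan(state):
    """Return (message, remaining_phases) for a saved build state."""
    if state is None:
        return "No saved build state found. Starting fresh.", list(PHASE_ORDER)
    completed = set(state["completed"])
    start = PHASE_ORDER.index(state["phase"])
    remaining = [phase for phase in PHASE_ORDER[start:] if phase not in completed]
    message = "Resuming build. Completed phases: {}. Paused at: {}. Reason: {}.".format(
        ", ".join(state["completed"]), state["phase"], state["paused_reason"]
    )
    return message, remaining
```

Resuming at the saved phase itself, rather than the one after it, matches the instruction to continue from Phase 3.5 once the missing skill has been built.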

On failure

revert — build command is critical path, must be valid

Checkpoint

git commit -m "feat(commands): integrate all Phase 2-5 features into build workflow"

Step 27: Update plugin.json, CLAUDE.md, README.md for v1.0

Files to modify

.claude-plugin/plugin.json — Replace content with:

{
  "name": "agent-factory",
  "description": "Build and manage autonomous agent systems with Claude Code. Guided workflow with 3-tier memory, heartbeat scheduling, budget tracking, governance, org-chart, 10 domain templates, Docker deployment, and import/export. Inspired by OpenClaw and Paperclip patterns.",
  "version": "1.0.0",
  "author": {
    "name": "Kjell Tore Guttormsen",
    "url": "https://fromaitochitta.com"
  },
  "repository": "https://git.fromaitochitta.com/open/agent-factory",
  "license": "MIT",
  "keywords": [
    "agent",
    "autonomous",
    "pipeline",
    "automation",
    "hooks",
    "security",
    "deployment",
    "memory",
    "heartbeat",
    "budget",
    "governance",
    "org-chart",
    "templates",
    "import",
    "export"
  ]
}

CLAUDE.md — Replace content with:

# Agent Factory Plugin

Plugin that guides users through building complete autonomous agent systems
using Claude Code. Install via `/install agent-factory` or `--plugin-dir`.

## What this plugin does

Guides users through building their own multi-agent system end to end:
agents, skills, hooks, 3-tier memory, heartbeat scheduling, budget
tracking, governance, org-chart, MCP integrations, and deployment.
The `/agent-factory:build` command runs a guided workflow covering
all major agent system capabilities.

## Plugin structure

- `commands/` — 4 user-invoked slash commands
  - `build.md` — Guided build workflow (10 phases including memory, governance, integrations)
  - `deploy.md` — Configure deployment (5 targets: cloud, desktop, cron/launchd, systemd, Docker)
  - `evaluate.md` — Score system against 22 agent capabilities
  - `status.md` — Quick health check of all infrastructure
- `agents/` — 2 plugin agents
  - `builder.md` — Guided build orchestrator
  - `deployment-advisor.md` — Deployment target recommendation agent
- `skills/` — 2 auto-triggering knowledge skills
  - `agent-system-design/` — 22-capability feature map, pipeline patterns, security, deployment, memory, autonomy, orchestration, governance, MCP references
  - `managed-agents/` — Anthropic API managed agents patterns and SDK examples
- `scripts/templates/` — File templates the builder copies into the user's project

## Template directories

All templates are in `scripts/templates/`:

| Directory | Contents |
|-----------|----------|
| `memory/` | 3-tier memory: SESSION-STATE.md, DAILY-LOG.md, MEMORY.md |
| `heartbeat/` | Heartbeat runner, HEARTBEAT.md, context-packet.md, wake-prompt.md |
| `proactive/` | PROACTIVE-AGENT.md, ADL-RULES.md, VFM-SCORING.md |
| `cron/` | agent-turn.sh (isolated agentTurn), system-event.sh |
| `goals/` | GOALS.md goal hierarchy, goal-tracker.sh |
| `budget/` | BUDGET.md policy, budget-hook.sh, budget-report.sh |
| `governance/` | GOVERNANCE.md, approval-gate.sh |
| `org-chart/` | ORG-CHART.md, org-manager.sh |
| `docker/` | Dockerfile, docker-compose.yml, docker-entrypoint.sh |
| `domains/` | 10 domain pipeline templates (content, code-review, monitoring, research, data-processing, customer-support, devops, legal, sales, security) |
| `transfer/` | export-system.sh, import-system.sh, MANIFEST.md |
| `feedback/` | FEEDBACK.md, feedback-collector.sh, performance-scorer.sh |
| `optimization/` | pipeline-optimizer.sh, self-healing.sh |

## Rules

- Never write files outside the user's project directory
- Always ask before overwriting existing files
- Hook templates must be bash 3.2 compatible (Intel Mac)
- Generated agents must have valid YAML frontmatter with `<example>` blocks
- Use `${CLAUDE_PLUGIN_ROOT}` for all intra-plugin paths
- Domain templates use `{{PLACEHOLDER}}` syntax (plain string replace, no engine)

README.md — Complete rewrite with:

# Agent Factory

> Install one plugin. Build complete agent systems that do real work.

Agent Factory is a Claude Code plugin that guides you through building
autonomous agent systems from scratch — covering every layer from
agents and pipelines to memory, heartbeat scheduling, governance, and
deployment.

**Recommended installation:** via [ktg-plugin-marketplace](https://git.fromaitochitta.com/open/ktg-plugin-marketplace), which bundles Agent Factory with the ultra-suite (ultraplan, ultraresearch, ultraexecute).

```bash
# Install from marketplace (recommended)
/install ktg-plugin-marketplace

# Or install standalone
/install agent-factory
# or
claude --plugin-dir ./agent-factory
```

Commands

| Command | What it does |
|---------|--------------|
| `/agent-factory:build` | Guided build workflow — 10 phases from blank to deployed system |
| `/agent-factory:deploy` | Configure deployment for any target (cloud, local, VPS, Docker) |
| `/agent-factory:evaluate` | Score your system against all 22 agent capabilities |
| `/agent-factory:status` | Quick health check: agents, skills, hooks, deployment, memory |

What it builds

A complete autonomous agent system with:

Core infrastructure

  • Agent files (.claude/agents/*.md) with reliable trigger patterns
  • Pipeline skill that chains agents end to end
  • Pre-tool-use and post-tool-use security hooks
  • Settings with granular permission allow/deny rules

Memory (OpenClaw pattern)

  • Session state (hot) — Working memory with WAL protocol. Written before responding; survives crashes.
  • Daily logs (warm) — One file per day; captures decisions and carry-forward items.
  • Long-term memory (cold) — Curated knowledge that persists indefinitely.

Heartbeat scheduling (OpenClaw + Paperclip)

  • HEARTBEAT.md — Defines scheduled tasks and intervals
  • heartbeat-runner.sh — Bash 3.2 script with emptiness detection (no API call if nothing to do), startup catchup (catches up to 5 missed tasks after downtime), and task state tracking
  • Context injection — context-packet.md + wake-prompt.md give agents their full state on each wakeup

Proactive agents (OpenClaw)

  • Anti-Drift Limits (ADL) — Prevents agents from faking capabilities, expanding scope without approval, or making unverifiable changes
  • Value-First Modification (VFM) — Requires proposed self-improvements to score >50/100 before implementation

Goal hierarchy (Paperclip)

  • GOALS.md — Three-level goal tree: company → project → task
  • Simple parent_id references (not recursive traversal — matches actual Paperclip implementation)
  • Goal IDs in agent prompts for alignment

Budget tracking (Paperclip)

  • BUDGET.md — Monthly budget policy per agent and per project
  • budget-hook.sh — PostToolUse hook that logs cost events, warns at 80%, pauses at 100%
  • budget-report.sh — Per-agent cost breakdown and projection
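
The 80% warning and 100% pause thresholds can be sketched as a plain integer comparison. This is an illustrative sketch only, not the contents of `budget-hook.sh`: the function name `budget_status`, the use of whole-number cents, and the behavior for a zero limit are assumptions.

```shell
# Hypothetical helper: classify spend against a monthly limit.
# Both arguments are whole-number cents so the arithmetic stays in integers.
budget_status() {
  spent=$1
  limit=$2
  if [ "$limit" -le 0 ]; then
    echo "pause"   # ASSUMPTION: no configured budget is treated as exhausted
    return
  fi
  pct=$(( spent * 100 / limit ))
  if [ "$pct" -ge 100 ]; then
    echo "pause"
  elif [ "$pct" -ge 80 ]; then
    echo "warn"
  else
    echo "ok"
  fi
}
```

For example, `budget_status 8000 10000` lands exactly on the warning threshold, while `budget_status 7999 10000` is still below it.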

Governance (Paperclip)

  • Autonomy levels 0-4 (from "all calls need approval" to "full autonomy with hook guardrails")
  • Approval gates — specific operations that always pause for human review
  • approval-gate.sh — PreToolUse hook implementing the gate logic
  • Audit trail: tool calls, budget events, approval decisions

Org chart (Paperclip)

  • ORG-CHART.md — Agent hierarchy with reportsTo references
  • Delegation rules, cross-team routing, human override authority
  • org-manager.sh — Validates hierarchy, generates org tree visualization

MCP integrations

  • .mcp.json configuration for Slack, GitHub, Playwright, databases
  • Pause/resume build state when MCP server creation is needed
  • Trade-off guidance: MCP vs Bash vs custom server

10 domain templates

Start from a pre-built system rather than blank:

| Template | Domain | Agents |
|---|---|---|
| content-pipeline | Articles, newsletters, reports | researcher, writer, reviewer |
| code-review | Automated PR review | analyzer, review-writer, standards-checker |
| monitoring | System/service monitoring | monitor-checker, incident-reporter, remediation-advisor |
| research-synthesis | Research and analysis | source-gatherer, synthesizer, fact-checker |
| data-processing | Data transformation | data-validator, transformer, quality-checker |
| customer-support | Ticket handling | classifier, response-drafter, escalation-checker |
| devops-automation | Deployment and incidents | deploy-checker, incident-detector, runbook-executor |
| legal-review | Document review | clause-extractor, risk-assessor, compliance-checker |
| sales-intelligence | Prospect research | prospect-researcher, pitch-customizer, follow-up-tracker |
| security-audit | Security posture | config-scanner, vulnerability-checker, remediation-advisor |

Deployment (5 targets)

| Target | Needs | Local files | Best for |
|---|---|---|---|
| `/schedule` (cloud) | GitHub repo | No | Repo maintenance, PR review, CI triage |
| Desktop tasks | Desktop app | Yes | Local automation while you work |
| cron/launchd | Machine on | Yes | Personal daily pipelines |
| VPS systemd | Linux server | Yes | Team pipelines, production |
| Docker | Docker installed | Via volumes | Isolated, portable, reproducible |

Import/export

  • export-system.sh — Package your agent system into a versioned tarball with MANIFEST.md
  • import-system.sh — Import a system tarball, check for conflicts, validate all components
  • MANIFEST.md — Component list with checksums

Feedback and self-learning

  • FEEDBACK.md — Records pipeline scores and recurring issues
  • feedback-collector.sh — PostToolUse hook that detects patterns after 3+ occurrences
  • performance-scorer.sh — Per-agent metrics and improvement trends
  • pipeline-optimizer.sh — VFM-scored recommendations for bottleneck agents
  • self-healing.sh — Categorized error recovery with backoff and max-attempt limits
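
Recovery with backoff and a max-attempt limit can be sketched as a small retry wrapper. This is a generic illustration under stated assumptions, not the actual `self-healing.sh`: the function name `retry_with_backoff`, its argument order, and the choice to double the delay after each failure are hypothetical, and the error categorization mentioned above is not modeled.

```shell
# Hypothetical helper: retry a command, doubling the delay after each failure.
# Usage: retry_with_backoff MAX_ATTEMPTS BASE_DELAY_SECONDS command [args...]
retry_with_backoff() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  while :; do
    if "$@"; then
      return 0            # command succeeded
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      return 1            # give up once the attempt limit is reached
    fi
    sleep "$delay"
    delay=$(( delay * 2 ))
    attempt=$(( attempt + 1 ))
  done
}
```

A call such as `retry_with_backoff 4 2 some_command` would try up to four times, waiting 2, 4, and 8 seconds between attempts.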

Quick start

# Install and run
/install agent-factory
/agent-factory:build

# Start from a domain template
/agent-factory:build monitoring

# Resume an interrupted build
/agent-factory:build --resume

# Evaluate what you've built
/agent-factory:evaluate

# Check system health
/agent-factory:status

Architecture overview

Phase 0: Choose template (or custom)
Phase 1: Map your work — pipeline design sketch
Phase 2: Operating manual — CLAUDE.md
Phase 2.5: Memory setup — 3-tier memory
Phase 3: Agent team — .claude/agents/*.md
Phase 3.5: Skills — wire up or generate skill files
Phase 3.7: Proactive agent — ADL/VFM guardrails
Phase 4: Pipeline — .claude/skills/[name].md
Phase 4.5: Integrations — .mcp.json + MCP servers
Phase 5: Security — hooks + settings.json
Phase 5.5: Governance — GOVERNANCE.md + autonomy level
Phase 5.7: Goals and org chart — for 3+ agent systems
Phase 6: Deployment — choose and configure target
Phase 7: Test and iterate — feedback loop setup

Pattern reference

Patterns implemented by this plugin are documented in the skills:

  • 22 agent capabilities: `skills/agent-system-design/references/feature-map.md`
  • Memory patterns: `skills/agent-system-design/references/memory-patterns.md`
  • Autonomy patterns: `skills/agent-system-design/references/autonomy-patterns.md`
  • Orchestration patterns: `skills/agent-system-design/references/orchestration-patterns.md`
  • Governance patterns: `skills/agent-system-design/references/governance-patterns.md`
  • MCP integrations: `skills/agent-system-design/references/mcp-integrations.md`
  • Managed Agents (API): `skills/managed-agents/SKILL.md`

Version history

| Version | What changed |
|---|---|
| 1.0.0 | Full vision: 10 templates, memory, heartbeat, governance, budget, org-chart, Docker, import/export |
| 0.2.0 | Rename to agent-factory, add deploy/evaluate/status commands, managed-agents skill, 5 domain templates |
| 0.1.0 | Initial release: build command, builder agent, agent-system-design skill, 3 hook templates |

Value proposition

Install one plugin. Build complete agent systems that do real work. No servers, no databases, no configuration files beyond what the agents themselves manage. Everything runs in Claude Code with shell scripts you can read, edit, and understand.


### Verify

```bash
python3 -c "import json; d=json.load(open('/Users/ktg/repos/agent-builder/.claude-plugin/plugin.json')); assert d['version']=='1.0.0'; assert d['name']=='agent-factory'; print('OK')"
```

Expected: OK

Also verify CLAUDE.md lists all 13 template directories:

grep -c "memory/\|heartbeat/\|proactive/\|cron/\|goals/\|budget/\|governance/\|org-chart/\|docker/\|domains/\|transfer/\|feedback/\|optimization/" /Users/ktg/repos/agent-builder/CLAUDE.md

Expected: 13
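
Note that `grep -c` counts matching lines, not distinct matches, so the expected value of 13 holds only when each template directory sits on its own line and is mentioned nowhere else in CLAUDE.md. If that layout is not guaranteed, a stricter check can count distinct directory names instead. The sketch below is one possible approach; the function name `count_template_dirs` is hypothetical.

```shell
# Hypothetical helper: count how many distinct template directory names
# appear in a file, regardless of how many lines mention them.
count_template_dirs() {
  grep -o -E '(memory|heartbeat|proactive|cron|goals|budget|governance|org-chart|docker|domains|transfer|feedback|optimization)/' "$1" \
    | sort -u | wc -l | tr -d ' '
}
```

Running `count_template_dirs /Users/ktg/repos/agent-builder/CLAUDE.md` should print 13 when every directory is listed at least once.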

Also verify README.md has the value proposition line:

grep -c "No servers, no databases, no configuration" /Users/ktg/repos/agent-builder/README.md

Expected: 1

On failure

revert — manifest must be valid JSON with correct version

Checkpoint

git commit -m "feat: Agent Factory v1.0.0 — full vision realized"

Exit Condition

  • `grep -c "agent-factory" /Users/ktg/repos/agent-builder/commands/build.md` → `>= 3`
  • `wc -l /Users/ktg/repos/agent-builder/commands/build.md | awk '{print ($1 >= 600)}'` → `1`
  • `grep -c "build-state.json" /Users/ktg/repos/agent-builder/commands/build.md` → `>= 3`
  • `grep -c "Phase 2.5\|Phase 3.5\|Phase 3.7\|Phase 4.5\|Phase 5.5\|Phase 5.7" /Users/ktg/repos/agent-builder/commands/build.md` → `6`
  • `python3 -c "import json; d=json.load(open('/Users/ktg/repos/agent-builder/.claude-plugin/plugin.json')); assert d['version']=='1.0.0'; assert d['name']=='agent-factory'"` → no error
  • `grep -c "memory/\|heartbeat/\|proactive/\|cron/\|goals/\|budget/\|governance/\|org-chart/\|docker/\|domains/\|transfer/\|feedback/\|optimization/" /Users/ktg/repos/agent-builder/CLAUDE.md` → `13`
  • `grep -c "No servers, no databases, no configuration" /Users/ktg/repos/agent-builder/README.md` → `1`
  • `grep -c "1.0.0" /Users/ktg/repos/agent-builder/README.md` → `>= 1`

Quality Criteria

  • commands/build.md has all 6 new phases (2.5, 3.5, 3.7, 4.5, 5.5, 5.7) in correct phase-number order
  • Phase 0 domain template selection asks by name and reads the correct template file via ${CLAUDE_PLUGIN_ROOT}/scripts/templates/domains/{{TEMPLATE}}.md
  • Phase 3.5 pause/resume writes and reads build-state.json with the exact schema specified
  • Phase 4.5 covers all 3 options for missing MCP servers: Bash fallback, community search, custom build with pause
  • Phase 5.5 governance includes both GOVERNANCE.md generation and the optional budget tracking setup
  • Phase 5.7 is conditional: only shown for systems with 3+ agents
  • Phase 6 deployment table shows all 5 targets with Local files and Minimum interval columns
  • Resume logic (--resume in $ARGUMENTS) reads build-state.json and skips completed phases
  • plugin.json version is exactly "1.0.0", name is "agent-factory", has all 14 keywords including "memory", "heartbeat", "budget", "governance", "org-chart", "templates", "import", "export"
  • CLAUDE.md lists all 13 template directories in the Template directories table
  • README.md command table has all 4 commands
  • README.md deployment table has all 5 targets with trade-offs
  • README.md domain template table has all 10 templates
  • README.md ends with the value proposition: "No servers, no databases, no configuration..."
  • Steps 8 and 26 are composable: Step 26 builds on Step 8's Phase 0 and Phase 6 changes without reverting them