# Example 14: Build Your Personal Agent System

Everything before this was a demo. This is where you build something real.

Examples 01-13 showed you individual capabilities: research, file management, web search, agents, pipelines, messaging, automation, security. Individually, each one is useful. Together, they form something fundamentally different: a personal agent system that handles your recurring work, runs on schedule, and delivers results to your phone.

This is not "create a skill and connect your phone." You could have done that after reading GETTING-STARTED.md. This is: design a system of agents that understands your domain, automate a complete workflow, add safety guardrails, and test it on real work from your actual week.

**Time needed:** 90 minutes for the full build. You will iterate on it over the first week. Most people say it clicks around day 3.

**What you need:**

- A project directory (new or existing) where you will set up your personal agent system. Everything you build here is portable.
- The **plugin-dev** plugin installed in Claude Code. This is the tool that creates skills, agents, and hooks with correct structure and best practices. If you do not have it:

```
/install plugin-dev
```

---

## Phase 1: Map your work (10 minutes, no computer needed)

Before you build anything, think. The difference between a useful agent system and a novelty is whether you designed it around how you actually work.

### The three questions

**1. What do you do every week that follows a pattern?**

Not creative work. Not relationship work. The structured, repeatable tasks that eat your time because they require gathering, processing, formatting, and delivering information.

Examples:

- A weekly status report that pulls from multiple sources
- Competitor monitoring that requires searching, comparing, summarizing
- Meeting preparation that needs research, context, and a briefing doc
- Client updates that combine project data with a professional narrative

Write down 3-5 tasks. Be specific about inputs, processing, and outputs.

**2. What roles would you hire for if you could?**

Not "an assistant." Specific roles with specific expertise:

- A **researcher** who tracks your industry and surfaces what matters
- A **writer** who turns your notes into polished documents
- A **reviewer** who catches errors before anything goes out
- A **briefing officer** who summarizes what you need to know each morning
- An **analyst** who processes data and highlights patterns

Write down 2-3 roles. For each: what do they know, what do they produce, what quality standard must they meet?

**3. What must never happen automatically?**

The safety question. When your system runs at 07:00 on Monday without you watching:

- What actions should be blocked? (sending emails, deleting files, accessing certain APIs)
- What should always be logged? (all external calls, all file writes, all web searches)
- What requires your approval before executing? (anything touching production, anything external)

Write these down. They become your hooks in Phase 5.

### Design sketch

Before touching Claude Code, sketch your system on paper or in your head:

```
[Your recurring task]
         |
         v
[Agent 1: gather information] --> raw data
         |
         v
[Agent 2: process and draft] --> draft output
         |
         v
[Agent 3: review and verify] --> verified output
         |
         v
[Save to files] + [Send to phone] + [Log to memory]
```

This is the same architecture as Example 10, adapted to your work. You are not inventing something new. You are configuring a proven pattern for your specific context.

---

## Phase 2: Build your operating manual (15 minutes)

Open your project directory in Claude Code.
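Everything you build in the remaining phases lives as plain files in this directory. For orientation, a finished build typically ends up looking something like this (an illustrative sketch assuming the default `.claude/` layout; your exact file names come from what you create in Phases 2-5):

```
your-project/
├── CLAUDE.md               # operating manual (Phase 2)
├── .claude/
│   ├── agents/             # your 2-3 specialist agents (Phase 3)
│   ├── skills/             # pipeline skill(s) (Phase 4)
│   └── settings.json       # hook registrations (Phase 5)
├── hooks/
│   └── audit.log           # audit trail from your PostToolUse hook (Phase 5)
├── memory/                 # state files your pipeline reads and updates
└── pipeline-output/        # dated output files from each run
```

Because everything is plain files, the whole system is portable: copy the directory and you copy the system.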
Your first step is not writing a CLAUDE.md yourself. It is asking Claude to interview you and write one.

### The prompt

```
Interview me to create a CLAUDE.md for this project. Ask me questions about:

1. Who I am and what I do
2. How I like to communicate (format, tone, length)
3. What I am working on right now (top 3-5 priorities)
4. What tools and services I use
5. What you should never do without my permission

Ask one category at a time. After all five, write a CLAUDE.md that would
let you handle my Monday morning without additional context.
```

Claude will ask you questions, listen to your answers, and produce a CLAUDE.md tailored to what you said. This is better than writing it yourself because Claude knows what information it needs to be effective.

### What makes a great CLAUDE.md

After Claude writes it, check:

| Criteria | Test |
|----------|------|
| Specific enough | Could someone handle your Monday with only this file? |
| Priorities current | Does it reflect this week's work, not last month's? |
| Boundaries clear | Are the "never do" rules unambiguous? |
| Tools documented | Does it list what MCP servers and channels are available? |
| Format preferences stated | Will Claude produce output in the shape you actually want? |

If any answer is "no," tell Claude what to fix. Iterate until it passes all five. This is the foundation of everything that follows.

**Pattern from Example 05:** CLAUDE.md loads at every session start. Every agent you create in Phase 3 inherits this context automatically.

---

## Phase 3: Create your agent team (20 minutes)

Now build the 2-3 agents you identified in Phase 1.

You use the **plugin-dev** plugin to create them. It knows the correct file format, frontmatter fields, system prompt structure, and triggering conditions. You describe what you need; it builds the agent properly.

### Create your first agent

Pick the role you need most.
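It helps to know what you are about to produce: an agent is a small markdown file with YAML frontmatter plus a system prompt. A hedged sketch of the shape (field values here are invented placeholders; plugin-dev generates the real fields and wording for you):

```
---
name: industry-researcher
description: Gathers and summarizes developments in [your industry].
  Use when a task needs fresh external information with sources.
tools: WebSearch, Read, Write
---

You are a research specialist for [your domain]. Always cite a source URL
for every claim. Never present speculation as fact.
```

You do not write this by hand; the point of the next command is that plugin-dev asks the right questions and emits a well-formed file.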
Run:

```
/agent-development
```

The plugin-dev skill will guide you through creating the agent: what it does, what tools it needs, when it should trigger, and what its system prompt should contain. Describe your agent clearly:

```
I need an agent called "[role-name]" that [what it does].

It should:
- [Specific capability 1]
- [Specific capability 2]
- [Quality standard it must meet]

It should NOT:
- [Boundary 1]
- [Boundary 2]
```

The plugin creates the file in `.claude/agents/` with proper frontmatter (name, description, model hint, tools) and a focused system prompt.

**Concrete examples for three different people:**

A marketing manager might say:

```
/agent-development

I need an agent called "market-scanner" that monitors competitor activity.
It should search the web for news about [competitor names], extract pricing
changes, new feature launches, and partnership announcements. It should
produce a structured report with source URLs for every claim. It should not
include rumors or unverified information.
```

An engineering lead might say:

```
/agent-development

I need an agent called "tech-researcher" that evaluates technologies for our
stack. It should search documentation and GitHub repos, compare features,
check community health (stars, issues, release frequency), and produce a
recommendation with pros, cons, and risks. It should not recommend anything
without checking the last 3 months of GitHub issues for dealbreakers.
```

A consultant might say:

```
/agent-development

I need an agent called "client-briefer" that prepares meeting briefs. It
should research the client's company (recent news, financials, leadership
changes), review my notes in memory/, and produce a one-page briefing with
talking points and potential questions they will ask. It should not include
anything I cannot verify in the meeting.
```

### Create your second agent

Every good system needs a quality gate.
Run `/agent-development` again for a reviewer:

```
I need an agent called "[domain]-reviewer" that reviews output from my other
agents. It should check for:
- [Your specific accuracy requirements]
- [Your formatting standards]
- [Common mistakes in your domain]

When it finds issues, it should list them with specific fix instructions,
not vague feedback.
```

### Create your third agent (optional but powerful)

A writer/formatter that takes raw output and shapes it for your audience. Run `/agent-development`:

```
I need an agent called "[output]-writer" that transforms research and
analysis into [format your audience expects]. It should follow [your
organization's style/tone]. It should be concise: [your word limit].
```

### Test each agent individually

Before combining them, verify each one works alone:

```
Use the [agent-name] agent to [a specific task from your real work].
Show me the full output.
```

Does the output meet your standards? If not, tell Claude what to fix in the agent file. Iterate until each agent produces work you would use. This is worth the time. A weak agent in a pipeline produces weak pipeline output.

**Pattern from Example 06:** The researcher-writer-reviewer pattern works for almost any domain. Your agents are the same pattern with your domain expertise built in.

---

## Phase 4: Build your pipeline (20 minutes)

Now the payoff. You combine your agents into a pipeline skill that handles a complete workflow.

### Create the pipeline skill

Run `/skill-development` to create the pipeline skill. This is more complex than a simple task skill because it orchestrates multiple agents, so the plugin-dev guided workflow helps you get the frontmatter, description, and triggering conditions right. Describe your pipeline:

```
I need a skill called "weekly-[your-workflow]" that runs my complete
[task name] workflow.

Steps:
1. Read CLAUDE.md for current priorities and context
2. Read memory/[your-state-file].md for what happened last time
3. Use the [agent-1] agent to [gather/research/analyze]
4. Use the [agent-2] agent to [draft/format/write]
5. Use the [agent-3] agent to [review/verify/check]
6. If the reviewer finds issues, send back to [agent-2] for fixes
7. Save the final output to pipeline-output/[your-output]-[date].md
8. Update memory/[your-state-file].md with what was done and when
9. Show me the first 10 lines of the output to confirm
```

After the skill is created, run `/skill-reviewer` to check its quality. The reviewer catches common issues: vague descriptions, missing triggering conditions, incomplete step definitions.

**Concrete example for a marketing manager:**

```
/skill-development

I need a skill called "weekly-competitive-intel" that runs my Monday morning
competitive intelligence workflow.

Steps:
1. Read CLAUDE.md for the list of competitors I track
2. Read memory/competitive-intel-state.md for what was covered last week
3. Use the market-scanner agent to research each competitor's activity this
   week (news, product changes, pricing, partnerships)
4. Use the report-writer agent to draft a structured briefing:
   - Executive summary (3 bullets)
   - Per-competitor section with changes and source URLs
   - "So what" section: what this means for our strategy
5. Use the quality-reviewer agent to verify all claims have sources and no
   speculation is presented as fact
6. If the reviewer finds issues, send back to report-writer for fixes
7. Save to pipeline-output/competitive-intel-[date].md
8. Update memory/competitive-intel-state.md with today's date and what was found
9. Show me the executive summary
```

### Test the pipeline

Run it on a real task:

```
/weekly-[your-workflow]
```

Watch the agents hand off to each other. This is the same orchestration you saw in Example 10, but configured for your work. The pipeline should take 2-5 minutes depending on how much research is involved.

### Evaluate the output

Read the output as if your manager, client, or colleague sent it to you.
| Question | If the answer is no |
|----------|---------------------|
| Is it accurate? | Tighten the reviewer agent's instructions |
| Is it the right format? | Adjust the writer agent's output template |
| Is it the right depth? | Change the researcher agent's scope |
| Would you send this as-is? | Identify what's missing and update the pipeline |

Iterate. Run the pipeline again after each adjustment. Most people need 2-3 rounds to get output they are genuinely satisfied with.

**Pattern from Example 10:** The pipeline is a recipe with interchangeable ingredients. Swap agents, change topics, adjust the output format. The orchestration stays the same.

---

## Phase 5: Add your safety layer (10 minutes)

Your pipeline will soon run automatically. Before that happens, set up the guardrails you identified in Phase 1.

### Create your custom hooks

Run `/hook-development` for each hook. The plugin-dev guided workflow handles the hook event type (PreToolUse, PostToolUse), the matching logic, registration in settings.json, and testing.

**For blocking dangerous commands:**

```
/hook-development

I need a PreToolUse hook that blocks Bash commands containing:
- [pattern 1: e.g., "rm -rf", "DROP TABLE"]
- [pattern 2: e.g., commands targeting my production directories]
- [pattern 3: e.g., curl commands with API tokens in arguments]
```

**For audit logging:**

```
/hook-development

I need a PostToolUse hook that logs every tool call to hooks/audit.log with
timestamp, tool name, and a summary of what was done.
```

The plugin creates the shell scripts and registers them in `.claude/settings.json` so they run on every tool call.

### Verify the hooks work

```
Try running: [a command your hook should block]
Then check hooks/audit.log to confirm it was blocked and logged.
```

**Pattern from Example 09:** Hooks run on every tool call, including inside agent invocations and automated pipeline runs. What you configure here protects every automated run from Phase 6 onward.
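To make "blocked" concrete, here is a minimal sketch of the pattern-matching logic inside such a PreToolUse hook. It is illustrative only: the patterns are the examples from Phase 1, and the real script that plugin-dev generates also reads the tool-call JSON from stdin and gets registered in `.claude/settings.json` for you.

```shell
#!/usr/bin/env bash
# Illustrative PreToolUse blocking logic (not the generated hook itself).
# In Claude Code, a PreToolUse hook that exits with code 2 blocks the tool
# call, and its stderr is fed back to Claude as the reason.

check_command() {
  local cmd="$1" pattern
  for pattern in "rm -rf" "DROP TABLE"; do   # example patterns from Phase 1
    if [[ "$cmd" == *"$pattern"* ]]; then
      echo "BLOCK: matches forbidden pattern '$pattern'"
      return 2   # the real hook would `exit 2` here to block the call
    fi
  done
  echo "ALLOW"
  return 0
}

check_command "git status"              # prints ALLOW
check_command "rm -rf build/" || true   # prints BLOCK: matches forbidden pattern 'rm -rf'
```

The generated hook wires this check to the command extracted from the stdin JSON and exits with the check's return code; `/hook-development` handles that plumbing so you only supply the patterns.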
---

## Phase 6: Connect and automate (10 minutes)

### Set up your phone channel

```
claude --channels
```

Follow the setup for your chosen channel (Telegram recommended for first setup). See `messaging/` in this repo for detailed guides.

Test: send "hello" from your phone. If Claude responds, you are connected.

### Schedule your pipeline

Tell Claude:

```
Create a cron job that runs /weekly-[your-workflow] every [day] at [time].
Show me the cron entry before creating it.
```

Or for remote scheduling:

```
/schedule "Run /weekly-[your-workflow] and send me the executive summary via
Telegram" at [date]T[time]:00
```

### Test the full flow

The real test. From your phone, send:

```
Run /weekly-[your-workflow]
```

Wait for the result. When the pipeline output arrives on your phone, your system is working end-to-end: phone trigger, agent orchestration, quality review, file output, delivery.

**Pattern from Example 08:** /loop for testing, CronCreate for production, /schedule for remote triggers.

---

## Phase 7: Run it on real work (15 minutes)

This is the moment that matters. Everything up to here was setup. Now you test it on an actual task from your actual week.

### Pick a real task

Not a test. Not a hypothetical. Something you need to do this week that your pipeline is designed to handle. A real report, a real briefing, a real analysis.

### Run it

```
/weekly-[your-workflow]
```

### Evaluate honestly

Read the output. Compare it to what you would have produced manually.

**The scoring rubric:**

| Score | Meaning | What to do |
|-------|---------|------------|
| "I would send this as-is" | Your system works. Ship it. | Minor tweaks over time |
| "Close, needs 10 min of editing" | Your system saves 80% of the work | Refine agent instructions for the gaps |
| "The structure is right but content is off" | Agents need better domain context | Add more detail to CLAUDE.md and agent files |
| "This is not usable" | Pipeline design mismatch | Go back to Phase 1 and re-map the workflow |

Most people land on "close, needs 10 minutes of editing" on the first real run. By the third run, the agents have been refined enough that the output is send-ready.

---

## What you built

| Component | What it does | How to change it |
|-----------|--------------|------------------|
| CLAUDE.md | Operating manual for your domain | Update weekly with priorities |
| 2-3 agents | Specialists for your recurring tasks | Refine when output misses the mark |
| Pipeline skill | Chains agents into a complete workflow | Add steps, swap agents, change output |
| Custom hooks | Protect automated runs | Add patterns as you automate more |
| Scheduling | Runs the pipeline on autopilot | Adjust frequency based on need |
| Phone channel | Access from anywhere | Add more channels over time |
| Memory | Tracks what was done and when | Grows automatically with each run |

This is not a configured tool. This is a system. A GETTING-STARTED reader has a Claude Code setup with preferences. You have a multi-agent pipeline that handles a complete recurring workflow, runs on schedule, reviews its own output for quality, logs every action for safety, and delivers results to your phone.

---

## Your first week

**Day 1:** Run the pipeline manually. Fix what is wrong with the output. Refine agent instructions.

**Day 2:** Run it again. Compare to yesterday. Is it better? Add domain knowledge to CLAUDE.md that would have prevented yesterday's mistakes.

**Day 3:** Let the cron job run it. Check the output when it arrives on your phone. The moment it produces something useful without you touching anything: that is the moment it clicks.

**Day 4-5:** Add a second pipeline skill for another recurring task. You already have agents.
Creating a new pipeline that reuses them takes 10 minutes.

**Day 6-7:** Review your hooks and audit log. Are the right things being blocked? Is the log useful? Tighten security now that you trust the system to run autonomously.

**After one week:** You have 1-2 automated pipelines, 2-3 specialized agents, and a growing memory of your work context. Routine Monday tasks take one phone message instead of two hours. You will know whether this changed how you work.

---

## Growing from here

**More agents.** As you notice new recurring patterns, run `/agent-development` to create agents for them. Each agent you add is a building block that any pipeline skill can use. Run `/skill-reviewer` on existing agents periodically to catch quality drift.

**More pipelines.** A pipeline is just a skill that calls agents in sequence. Run `/skill-development` to create new pipelines. Once you have 3-4 agents, creating a new pipeline for a different workflow takes minutes.

**More integrations.** MCP servers connect your agents to external services: Slack, Google Drive, databases, APIs. Run `/mcp-integration` to add new servers with correct configuration. Each integration multiplies what your pipelines can do without manual data gathering.

**Tighter security.** As automation increases, so should guardrails. Run `/hook-development` to add domain-specific hook patterns. Review audit logs. Set up alerts for blocked actions that might indicate a prompt going wrong.

**Validate your setup.** Run `/plugin-validator` periodically to check that all your files have correct structure, frontmatter, and cross-references. It catches problems before they cause silent failures in automated runs.

**Shared agents.** The agents and pipelines you build are markdown files. You can share them with colleagues, version control them, or use them as templates for other projects. The system is portable.

---

## Honest assessment

This system will not replace your expertise, your relationships, or your judgment.
It replaces the scaffolding: gathering, formatting, cross-checking, delivering, and remembering. The work that has to happen before and after the real thinking.

The people who get the most from it share three traits:

1. They are specific about their domain in CLAUDE.md
2. They iterate on agent instructions instead of accepting first-draft output
3. They actually use it for real work, not just demos

The time investment is real: 90 minutes to build, 15 minutes a week to maintain and improve. The return scales with how well you defined your workflow in Phase 1. A system designed around vague goals produces vague results. A system designed around "every Monday I need X from sources Y in format Z" produces something you actually use.

If you followed the examples and built this system, you understand more about personal AI agent architecture than most people in the industry. Not because the technology is hard. Because the thinking in Phase 1, designing your work as a system of agents, is the part most people skip.