# Example 14: Build Your Personal Agent System

Everything before this was a demo. This is where you build
something real.

Examples 01-13 showed you individual capabilities: research, file
management, web search, agents, pipelines, messaging, automation,
security. Individually, each one is useful. Together, they form
something fundamentally different: a personal agent system that
handles your recurring work, runs on a schedule, and delivers
results to your phone.

This is not "create a skill and connect your phone." You could have
done that after reading GETTING-STARTED.md. This is: design a system
of agents that understands your domain, automate a complete workflow,
add safety guardrails, and test it on real work from your actual week.

**Time needed:** 90 minutes for the full build. You will iterate on
it over the first week. Most people say it clicks around day 3.

**What you need:**

- A project directory (new or existing) where you will set up your
  personal agent system. Everything you build here is portable.
- The **plugin-dev** plugin installed in Claude Code. This is the
  tool that creates skills, agents, and hooks with correct structure
  and best practices. If you do not have it:

```
/install plugin-dev
```

---
## Phase 1: Map your work (10 minutes, no computer needed)

Before you build anything, think. The difference between a useful
agent system and a novelty is whether you designed it around how
you actually work.

### The three questions

**1. What do you do every week that follows a pattern?**

Not creative work. Not relationship work. The structured, repeatable
tasks that eat your time because they require gathering, processing,
formatting, and delivering information.

Examples:

- A weekly status report that pulls from multiple sources
- Competitor monitoring that requires searching, comparing, summarizing
- Meeting preparation that needs research, context, and a briefing doc
- Client updates that combine project data with a professional narrative

Write down 3-5 tasks. Be specific about inputs, processing, and outputs.

**2. What roles would you hire for if you could?**

Not "an assistant." Specific roles with specific expertise:

- A **researcher** who tracks your industry and surfaces what matters
- A **writer** who turns your notes into polished documents
- A **reviewer** who catches errors before anything goes out
- A **briefing officer** who summarizes what you need to know each morning
- An **analyst** who processes data and highlights patterns

Write down 2-3 roles. For each: what do they know, what do they produce,
what quality standard must they meet?

**3. What must never happen automatically?**

The safety question. When your system runs at 07:00 on Monday without
you watching:

- What actions should be blocked? (sending emails, deleting files, accessing certain APIs)
- What should always be logged? (all external calls, all file writes, all web searches)
- What requires your approval before executing? (anything touching production, anything external)

Write these down. They become your hooks in Phase 5.

### Design sketch

Before touching Claude Code, sketch your system on paper or in your head:

```
[Your recurring task]
        |
        v
[Agent 1: gather information] --> raw data
        |
        v
[Agent 2: process and draft] --> draft output
        |
        v
[Agent 3: review and verify] --> verified output
        |
        v
[Save to files] + [Send to phone] + [Log to memory]
```

This is the same architecture as Example 10, adapted to your work.
You are not inventing something new. You are configuring a proven
pattern for your specific context.

---
## Phase 2: Build your operating manual (15 minutes)

Open your project directory in Claude Code. Your first step is not
writing a CLAUDE.md yourself. It is asking Claude to interview you
and write one.

### The prompt

```
Interview me to create a CLAUDE.md for this project. Ask me questions
about:
1. Who I am and what I do
2. How I like to communicate (format, tone, length)
3. What I am working on right now (top 3-5 priorities)
4. What tools and services I use
5. What you should never do without my permission

Ask one category at a time. After all five, write a CLAUDE.md that
would let you handle my Monday morning without additional context.
```

Claude will ask you questions, listen to your answers, and produce
a CLAUDE.md tailored to what you said. This is better than writing
it yourself because Claude knows what information it needs to be
effective.

### What makes a great CLAUDE.md

After Claude writes it, check:

| Criterion | Test |
|-----------|------|
| Specific enough | Could someone handle your Monday with only this file? |
| Priorities current | Does it reflect this week's work, not last month's? |
| Boundaries clear | Are the "never do" rules unambiguous? |
| Tools documented | Does it list what MCP servers and channels are available? |
| Format preferences stated | Will Claude produce output in the shape you actually want? |

If any answer is "no," tell Claude what to fix. Iterate until it
passes all five. This is the foundation of everything that follows.
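Once it passes, a CLAUDE.md skeleton might look roughly like this (a
hypothetical example for the marketing-manager persona used later in
this guide; your sections and details will differ):

```
# CLAUDE.md

## Who I am
Marketing manager at a B2B software company. I own competitive
intelligence and the Monday exec briefing.

## How I communicate
Short bullets over long prose. Every factual claim needs a source
URL. Reports fit on one page.

## Current priorities
1. Q3 pricing-page refresh
2. Competitor X's new bundle launch
3. Monday morning competitive briefing

## Tools and services
Telegram channel for delivery. Web search. memory/ for state
between runs.

## Never do without my permission
Send anything external. Delete files outside pipeline-output/.
Touch production systems.
```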

**Pattern from Example 05:** CLAUDE.md loads at every session start.
Every agent you create in Phase 3 inherits this context automatically.

---
## Phase 3: Create your agent team (20 minutes)

Now build the 2-3 agents you identified in Phase 1. You use the
**plugin-dev** plugin to create them. It knows the correct file
format, frontmatter fields, system prompt structure, and triggering
conditions. You describe what you need; it builds the agent properly.

### Create your first agent

Pick the role you need most. Run:

```
/agent-development
```

The plugin-dev skill will guide you through creating the agent:
what it does, what tools it needs, when it should trigger, and
what its system prompt should contain. Describe your agent clearly:

```
I need an agent called "[role-name]" that [what it does].

It should:
- [Specific capability 1]
- [Specific capability 2]
- [Quality standard it must meet]

It should NOT:
- [Boundary 1]
- [Boundary 2]
```

The plugin creates the file in `.claude/agents/` with proper
frontmatter (name, description, model hint, tools) and a focused
system prompt.
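As a rough illustration, the generated file might look like this (the
exact frontmatter fields, tool names, and wording below are assumptions
for illustration; plugin-dev decides the real structure):

```
---
name: market-scanner
description: Monitors competitor activity and produces sourced
  reports. Use for competitive intelligence requests.
tools: WebSearch, Read, Write
---

You are a market research specialist. For each competitor listed in
CLAUDE.md, find news, pricing changes, feature launches, and
partnerships. Every claim must include a source URL. Never include
rumors or unverified information.
```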

**Concrete examples for three different people:**

A marketing manager might say:

```
/agent-development

I need an agent called "market-scanner" that monitors competitor
activity. It should search the web for news about [competitor names],
extract pricing changes, new feature launches, and partnership
announcements. It should produce a structured report with source URLs
for every claim. It should not include rumors or unverified information.
```

An engineering lead might say:

```
/agent-development

I need an agent called "tech-researcher" that evaluates technologies
for our stack. It should search documentation and GitHub repos, compare
features, check community health (stars, issues, release frequency),
and produce a recommendation with pros, cons, and risks. It should not
recommend anything without checking the last 3 months of GitHub issues
for dealbreakers.
```

A consultant might say:

```
/agent-development

I need an agent called "client-briefer" that prepares meeting briefs.
It should research the client's company (recent news, financials,
leadership changes), review my notes in memory/, and produce a
one-page briefing with talking points and potential questions they
will ask. It should not include anything I cannot verify in the
meeting.
```

### Create your second agent

Every good system needs a quality gate. Run `/agent-development`
again for a reviewer:

```
I need an agent called "[domain]-reviewer" that reviews output from
my other agents. It should check for:
- [Your specific accuracy requirements]
- [Your formatting standards]
- [Common mistakes in your domain]

When it finds issues, it should list them with specific fix
instructions, not vague feedback.
```

### Create your third agent (optional but powerful)

A writer/formatter that takes raw output and shapes it for your
audience. Run `/agent-development`:

```
I need an agent called "[output]-writer" that transforms research
and analysis into [format your audience expects]. It should follow
[your organization's style/tone]. It should be concise: [your word
limit].
```

### Test each agent individually

Before combining them, verify each one works alone:

```
Use the [agent-name] agent to [a specific task from your real work].
Show me the full output.
```

Does the output meet your standards? If not, tell Claude what to fix
in the agent file. Iterate until each agent produces work you would
use. This is worth the time. A weak agent in a pipeline produces
weak pipeline output.

**Pattern from Example 06:** The researcher-writer-reviewer pattern
works for almost any domain. Your agents are the same pattern with
your domain expertise built in.

---
## Phase 4: Build your pipeline (20 minutes)

Now the payoff. You combine your agents into a pipeline skill that
handles a complete workflow.

### Create the pipeline skill

Run `/skill-development` to create the pipeline skill. This is more
complex than a simple task skill because it orchestrates multiple
agents, so the plugin-dev guided workflow helps you get the
frontmatter, description, and triggering conditions right.

Describe your pipeline:

```
I need a skill called "weekly-[your-workflow]" that runs my complete
[task name] workflow.

Steps:
1. Read CLAUDE.md for current priorities and context
2. Read memory/[your-state-file].md for what happened last time
3. Use the [agent-1] agent to [gather/research/analyze]
4. Use the [agent-2] agent to [draft/format/write]
5. Use the [agent-3] agent to [review/verify/check]
6. If the reviewer finds issues, send back to [agent-2] for fixes
7. Save the final output to pipeline-output/[your-output]-[date].md
8. Update memory/[your-state-file].md with what was done and when
9. Show me the first 10 lines of the output to confirm
```

After the skill is created, run `/skill-reviewer` to check its
quality. The reviewer catches common issues: vague descriptions,
missing triggering conditions, incomplete step definitions.

**Concrete example for a marketing manager:**

```
/skill-development

I need a skill called "weekly-competitive-intel" that runs my
Monday morning competitive intelligence workflow.

Steps:
1. Read CLAUDE.md for the list of competitors I track
2. Read memory/competitive-intel-state.md for what was covered last week
3. Use the market-scanner agent to research each competitor's
   activity this week (news, product changes, pricing, partnerships)
4. Use the report-writer agent to draft a structured briefing:
   - Executive summary (3 bullets)
   - Per-competitor section with changes and source URLs
   - "So what" section: what this means for our strategy
5. Use the quality-reviewer agent to verify all claims have sources
   and no speculation is presented as fact
6. If the reviewer finds issues, send back to report-writer for fixes
7. Save to pipeline-output/competitive-intel-[date].md
8. Update memory/competitive-intel-state.md with today's date
   and what was found
9. Show me the executive summary
```

### Test the pipeline

Run it on a real task:

```
/weekly-[your-workflow]
```

Watch the agents hand off to each other. This is the same orchestration
you saw in Example 10, but configured for your work. The pipeline
should take 2-5 minutes depending on how much research is involved.

### Evaluate the output

Read the output as if your manager, client, or colleague sent it to you.

| Question | If no |
|----------|-------|
| Is it accurate? | Tighten the reviewer agent's instructions |
| Is it the right format? | Adjust the writer agent's output template |
| Is it the right depth? | Change the researcher agent's scope |
| Would you send this as-is? | Identify what's missing and update the pipeline |

Iterate. Run the pipeline again after each adjustment. Most people
need 2-3 rounds to get output they are genuinely satisfied with.

**Pattern from Example 10:** The pipeline is a recipe with
interchangeable ingredients. Swap agents, change topics, adjust the
output format. The orchestration stays the same.

---
## Phase 5: Add your safety layer (10 minutes)

Your pipeline will soon run automatically. Before that happens,
set up the guardrails you identified in Phase 1.

### Create your custom hooks

Run `/hook-development` for each hook. The plugin-dev guided workflow
handles the hook event type (PreToolUse, PostToolUse), the matching
logic, registration in settings.json, and testing.

**For blocking dangerous commands:**

```
/hook-development

I need a PreToolUse hook that blocks Bash commands containing:
- [pattern 1: e.g., "rm -rf", "DROP TABLE"]
- [pattern 2: e.g., commands targeting my production directories]
- [pattern 3: e.g., curl commands with API tokens in arguments]
```
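To give a feel for what such a hook does, here is a minimal sketch of
the matching logic (illustrative only: plugin-dev writes the real
script, and the patterns and wiring shown are assumptions, not its
actual output):

```shell
#!/bin/sh
# Sketch of a PreToolUse blocking hook's matching logic (hypothetical;
# plugin-dev generates the real script).

is_blocked() {
  # Return 0 (true) when the command text matches a forbidden pattern.
  case "$1" in
    *"rm -rf"*|*"DROP TABLE"*) return 0 ;;
    *) return 1 ;;
  esac
}

# An installed hook would read the tool-call payload from stdin and
# signal a block to Claude Code, e.g.:
#   payload=$(cat)
#   is_blocked "$payload" && { echo "blocked by safety hook" >&2; exit 2; }
```

The point of the sketch: blocking is plain substring matching plus an
exit status, which is why it is cheap to add more patterns later.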

**For audit logging:**

```
/hook-development

I need a PostToolUse hook that logs every tool call to
hooks/audit.log with timestamp, tool name, and a summary of
what was done.
```
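The audit side is even simpler. A sketch of what the logger might do
(again hypothetical; the helper name and log format below are
illustrative assumptions, not plugin-dev's actual output):

```shell
#!/bin/sh
# Sketch of a PostToolUse audit logger (hypothetical; plugin-dev
# generates the real script and wires it to the hook payload).
LOG_FILE="hooks/audit.log"

log_tool_call() {
  # Append one timestamped line per tool call: "<ts> <tool> <summary>"
  tool=$1
  summary=$2
  mkdir -p "$(dirname "$LOG_FILE")"
  printf '%s %s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
    "$tool" "$summary" >> "$LOG_FILE"
}
```

The installed hook would pull the tool name and summary out of the
payload Claude Code passes on stdin before calling something like this.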

The plugin creates the shell scripts and registers them in
`.claude/settings.json` so they run on every tool call.

### Verify the hooks work

```
Try running: [a command your hook should block]
Then check hooks/audit.log to confirm it was blocked and logged.
```

**Pattern from Example 09:** Hooks run on every tool call, including
inside agent invocations and automated pipeline runs. What you
configure here protects every automated run from Phase 6 onward.

---
## Phase 6: Connect and automate (10 minutes)

### Set up your phone channel

```
claude --channels
```

Follow the setup for your chosen channel (Telegram recommended
for first setup). See `messaging/` in this repo for detailed guides.

Test: send "hello" from your phone. If Claude responds, you are connected.

### Schedule your pipeline

Tell Claude:

```
Create a cron job that runs /weekly-[your-workflow] every
[day] at [time]. Show me the cron entry before creating it.
```
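The entry Claude shows you might look roughly like this (a hypothetical
example for Mondays at 07:00; the project path, skill name, and the use
of the CLI's non-interactive `-p` mode are assumptions about your
setup):

```
# m  h  dom mon dow  command
0 7 * * 1  cd ~/my-agent-project && claude -p "/weekly-competitive-intel" >> pipeline-output/cron.log 2>&1
```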

Or for remote scheduling:

```
/schedule "Run /weekly-[your-workflow] and send me the executive
summary via Telegram" at [date]T[time]:00
```

### Test the full flow

The real test. From your phone, send:

```
Run /weekly-[your-workflow]
```

Wait for the result. When the pipeline output arrives on your
phone, your system is working end-to-end: phone trigger, agent
orchestration, quality review, file output, delivery.

**Pattern from Example 08:** /loop for testing, CronCreate for
production, /schedule for remote triggers.

---
## Phase 7: Run it on real work (15 minutes)

This is the moment that matters. Everything up to here was setup.
Now you test it on an actual task from your actual week.

### Pick a real task

Not a test. Not a hypothetical. Something you need to do this week
that your pipeline is designed to handle. A real report, a real
briefing, a real analysis.

### Run it

```
/weekly-[your-workflow]
```

### Evaluate honestly

Read the output. Compare it to what you would have produced manually.

**The scoring rubric:**

| Score | Meaning | What to do |
|-------|---------|------------|
| "I would send this as-is" | Your system works. Ship it. | Minor tweaks over time |
| "Close, needs 10 min of editing" | Your system saves 80% of the work. | Refine agent instructions for the gaps |
| "The structure is right but content is off" | Agents need better domain context | Add more detail to CLAUDE.md and agent files |
| "This is not usable" | Pipeline design mismatch | Go back to Phase 1 and re-map the workflow |

Most people land on "close, needs 10 minutes of editing" on the
first real run. By the third run, the agents have been refined
enough that the output is send-ready.

---
## What you built

| Component | What it does | How to change it |
|-----------|-------------|-----------------|
| CLAUDE.md | Operating manual for your domain | Update weekly with priorities |
| 2-3 agents | Specialists for your recurring tasks | Refine when output misses the mark |
| Pipeline skill | Chains agents into a complete workflow | Add steps, swap agents, change output |
| Custom hooks | Protect automated runs | Add patterns as you automate more |
| Scheduling | Runs the pipeline on autopilot | Adjust frequency based on need |
| Phone channel | Access from anywhere | Add more channels over time |
| Memory | Tracks what was done and when | Grows automatically with each run |
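Everything in that table lives as plain files. A typical project
layout after this build (directory names follow this guide's examples;
yours may differ) looks like:

```
your-project/
├── CLAUDE.md              # operating manual (Phase 2)
├── .claude/
│   ├── agents/            # your specialist agents (Phase 3)
│   ├── skills/            # your pipeline skill (Phase 4)
│   └── settings.json      # hook registration (Phase 5)
├── hooks/
│   └── audit.log          # audit trail of tool calls
├── memory/                # state between runs
└── pipeline-output/       # dated pipeline results
```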

This is not a configured tool. This is a system.

A GETTING-STARTED reader has a Claude Code setup with preferences.
You have a multi-agent pipeline that handles a complete recurring
workflow, runs on a schedule, reviews its own output for quality,
logs every action for safety, and delivers results to your phone.

---
## Your first week

**Day 1:** Run the pipeline manually. Fix what is wrong with the
output. Refine agent instructions.

**Day 2:** Run it again. Compare to yesterday. Is it better?
Add domain knowledge to CLAUDE.md that would have prevented
yesterday's mistakes.

**Day 3:** Let the cron job run it. Check the output when it arrives
on your phone. The moment it produces something useful without you
touching anything: that is the moment it clicks.

**Days 4-5:** Add a second pipeline skill for another recurring task.
You already have agents. Creating a new pipeline that reuses them
takes 10 minutes.

**Days 6-7:** Review your hooks and audit log. Are the right things
being blocked? Is the log useful? Tighten security now that you
trust the system to run autonomously.

**After one week:** You have 1-2 automated pipelines, 2-3 specialized
agents, and a growing memory of your work context. Routine Monday
tasks take one phone message instead of two hours. You will know
whether this changed how you work.

---
## Growing from here

**More agents.** As you notice new recurring patterns, run
`/agent-development` to create agents for them. Each agent you add
is a building block that any pipeline skill can use. Run
`/skill-reviewer` on existing agents periodically to catch quality
drift.

**More pipelines.** A pipeline is just a skill that calls agents in
sequence. Run `/skill-development` to create new pipelines. Once you
have 3-4 agents, creating a new pipeline for a different workflow
takes minutes.

**More integrations.** MCP servers connect your agents to external
services: Slack, Google Drive, databases, APIs. Run
`/mcp-integration` to add new servers with correct configuration.
Each integration multiplies what your pipelines can do without
manual data gathering.

**Tighter security.** As automation increases, so should guardrails.
Run `/hook-development` to add domain-specific hook patterns. Review
audit logs. Set up alerts for blocked actions that might indicate a
prompt going wrong.

**Validate your setup.** Run `/plugin-validator` periodically to
check that all your files have correct structure, frontmatter, and
cross-references. It catches problems before they cause silent
failures in automated runs.

**Shared agents.** The agents and pipelines you build are markdown
files. You can share them with colleagues, version control them, or
use them as templates for other projects. The system is portable.

---
## Honest assessment

This system will not replace your expertise, your relationships, or
your judgment. It replaces the scaffolding: gathering, formatting,
cross-checking, delivering, and remembering. The work that has to
happen before and after the real thinking.

The people who get the most from it share three traits:

1. They are specific about their domain in CLAUDE.md
2. They iterate on agent instructions instead of accepting first-draft output
3. They actually use it for real work, not just demos

The time investment is real: 90 minutes to build, 15 minutes a week
to maintain and improve. The return scales with how well you defined
your workflow in Phase 1. A system designed around vague goals
produces vague results. A system designed around "every Monday I need
X from sources Y in format Z" produces something you actually use.

If you followed the examples and built this system, you understand
more about personal AI agent architecture than most people in the
industry. Not because the technology is hard. Because the thinking
in Phase 1, designing your work as a system of agents, is the part
most people skip.