claude-code-complete-agent/examples/14-build-your-agent/prompt.md
Kjell Tore Guttormsen 866e8a5f3b fix: reference plugin-dev throughout Example 14 and GETTING-STARTED
Skills, agents, and hooks should be created via plugin-dev plugin,
not generic "tell Claude" or manual file writing.

Example 14: /agent-development for agents, /skill-development for
pipeline skills, /hook-development for hooks, /skill-reviewer for
quality checks, /plugin-validator for setup validation. plugin-dev
added as prerequisite.

GETTING-STARTED.md: /skill-development for skill creation,
/skill-reviewer for iteration, /mcp-integration for MCP servers.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 21:43:15 +01:00

# Example 14: Build Your Personal Agent System
Everything before this was a demo. This is where you build
something real.
Examples 01-13 showed you individual capabilities: research, file
management, web search, agents, pipelines, messaging, automation,
security. Individually, each one is useful. Together, they form
something fundamentally different: a personal agent system that
handles your recurring work, runs on schedule, and delivers results
to your phone.
This is not "create a skill and connect your phone." You could have
done that after reading GETTING-STARTED.md. This is: design a system
of agents that understands your domain, automate a complete workflow,
add safety guardrails, and test it on real work from your actual week.
**Time needed:** 90 minutes for the full build. You will iterate on
it over the first week. Most people say it clicks around day 3.
**What you need:**
- A project directory (new or existing) where you will set up your
personal agent system. Everything you build here is portable.
- The **plugin-dev** plugin installed in Claude Code. This is the
tool that creates skills, agents, and hooks with correct structure
and best practices. If you do not have it:
```
/install plugin-dev
```
---
## Phase 1: Map your work (10 minutes, no computer needed)
Before you build anything, think. The difference between a useful
agent system and a novelty is whether you designed it around how
you actually work.
### The three questions
**1. What do you do every week that follows a pattern?**
Not creative work. Not relationship work. The structured, repeatable
tasks that eat your time because they require gathering, processing,
formatting, and delivering information.
Examples:
- A weekly status report that pulls from multiple sources
- Competitor monitoring that requires searching, comparing, summarizing
- Meeting preparation that needs research, context, and a briefing doc
- Client updates that combine project data with a professional narrative
Write down 3-5 tasks. Be specific about inputs, processing, and outputs.
**2. What roles would you hire for if you could?**
Not "an assistant." Specific roles with specific expertise:
- A **researcher** who tracks your industry and surfaces what matters
- A **writer** who turns your notes into polished documents
- A **reviewer** who catches errors before anything goes out
- A **briefing officer** who summarizes what you need to know each morning
- An **analyst** who processes data and highlights patterns
Write down 2-3 roles. For each: what do they know, what do they produce,
what quality standard must they meet?
**3. What must never happen automatically?**
The safety question. When your system runs at 07:00 on Monday without
you watching:
- What actions should be blocked? (sending emails, deleting files, accessing certain APIs)
- What should always be logged? (all external calls, all file writes, all web searches)
- What requires your approval before executing? (anything touching production, anything external)
Write these down. They become your hooks in Phase 5.
### Design sketch
Before touching Claude Code, sketch your system on paper or in your head:
```
[Your recurring task]
|
v
[Agent 1: gather information] --> raw data
|
v
[Agent 2: process and draft] --> draft output
|
v
[Agent 3: review and verify] --> verified output
|
v
[Save to files] + [Send to phone] + [Log to memory]
```
This is the same architecture as Example 10, adapted to your work.
You are not inventing something new. You are configuring a proven
pattern for your specific context.
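The phases that follow turn this sketch into a small set of files. An illustrative layout of the finished project (directory names drawn from later phases; your exact structure may differ):

```
your-project/
├── CLAUDE.md              # operating manual (Phase 2)
├── .claude/
│   ├── agents/            # your specialist agents (Phase 3)
│   ├── skills/            # pipeline skills (Phase 4)
│   └── settings.json      # hook registration (Phase 5)
├── hooks/                 # hook scripts and audit.log (Phase 5)
├── memory/                # state files the pipeline updates
└── pipeline-output/       # dated results from each run
```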
---
## Phase 2: Build your operating manual (15 minutes)
Open your project directory in Claude Code. Your first step is not
writing a CLAUDE.md yourself. It is asking Claude to interview you
and write one.
### The prompt
```
Interview me to create a CLAUDE.md for this project. Ask me questions
about:
1. Who I am and what I do
2. How I like to communicate (format, tone, length)
3. What I am working on right now (top 3-5 priorities)
4. What tools and services I use
5. What you should never do without my permission
Ask one category at a time. After all five, write a CLAUDE.md that
would let you handle my Monday morning without additional context.
```
Claude will ask you questions, listen to your answers, and produce
a CLAUDE.md tailored to what you said. This is better than writing
it yourself because Claude knows what information it needs to be
effective.
### What makes a great CLAUDE.md
After Claude writes it, check:
| Criterion | Test |
|-----------|------|
| Specific enough | Could someone handle your Monday with only this file? |
| Priorities current | Does it reflect this week's work, not last month's? |
| Boundaries clear | Are the "never do" rules unambiguous? |
| Tools documented | Does it list what MCP servers and channels are available? |
| Format preferences stated | Will Claude produce output in the shape you actually want? |
If any answer is "no," tell Claude what to fix. Iterate until it
passes all five. This is the foundation of everything that follows.
**Pattern from Example 05:** CLAUDE.md loads at every session start.
Every agent you create in Phase 3 inherits this context automatically.
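As a reference point, a CLAUDE.md that passes all five checks might be shaped like this. All content is illustrative; yours comes from the interview:

```markdown
# CLAUDE.md

## Who I am
Marketing manager at a B2B SaaS company. I own competitive
intelligence and the weekly exec update.

## Communication preferences
- Bullet points over paragraphs; max one page
- Direct tone, no filler; cite sources with URLs

## Current priorities
1. Q3 pricing-page relaunch
2. Tracking competitors: Acme, Globex, Initech
3. Monthly board deck (due first Friday)

## Tools available
- Telegram channel for phone delivery
- Web search for research

## Never do without permission
- Send anything external (email, Slack, posts)
- Delete or overwrite files outside pipeline-output/
```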
---
## Phase 3: Create your agent team (20 minutes)
Now build the 2-3 agents you identified in Phase 1. You use the
**plugin-dev** plugin to create them. It knows the correct file
format, frontmatter fields, system prompt structure, and triggering
conditions. You describe what you need; it builds the agent properly.
### Create your first agent
Pick the role you need most. Run:
```
/agent-development
```
The plugin-dev skill will guide you through creating the agent:
what it does, what tools it needs, when it should trigger, and
what its system prompt should contain. Describe your agent clearly:
```
I need an agent called "[role-name]" that [what it does].
It should:
- [Specific capability 1]
- [Specific capability 2]
- [Quality standard it must meet]
It should NOT:
- [Boundary 1]
- [Boundary 2]
```
The plugin creates the file in `.claude/agents/` with proper
frontmatter (name, description, model hint, tools) and a focused
system prompt.
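For orientation, the generated file might look roughly like this. Treat the exact frontmatter keys as an assumption and let plugin-dev produce the canonical form:

```markdown
---
name: market-scanner
description: Monitors competitor activity and produces sourced
  reports. Use for competitive intelligence and market updates.
tools: WebSearch, WebFetch, Read, Write
model: sonnet
---

You are a competitive intelligence researcher. Search for recent
news about each competitor listed in CLAUDE.md. Extract pricing
changes, feature launches, and partnership announcements. Cite a
source URL for every claim. Do not include rumors or unverified
information.
```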
**Concrete examples for three different people:**
A marketing manager might say:
```
/agent-development
I need an agent called "market-scanner" that monitors competitor
activity. It should search the web for news about [competitor names],
extract pricing changes, new feature launches, and partnership
announcements. It should produce a structured report with source URLs
for every claim. It should not include rumors or unverified information.
```
An engineering lead might say:
```
/agent-development
I need an agent called "tech-researcher" that evaluates technologies
for our stack. It should search documentation and GitHub repos, compare
features, check community health (stars, issues, release frequency),
and produce a recommendation with pros, cons, and risks. It should not
recommend anything without checking the last 3 months of GitHub issues
for dealbreakers.
```
A consultant might say:
```
/agent-development
I need an agent called "client-briefer" that prepares meeting briefs.
It should research the client's company (recent news, financials,
leadership changes), review my notes in memory/, and produce a
one-page briefing with talking points and potential questions they
will ask. It should not include anything I cannot verify in the
meeting.
```
### Create your second agent
Every good system needs a quality gate. Run `/agent-development`
again for a reviewer:
```
I need an agent called "[domain]-reviewer" that reviews output from
my other agents. It should check for:
- [Your specific accuracy requirements]
- [Your formatting standards]
- [Common mistakes in your domain]
When it finds issues, it should list them with specific fix
instructions, not vague feedback.
```
### Create your third agent (optional but powerful)
A writer/formatter that takes raw output and shapes it for your
audience. Run `/agent-development`:
```
I need an agent called "[output]-writer" that transforms research
and analysis into [format your audience expects]. It should follow
[your organization's style/tone]. It should be concise: [your word
limit].
```
### Test each agent individually
Before combining them, verify each one works alone:
```
Use the [agent-name] agent to [a specific task from your real work].
Show me the full output.
```
Does the output meet your standards? If not, tell Claude what to fix
in the agent file. Iterate until each agent produces work you would
use. This is worth the time. A weak agent in a pipeline produces
weak pipeline output.
**Pattern from Example 06:** The researcher-writer-reviewer pattern
works for almost any domain. Your agents are the same pattern with
your domain expertise built in.
---
## Phase 4: Build your pipeline (20 minutes)
Now the payoff. You combine your agents into a pipeline skill that
handles a complete workflow.
### Create the pipeline skill
Run `/skill-development` to create the pipeline skill. This is more
complex than a simple task skill because it orchestrates multiple
agents, so the plugin-dev guided workflow helps you get the
frontmatter, description, and triggering conditions right.
Describe your pipeline:
```
I need a skill called "weekly-[your-workflow]" that runs my complete
[task name] workflow.
Steps:
1. Read CLAUDE.md for current priorities and context
2. Read memory/[your-state-file].md for what happened last time
3. Use the [agent-1] agent to [gather/research/analyze]
4. Use the [agent-2] agent to [draft/format/write]
5. Use the [agent-3] agent to [review/verify/check]
6. If the reviewer finds issues, send back to [agent-2] for fixes
7. Save the final output to pipeline-output/[your-output]-[date].md
8. Update memory/[your-state-file].md with what was done and when
9. Show me the first 10 lines of the output to confirm
```
After the skill is created, run `/skill-reviewer` to check its
quality. The reviewer catches common issues: vague descriptions,
missing triggering conditions, incomplete step definitions.
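A skill is itself a markdown file, commonly `.claude/skills/<name>/SKILL.md`, whose frontmatter tells Claude when to use it. A rough sketch of the shape (exact fields are plugin-dev's call):

```markdown
---
name: weekly-competitive-intel
description: Runs the complete Monday competitive intelligence
  workflow, from research through reviewed briefing. Use when the
  user invokes /weekly-competitive-intel or asks for the weekly
  competitor report.
---

(The numbered workflow steps from your description go here,
in order, with explicit file paths and agent names.)
```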
**Concrete example for a marketing manager:**
```
/skill-development
I need a skill called "weekly-competitive-intel" that runs my
Monday morning competitive intelligence workflow.
Steps:
1. Read CLAUDE.md for the list of competitors I track
2. Read memory/competitive-intel-state.md for what was covered last week
3. Use the market-scanner agent to research each competitor's
activity this week (news, product changes, pricing, partnerships)
4. Use the report-writer agent to draft a structured briefing:
- Executive summary (3 bullets)
- Per-competitor section with changes and source URLs
- "So what" section: what this means for our strategy
5. Use the quality-reviewer agent to verify all claims have sources
and no speculation is presented as fact
6. If the reviewer finds issues, send back to report-writer for fixes
7. Save to pipeline-output/competitive-intel-[date].md
8. Update memory/competitive-intel-state.md with today's date
and what was found
9. Show me the executive summary
```
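Steps 2 and 8 revolve around a small state file. Its shape is up to you; a minimal illustrative `memory/competitive-intel-state.md` (contents are placeholders):

```markdown
# Competitive intel state

Last run: 2026-03-23
Covered: Acme pricing change, Globex Series B, Initech outage

## Open threads
- Acme hinted at a usage-based tier; watch their pricing page
```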
### Test the pipeline
Run it on a real task:
```
/weekly-[your-workflow]
```
Watch the agents hand off to each other. This is the same orchestration
you saw in Example 10, but configured for your work. The pipeline
should take 2-5 minutes depending on how much research is involved.
### Evaluate the output
Read the output as if your manager, client, or colleague sent it to you.
| Question | If the answer is no |
|----------|---------------------|
| Is it accurate? | Tighten the reviewer agent's instructions |
| Is it the right format? | Adjust the writer agent's output template |
| Is it the right depth? | Change the researcher agent's scope |
| Would you send this as-is? | Identify what's missing and update the pipeline |
Iterate. Run the pipeline again after each adjustment. Most people
need 2-3 rounds to get output they are genuinely satisfied with.
**Pattern from Example 10:** The pipeline is a recipe with
interchangeable ingredients. Swap agents, change topics, adjust the
output format. The orchestration stays the same.
---
## Phase 5: Add your safety layer (10 minutes)
Your pipeline will soon run automatically. Before that happens,
set up the guardrails you identified in Phase 1.
### Create your custom hooks
Run `/hook-development` for each hook. The plugin-dev guided workflow
handles the hook event type (PreToolUse, PostToolUse), the matching
logic, registration in settings.json, and testing.
**For blocking dangerous commands:**
```
/hook-development
I need a PreToolUse hook that blocks Bash commands containing:
- [pattern 1: e.g., "rm -rf", "DROP TABLE"]
- [pattern 2: e.g., commands targeting my production directories]
- [pattern 3: e.g., curl commands with API tokens in arguments]
```
**For audit logging:**
```
/hook-development
I need a PostToolUse hook that logs every tool call to
hooks/audit.log with timestamp, tool name, and a summary of
what was done.
```
The plugin creates the shell scripts and registers them in
`.claude/settings.json` so they run on every tool call.
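Under the hood, a blocking hook is a script that inspects the pending tool call and signals a verdict via its exit code (in Claude Code, exit code 2 blocks the call and feeds stderr back to Claude). A minimal sketch of the matching logic, with the patterns and function name as placeholders; extracting the command string from the hook's stdin JSON is omitted:

```shell
#!/bin/sh
# Sketch of the pattern-matching core of a PreToolUse blocking hook.
# check_command takes the command string from the tool call and
# returns 2 when a dangerous pattern matches, mirroring Claude
# Code's "exit 2 = block" convention.
check_command() {
  cmd="$1"
  for pattern in 'rm -rf' 'DROP TABLE'; do
    case "$cmd" in
      *"$pattern"*)
        echo "Blocked: command matches forbidden pattern '$pattern'" >&2
        return 2
        ;;
    esac
  done
  return 0
}

check_command 'ls -la docs/' && echo "allowed"   # prints "allowed"
rc=0
check_command 'rm -rf build/' 2>/dev/null || rc=$?
echo "verdict code: $rc"                          # prints "verdict code: 2"
```

The plugin-dev workflow generates the full version, including the stdin parsing and the settings.json registration.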
### Verify the hooks work
```
Try running: [a command your hook should block]
Then check hooks/audit.log to confirm it was blocked and logged.
```
**Pattern from Example 09:** Hooks run on every tool call, including
inside agent invocations and automated pipeline runs. What you
configure here protects every automated run from Phase 6 onward.
---
## Phase 6: Connect and automate (10 minutes)
### Set up your phone channel
```
claude --channels
```
Follow the setup for your chosen channel (Telegram recommended
for first setup). See `messaging/` in this repo for detailed guides.
Test: send "hello" from your phone. If Claude responds, you are connected.
### Schedule your pipeline
Tell Claude:
```
Create a cron job that runs /weekly-[your-workflow] every
[day] at [time]. Show me the cron entry before creating it.
```
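The entry Claude shows you will look something like this, assuming the `claude` CLI's `-p` (headless print) mode; the path and skill name are placeholders:

```
0 7 * * 1 cd /path/to/your-project && claude -p "/weekly-your-workflow" >> cron.log 2>&1
```

The five cron fields here mean 07:00 every Monday (`1` is Monday in the day-of-week field).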
Or for remote scheduling:
```
/schedule "Run /weekly-[your-workflow] and send me the executive
summary via Telegram" at [date]T[time]:00
```
### Test the full flow
The real test. From your phone, send:
```
Run /weekly-[your-workflow]
```
Wait for the result. When the pipeline output arrives on your
phone, your system is working end-to-end: phone trigger, agent
orchestration, quality review, file output, delivery.
**Pattern from Example 08:** /loop for testing, CronCreate for
production, /schedule for remote triggers.
---
## Phase 7: Run it on real work (15 minutes)
This is the moment that matters. Everything up to here was setup.
Now you test it on an actual task from your actual week.
### Pick a real task
Not a test. Not a hypothetical. Something you need to do this week
that your pipeline is designed to handle. A real report, a real
briefing, a real analysis.
### Run it
```
/weekly-[your-workflow]
```
### Evaluate honestly
Read the output. Compare it to what you would have produced manually.
**The scoring rubric:**
| Score | Meaning | What to do |
|-------|---------|------------|
| "I would send this as-is" | Your system works. Ship it. | Minor tweaks over time |
| "Close, needs 10 min of editing" | Your system saves 80% of the work. | Refine agent instructions for the gaps |
| "The structure is right but content is off" | Agents need better domain context | Add more detail to CLAUDE.md and agent files |
| "This is not usable" | Pipeline design mismatch | Go back to Phase 1 and re-map the workflow |
Most people land on "close, needs 10 minutes of editing" on the
first real run. By the third run, the agents have been refined
enough that the output is send-ready.
---
## What you built
| Component | What it does | How to change it |
|-----------|-------------|-----------------|
| CLAUDE.md | Operating manual for your domain | Update weekly with priorities |
| 2-3 agents | Specialists for your recurring tasks | Refine when output misses the mark |
| Pipeline skill | Chains agents into a complete workflow | Add steps, swap agents, change output |
| Custom hooks | Protects automated runs | Add patterns as you automate more |
| Scheduling | Runs the pipeline on autopilot | Adjust frequency based on need |
| Phone channel | Access from anywhere | Add more channels over time |
| Memory | Tracks what was done and when | Grows automatically with each run |
This is not a configured tool. This is a system.
A GETTING-STARTED reader has a Claude Code setup with preferences.
You have a multi-agent pipeline that handles a complete recurring
workflow, runs on schedule, reviews its own output for quality,
logs every action for safety, and delivers results to your phone.
---
## Your first week
**Day 1:** Run the pipeline manually. Fix what is wrong with the
output. Refine agent instructions.
**Day 2:** Run it again. Compare to yesterday. Is it better?
Add domain knowledge to CLAUDE.md that would have prevented
yesterday's mistakes.
**Day 3:** Let the cron job run it. Check the output when it arrives
on your phone. The moment it produces something useful without you
touching anything: that is the moment it clicks.
**Day 4-5:** Add a second pipeline skill for another recurring task.
You already have agents. Creating a new pipeline that reuses them
takes 10 minutes.
**Day 6-7:** Review your hooks and audit log. Are the right things
being blocked? Is the log useful? Tighten security now that you
trust the system to run autonomously.
**After one week:** You have 1-2 automated pipelines, 2-3 specialized
agents, and a growing memory of your work context. Routine Monday
tasks take one phone message instead of two hours. You will know
whether this changed how you work.
---
## Growing from here
**More agents.** As you notice new recurring patterns, run
`/agent-development` to create agents for them. Each agent you add
is a building block that any pipeline skill can use. Run
`/skill-reviewer` on existing agents periodically to catch quality
drift.
**More pipelines.** A pipeline is just a skill that calls agents in
sequence. Run `/skill-development` to create new pipelines. Once you
have 3-4 agents, creating a new pipeline for a different workflow
takes minutes.
**More integrations.** MCP servers connect your agents to external
services: Slack, Google Drive, databases, APIs. Run
`/mcp-integration` to add new servers with correct configuration.
Each integration multiplies what your pipelines can do without
manual data gathering.
**Tighter security.** As automation increases, so should guardrails.
Run `/hook-development` to add domain-specific hook patterns. Review
audit logs. Set up alerts for blocked actions that might indicate a
prompt going wrong.
**Validate your setup.** Run `/plugin-validator` periodically to
check that all your files have correct structure, frontmatter, and
cross-references. Catches problems before they cause silent failures
in automated runs.
**Shared agents.** The agents and pipelines you build are markdown
files. You can share them with colleagues, version control them, or
use them as templates for other projects. The system is portable.
---
## Honest assessment
This system will not replace your expertise, your relationships, or
your judgment. It replaces the scaffolding: gathering, formatting,
cross-checking, delivering, and remembering. The work that has to
happen before and after the real thinking.
The people who get the most from it share three traits:
1. They are specific about their domain in CLAUDE.md
2. They iterate on agent instructions instead of accepting first-draft output
3. They actually use it for real work, not just demos
The time investment is real: 90 minutes to build, 15 minutes a week
to maintain and improve. The return scales with how well you defined
your workflow in Phase 1. A system designed around vague goals
produces vague results. A system designed around "every Monday I need
X from sources Y in format Z" produces something you actually use.
If you followed the examples and built this system, you understand
more about personal AI agent architecture than most people in the
industry. Not because the technology is hard. Because the thinking
in Phase 1, designing your work as a system of agents, is the part
most people skip.