# Example 14: Build Your Personal Agent System
Everything before this was a demo. This is where you build something real.
Examples 01-13 showed you individual capabilities: research, file management, web search, agents, pipelines, messaging, automation, security. Individually, each one is useful. Together, they form something fundamentally different: a personal agent system that handles your recurring work, runs on schedule, and delivers results to your phone.
This is not "create a skill and connect your phone." You could have done that after reading GETTING-STARTED.md. This is: design a system of agents that understands your domain, automate a complete workflow, add safety guardrails, and test it on real work from your actual week.
**Time needed:** 90 minutes for the full build. You will iterate on it over the first week. Most people say it clicks around day 3.
**What you need:** A project directory (new or existing) where you will set up your personal agent system. Everything you build here is portable. Move it to any project later.
## Phase 1: Map your work (10 minutes, no computer needed)
Before you build anything, think. The difference between a useful agent system and a novelty is whether you designed it around how you actually work.
### The three questions
**1. What do you do every week that follows a pattern?**
Not creative work. Not relationship work. The structured, repeatable tasks that eat your time because they require gathering, processing, formatting, and delivering information.
Examples:
- A weekly status report that pulls from multiple sources
- Competitor monitoring that requires searching, comparing, summarizing
- Meeting preparation that needs research, context, and a briefing doc
- Client updates that combine project data with a professional narrative
Write down 3-5 tasks. Be specific about inputs, processing, and outputs.
**2. What roles would you hire for if you could?**
Not "an assistant." Specific roles with specific expertise:
- A researcher who tracks your industry and surfaces what matters
- A writer who turns your notes into polished documents
- A reviewer who catches errors before anything goes out
- A briefing officer who summarizes what you need to know each morning
- An analyst who processes data and highlights patterns
Write down 2-3 roles. For each: what do they know, what do they produce, what quality standard must they meet?
**3. What must never happen automatically?**
The safety question. When your system runs at 07:00 on Monday without you watching:
- What actions should be blocked? (sending emails, deleting files, accessing certain APIs)
- What should always be logged? (all external calls, all file writes, all web searches)
- What requires your approval before executing? (anything touching production, anything external)
Write these down. They become your hooks in Phase 5.
### Design sketch
Before touching Claude Code, sketch your system on paper or in your head:
```
[Your recurring task]
        |
        v
[Agent 1: gather information] --> raw data
        |
        v
[Agent 2: process and draft] --> draft output
        |
        v
[Agent 3: review and verify] --> verified output
        |
        v
[Save to files] + [Send to phone] + [Log to memory]
```
This is the same architecture as Example 10, adapted to your work. You are not inventing something new. You are configuring a proven pattern for your specific context.
## Phase 2: Build your operating manual (15 minutes)
Open your project directory in Claude Code. Your first step is not writing a CLAUDE.md yourself. It is asking Claude to interview you and write one.
### The prompt
```
Interview me to create a CLAUDE.md for this project. Ask me questions
about:
1. Who I am and what I do
2. How I like to communicate (format, tone, length)
3. What I am working on right now (top 3-5 priorities)
4. What tools and services I use
5. What you should never do without my permission
Ask one category at a time. After all five, write a CLAUDE.md that
would let you handle my Monday morning without additional context.
```
Claude will ask you questions, listen to your answers, and produce a CLAUDE.md tailored to what you said. This is better than writing it yourself because Claude knows what information it needs to be effective.
### What makes a great CLAUDE.md
After Claude writes it, check:
| Criterion | Test |
|---|---|
| Specific enough | Could someone handle your Monday with only this file? |
| Priorities current | Does it reflect this week's work, not last month's? |
| Boundaries clear | Are the "never do" rules unambiguous? |
| Tools documented | Does it list what MCP servers and channels are available? |
| Format preferences stated | Will Claude produce output in the shape you actually want? |
If any answer is "no," tell Claude what to fix. Iterate until it passes all five. This is the foundation of everything that follows.
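For orientation, here is a condensed sketch of the shape such a file often takes. Every detail below is illustrative (a hypothetical marketing persona), not a required schema; your interview answers determine the real content:
```markdown
# CLAUDE.md

## Who I am
Product marketing manager at a B2B SaaS company.

## How I communicate
Short paragraphs, bullets over prose, no filler. One page max per document.

## Current priorities
1. Q1 competitive repositioning
2. Pricing page relaunch
3. Analyst briefing preparation

## Tools and channels
Web search, Telegram channel, memory/ directory for run-to-run state.

## Never without my approval
Send external messages. Delete files. Touch anything under prod/.
```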
> **Pattern from Example 05:** CLAUDE.md loads at every session start. Every agent you create in Phase 3 inherits this context automatically.
## Phase 3: Create your agent team (20 minutes)
Now build the 2-3 agents you identified in Phase 1. You do not write agent files by hand. You tell Claude what each agent should do and let it create the files.
### Create your first agent
Pick the role you need most. Tell Claude:
```
Create an agent called "[role-name]" that [what it does].
It should:
- [Specific capability 1]
- [Specific capability 2]
- [Quality standard it must meet]
It should NOT:
- [Boundary 1]
- [Boundary 2]
Save it to .claude/agents/[role-name].md
```
Concrete examples for three different people:
A marketing manager might say:
Create an agent called "market-scanner" that monitors competitor
activity. It should search the web for news about [competitor names],
extract pricing changes, new feature launches, and partnership
announcements. It should produce a structured report with source URLs
for every claim. It should not include rumors or unverified information.
Save it to .claude/agents/market-scanner.md
An engineering lead might say:
Create an agent called "tech-researcher" that evaluates technologies
for our stack. It should search documentation and GitHub repos, compare
features, check community health (stars, issues, release frequency),
and produce a recommendation with pros, cons, and risks. It should not
recommend anything without checking the last 3 months of GitHub issues
for dealbreakers. Save it to .claude/agents/tech-researcher.md
A consultant might say:
Create an agent called "client-briefer" that prepares meeting briefs.
It should research the client's company (recent news, financials,
leadership changes), review my notes in memory/, and produce a
one-page briefing with talking points and potential questions they
will ask. It should not include anything I cannot verify in the
meeting. Save it to .claude/agents/client-briefer.md
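Whichever persona you match, the file Claude writes is plain markdown: a short frontmatter header plus a system prompt as the body. A minimal sketch of what `.claude/agents/market-scanner.md` might look like (exact frontmatter fields vary by Claude Code version, so treat this as illustrative):
```markdown
---
name: market-scanner
description: Monitors competitor activity and produces sourced reports.
tools: WebSearch, WebFetch, Read, Write
---

You are a competitive intelligence researcher.

Given a list of competitors, search for this week's news: pricing changes,
feature launches, partnership announcements.

Rules:
- Every claim must carry a source URL.
- Never include rumors or unverified information.
- Output a structured report with one section per competitor.
```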
### Create your second agent
Every good system needs a quality gate. Create a reviewer:
Create an agent called "[domain]-reviewer" that reviews output from
my other agents. It should check for:
- [Your specific accuracy requirements]
- [Your formatting standards]
- [Common mistakes in your domain]
When it finds issues, it should list them with specific fix
instructions, not vague feedback. Save it to .claude/agents/[name].md
### Create your third agent (optional but powerful)
A writer/formatter that takes raw output and shapes it for your audience:
Create an agent called "[output]-writer" that transforms research
and analysis into [format your audience expects]. It should follow
[your organization's style/tone]. It should be concise: [your word
limit]. Save it to .claude/agents/[name].md
### Test each agent individually
Before combining them, verify each one works alone:
```
Use the [agent-name] agent to [a specific task from your real work].
Show me the full output.
```
Does the output meet your standards? If not, tell Claude what to fix in the agent file. Iterate until each agent produces work you would use. This is worth the time. A weak agent in a pipeline produces weak pipeline output.
> **Pattern from Example 06:** The researcher-writer-reviewer pattern works for almost any domain. Your agents are the same pattern with your domain expertise built in.
## Phase 4: Build your pipeline (20 minutes)
Now the payoff. You combine your agents into a pipeline skill that handles a complete workflow.
### Create the pipeline skill
Tell Claude to create a skill that orchestrates your agents:
Create a skill called "weekly-[your-workflow]" that runs my complete
[task name] workflow.
Steps:
1. Read CLAUDE.md for current priorities and context
2. Read memory/[your-state-file].md for what happened last time
3. Use the [agent-1] agent to [gather/research/analyze]
4. Use the [agent-2] agent to [draft/format/write]
5. Use the [agent-3] agent to [review/verify/check]
6. If the reviewer finds issues, send back to [agent-2] for fixes
7. Save the final output to pipeline-output/[your-output]-[date].md
8. Update memory/[your-state-file].md with what was done and when
9. Show me the first 10 lines of the output to confirm
Save it to .claude/skills/weekly-[your-workflow].md
Concrete example for a marketing manager:
Create a skill called "weekly-competitive-intel" that runs my
Monday morning competitive intelligence workflow.
Steps:
1. Read CLAUDE.md for the list of competitors I track
2. Read memory/competitive-intel-state.md for what was covered last week
3. Use the market-scanner agent to research each competitor's
activity this week (news, product changes, pricing, partnerships)
4. Use the report-writer agent to draft a structured briefing:
- Executive summary (3 bullets)
- Per-competitor section with changes and source URLs
- "So what" section: what this means for our strategy
5. Use the quality-reviewer agent to verify all claims have sources
and no speculation is presented as fact
6. If the reviewer finds issues, send back to report-writer for fixes
7. Save to pipeline-output/competitive-intel-[date].md
8. Update memory/competitive-intel-state.md with today's date
and what was found
9. Show me the executive summary
Save it to .claude/skills/weekly-competitive-intel.md
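Steps 2 and 8 are what give the pipeline memory between runs. The state file is nothing exotic, just a markdown note the skill reads and rewrites each time. A sketch of what `memory/competitive-intel-state.md` might hold after a run (competitor names are hypothetical):
```markdown
# Competitive intel state

Last run: 2025-01-06
Covered: Acme (new enterprise tier), Globex (Series B), Initech (no news)
Next run: follow up on Acme's rollout timeline and Globex's hiring spike
```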
### Test the pipeline
Run it on a real task:
```
/weekly-[your-workflow]
```
Watch the agents hand off to each other. This is the same orchestration you saw in Example 10, but configured for your work. The pipeline should take 2-5 minutes depending on how much research is involved.
### Evaluate the output
Read the output as if your manager, client, or colleague sent it to you.
| Question | If the answer is no |
|---|---|
| Is it accurate? | Tighten the reviewer agent's instructions |
| Is it the right format? | Adjust the writer agent's output template |
| Is it the right depth? | Change the researcher agent's scope |
| Would you send this as-is? | Identify what's missing and update the pipeline |
Iterate. Run the pipeline again after each adjustment. Most people need 2-3 rounds to get output they are genuinely satisfied with.
> **Pattern from Example 10:** The pipeline is a recipe with interchangeable ingredients. Swap agents, change topics, adjust the output format. The orchestration stays the same.
## Phase 5: Add your safety layer (10 minutes)
Your pipeline will soon run automatically. Before that happens, set up the guardrails you identified in Phase 1.
### Create your custom hooks
Tell Claude to create hooks for your specific risks:
```
Create a PreToolUse hook that blocks Bash commands containing any of:
- [pattern 1: e.g., "rm -rf", "DROP TABLE"]
- [pattern 2: e.g., commands targeting your production directories]
- [pattern 3: e.g., curl commands with API tokens in arguments]
Save to hooks/[your-hook-name].sh and register it in
.claude/settings.json
```
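Under the hood, the blocking hook Claude writes is just a small script. Here is a sketch of what it might look like, assuming the documented hook contract (the tool-call JSON arrives on stdin; exit code 2 blocks the call and shows the stderr message to Claude) and `jq` installed; the script name is illustrative, and you should verify the contract against the hooks docs for your version:
```bash
#!/usr/bin/env bash
# hooks/bash-guard.sh (sketch) -- block dangerous Bash commands.
# Assumed contract: tool-call JSON on stdin; exit 2 blocks, exit 0 allows.
input=$(cat)
cmd=$(printf '%s' "$input" | jq -r '.tool_input.command // empty')

# Replace with the patterns you wrote down in Phase 1.
for pattern in 'rm -rf' 'DROP TABLE'; do
  if printf '%s' "$cmd" | grep -qiF -- "$pattern"; then
    echo "Blocked: command matches forbidden pattern: $pattern" >&2
    exit 2
  fi
done
exit 0
```
Then ask for the companion audit logger: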
```
Create a PostToolUse hook that logs every tool call to
hooks/audit.log with timestamp, tool name, and a summary of what
was done. Save to hooks/audit-logger.sh and register it.
```
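The generated logger follows the same stdin contract, appending one line per tool call. Again a sketch under the same assumptions:
```bash
#!/usr/bin/env bash
# hooks/audit-logger.sh (sketch) -- append one line per tool call:
# UTC timestamp, tool name, first 120 chars of the tool input.
input=$(cat)
tool=$(printf '%s' "$input" | jq -r '.tool_name // "unknown"')
summary=$(printf '%s' "$input" | jq -c '.tool_input // {}' | cut -c1-120)
echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) $tool $summary" >> hooks/audit.log
exit 0
```
Registration lives in `.claude/settings.json`. The shape below matches the hooks schema as documented at the time of writing (an empty matcher applies to every tool); confirm against the current docs before relying on it:
```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [{ "type": "command", "command": "bash hooks/bash-guard.sh" }]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "",
        "hooks": [{ "type": "command", "command": "bash hooks/audit-logger.sh" }]
      }
    ]
  }
}
```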
### Verify the hooks work
```
Try running: [a command your hook should block]
Then check hooks/audit.log to confirm it was blocked and logged.
```
> **Pattern from Example 09:** Hooks run on every tool call, including inside agent invocations and automated pipeline runs. What you configure here protects every automated run from Phase 6 onward.
## Phase 6: Connect and automate (10 minutes)
### Set up your phone channel
```
claude --channels
```
Follow the setup for your chosen channel (Telegram recommended for first setup). See messaging/ in this repo for detailed guides.
Test: send "hello" from your phone. If Claude responds, you are connected.
### Schedule your pipeline
Tell Claude:
```
Create a cron job that runs /weekly-[your-workflow] every
[day] at [time]. Show me the cron entry before creating it.
```
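The entry Claude shows you will look roughly like this sketch. It assumes the `claude` CLI is on the cron user's PATH and supports headless one-shot prompts via `-p`, and it reuses the marketing example's skill name; adjust paths and flags to your setup:
```bash
# Every Monday at 07:00: run the pipeline headlessly, keep a log of each run
0 7 * * 1 cd /path/to/your/project && claude -p "/weekly-competitive-intel" >> cron.log 2>&1
```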
Or for remote scheduling:
/schedule "Run /weekly-[your-workflow] and send me the executive
summary via Telegram" at [date]T[time]:00
### Test the full flow
The real test. From your phone, send:
```
Run /weekly-[your-workflow]
```
Wait for the result. When the pipeline output arrives on your phone, your system is working end-to-end: phone trigger, agent orchestration, quality review, file output, delivery.
> **Pattern from Example 08:** `/loop` for testing, `CronCreate` for production, `/schedule` for remote triggers.
## Phase 7: Run it on real work (15 minutes)
This is the moment that matters. Everything up to here was setup. Now you test it on an actual task from your actual week.
### Pick a real task
Not a test. Not a hypothetical. Something you need to do this week that your pipeline is designed to handle. A real report, a real briefing, a real analysis.
### Run it
```
/weekly-[your-workflow]
```
### Evaluate honestly
Read the output. Compare it to what you would have produced manually.
The scoring rubric:
| Score | Meaning | What to do |
|---|---|---|
| "I would send this as-is" | Your system works. Ship it. | Minor tweaks over time |
| "Close, needs 10 min of editing" | Your system saves 80% of the work. | Refine agent instructions for the gaps |
| "The structure is right but content is off" | Agents need better domain context | Add more detail to CLAUDE.md and agent files |
| "This is not usable" | Pipeline design mismatch | Go back to Phase 1 and re-map the workflow |
Most people land on "close, needs 10 minutes of editing" on the first real run. By the third run, the agents have been refined enough that the output is send-ready.
## What you built
| Component | What it does | How to change it |
|---|---|---|
| CLAUDE.md | Operating manual for your domain | Update weekly with priorities |
| 2-3 agents | Specialists for your recurring tasks | Refine when output misses the mark |
| Pipeline skill | Chains agents into a complete workflow | Add steps, swap agents, change output |
| Custom hooks | Protects automated runs | Add patterns as you automate more |
| Scheduling | Runs the pipeline on autopilot | Adjust frequency based on need |
| Phone channel | Access from anywhere | Add more channels over time |
| Memory | Tracks what was done and when | Grows automatically with each run |
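On disk, the whole system is a handful of plain files. Using the marketing example's names (and the hypothetical hook scripts sketched in Phase 5), the project ends up looking something like this:
```
your-project/
├── CLAUDE.md
├── .claude/
│   ├── agents/
│   │   ├── market-scanner.md
│   │   ├── report-writer.md
│   │   └── quality-reviewer.md
│   ├── skills/
│   │   └── weekly-competitive-intel.md
│   └── settings.json
├── hooks/
│   ├── bash-guard.sh
│   ├── audit-logger.sh
│   └── audit.log
├── memory/
│   └── competitive-intel-state.md
└── pipeline-output/
    └── competitive-intel-2025-01-06.md
```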
This is not a configured tool. This is a system.
A GETTING-STARTED reader has a Claude Code setup with preferences. You have a multi-agent pipeline that handles a complete recurring workflow, runs on schedule, reviews its own output for quality, logs every action for safety, and delivers results to your phone.
## Your first week
**Day 1:** Run the pipeline manually. Fix what is wrong with the output. Refine agent instructions.
**Day 2:** Run it again. Compare to yesterday. Is it better? Add domain knowledge to CLAUDE.md that would have prevented yesterday's mistakes.
**Day 3:** Let the cron job run it. Check the output when it arrives on your phone. The moment it produces something useful without you touching anything: that is the moment it clicks.
**Day 4-5:** Add a second pipeline skill for another recurring task. You already have agents. Creating a new pipeline that reuses them takes 10 minutes.
**Day 6-7:** Review your hooks and audit log. Are the right things being blocked? Is the log useful? Tighten security now that you trust the system to run autonomously.
**After one week:** You have 1-2 automated pipelines, 2-3 specialized agents, and a growing memory of your work context. Routine Monday tasks take one phone message instead of two hours. You will know whether this changed how you work.
## Growing from here
**More agents.** As you notice new recurring patterns, create agents for them. Each agent you add is a building block that any pipeline skill can use.
**More pipelines.** A pipeline is just a skill that calls agents in sequence. Once you have 3-4 agents, creating a new pipeline for a different workflow takes minutes.
**More integrations.** MCP servers connect your agents to external services: Slack, Google Drive, databases, APIs. Each integration multiplies what your pipelines can do without manual data gathering.
**Tighter security.** As automation increases, so should guardrails. Add domain-specific hook patterns. Review audit logs. Set up alerts for blocked actions that might indicate a run going wrong.
**Shared agents.** The agents and pipelines you build are markdown files. You can share them with colleagues, version control them, or use them as templates for other projects. The system is portable.
## Honest assessment
This system will not replace your expertise, your relationships, or your judgment. It replaces the scaffolding: gathering, formatting, cross-checking, delivering, and remembering. The work that has to happen before and after the real thinking.
The people who get the most from it share three traits:
- They are specific about their domain in CLAUDE.md
- They iterate on agent instructions instead of accepting first-draft output
- They actually use it for real work, not just demos
The time investment is real: 90 minutes to build, 15 minutes a week to maintain and improve. The return scales with how well you defined your workflow in Phase 1. A system designed around vague goals produces vague results. A system designed around "every Monday I need X from sources Y in format Z" produces something you actually use.
If you followed the examples and built this system, you understand more about personal AI agent architecture than most people in the industry. Not because the technology is hard. Because the thinking in Phase 1, designing your work as a system of agents, is the part most people skip.