refactor: rewrite Example 14 as genuine capstone and fix skill creation
Example 14 completely rewritten. Was: GETTING-STARTED.md repeated (one
skill + phone + cron). Now: a 7-phase system design that produces a
complete personal agent ecosystem (custom agents, multi-agent pipeline,
custom hooks, automation, phone delivery). Requires accumulated
knowledge from examples 01-13.

Includes:
- Phase 1: Map your work (design before building)
- Phase 3: Custom agent team created via Claude (not manually)
- Phase 4: Pipeline skill chaining agents into complete workflow
- Phase 5: Custom security hooks for user's context
- Phase 7: Test on real work with evaluation rubric
- Three concrete persona examples (marketing, engineering, consulting)

GETTING-STARTED.md Step 4: replaced manual file creation with "tell
Claude to create the skill" workflow. Skills, agents, and hooks should
always be created by asking Claude, not by hand.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
parent 0d0b83f98c
commit ccf2749127
2 changed files with 490 additions and 243 deletions
@@ -1,276 +1,549 @@
|
|||
# Example 14: Build Your Personal Agent
|
||||
# Example 14: Build Your Personal Agent System
|
||||
|
||||
This is not a demo. This is the example where you build something real.
|
||||
Everything before this was a demo. This is where you build
|
||||
something real.
|
||||
|
||||
Everything you explored in examples 01-13 demonstrated capabilities. This
|
||||
example puts them together into a personal agent that works for you
|
||||
specifically. Not a copy of the demo. Not a tutorial exercise. A setup
|
||||
you will actually use tomorrow morning.
|
||||
Examples 01-13 showed you individual capabilities: research, file
|
||||
management, web search, agents, pipelines, messaging, automation,
|
||||
security. Individually, each one is useful. Together, they form
|
||||
something fundamentally different: a personal agent system that
|
||||
handles your recurring work, runs on schedule, and delivers results
|
||||
to your phone.
|
||||
|
||||
**Time needed:** 45-60 minutes for the core setup. You will refine it
|
||||
over the first week of use.
|
||||
This is not "create a skill and connect your phone." You could have
|
||||
done that after reading GETTING-STARTED.md. This is: design a system
|
||||
of agents that understands your domain, automate a complete workflow,
|
||||
add safety guardrails, and test it on real work from your actual week.
|
||||
|
||||
**Time needed:** 90 minutes for the full build. You will iterate on
|
||||
it over the first week. Most people say it clicks around day 3.
|
||||
|
||||
**What you need:** A project directory (new or existing) where you
|
||||
will set up your personal agent system. Everything you build here
|
||||
is portable. Move it to any project later.
|
||||
|
||||
---
|
||||
|
||||
## What you will build
|
||||
## Phase 1: Map your work (10 minutes, no computer needed)
|
||||
|
||||
By the end of this example, you will have:
|
||||
Before you build anything, think. The difference between a useful
|
||||
agent system and a novelty is whether you designed it around how
|
||||
you actually work.
|
||||
|
||||
1. A `CLAUDE.md` written for your life and work (not the demo one)
|
||||
2. A personal skill that automates something you do every week
|
||||
3. A messaging channel connected to your phone
|
||||
4. A scheduled automation that runs without you
|
||||
5. A tested end-to-end flow: phone to agent to result to phone
|
||||
### The three questions
|
||||
|
||||
This is the setup that makes people text their agent from bed and
|
||||
find the answer waiting when they sit down with coffee.
|
||||
**1. What do you do every week that follows a pattern?**
|
||||
|
||||
---
|
||||
Not creative work. Not relationship work. The structured, repeatable
|
||||
tasks that eat your time because they require gathering, processing,
|
||||
formatting, and delivering information.
|
||||
|
||||
## Step 1: Write your CLAUDE.md (15 minutes)
|
||||
Examples:
|
||||
- A weekly status report that pulls from multiple sources
|
||||
- Competitor monitoring that requires searching, comparing, summarizing
|
||||
- Meeting preparation that needs research, context, and a briefing doc
|
||||
- Client updates that combine project data with a professional narrative
|
||||
|
||||
Close this repo's `CLAUDE.md` in your mind. Start fresh. Open a new
|
||||
project directory (or use an existing one) and create `CLAUDE.md`.
|
||||
Write down 3-5 tasks. Be specific about inputs, processing, and outputs.
|
||||
|
||||
Write it like a briefing for a brilliant new colleague on their
|
||||
first day. Include:
|
||||
**2. What roles would you hire for if you could?**
|
||||
|
||||
```markdown
|
||||
# [Your Project or Life Context]
|
||||
Not "an assistant." Specific roles with specific expertise:
|
||||
|
||||
## Who I am
|
||||
[Your role, what you do day to day, what you care about]
|
||||
- A **researcher** who tracks your industry and surfaces what matters
|
||||
- A **writer** who turns your notes into polished documents
|
||||
- A **reviewer** who catches errors before anything goes out
|
||||
- A **briefing officer** who summarizes what you need to know each morning
|
||||
- An **analyst** who processes data and highlights patterns
|
||||
|
||||
## How I work
|
||||
[Communication preferences, formats you like, what annoys you]
|
||||
[Example: "Be direct. Skip caveats. Bullet points over paragraphs."]
|
||||
Write down 2-3 roles. For each: what do they know, what do they produce,
|
||||
what quality standard must they meet?
|
||||
|
||||
## What I am working on right now
|
||||
[Your top 3-5 priorities with deadlines if they exist]
|
||||
**3. What must never happen automatically?**
|
||||
|
||||
## What Claude should never do
|
||||
[Hard boundaries. Things that would break trust.]
|
||||
[Example: "Never send anything externally without my explicit OK"]
|
||||
The safety question. When your system runs at 07:00 on Monday without
|
||||
you watching:
|
||||
|
||||
## Tools and accounts
|
||||
[What MCP servers are configured, what services you use]
|
||||
[Example: "Slack MCP connected to workspace X. Telegram channel active."]
|
||||
- What actions should be blocked? (sending emails, deleting files, accessing certain APIs)
|
||||
- What should always be logged? (all external calls, all file writes, all web searches)
|
||||
- What requires your approval before executing? (anything touching production, anything external)
|
||||
|
||||
Write these down. They become your hooks in Phase 5.
|
||||
|
||||
### Design sketch
|
||||
|
||||
Before touching Claude Code, sketch your system on paper or in your head:
|
||||
|
||||
```
|
||||
[Your recurring task]
|
||||
|
|
||||
v
|
||||
[Agent 1: gather information] --> raw data
|
||||
|
|
||||
v
|
||||
[Agent 2: process and draft] --> draft output
|
||||
|
|
||||
v
|
||||
[Agent 3: review and verify] --> verified output
|
||||
|
|
||||
v
|
||||
[Save to files] + [Send to phone] + [Log to memory]
|
||||
```
|
||||
|
||||
### How to know it is good enough
|
||||
|
||||
Read it back and ask: if a capable person read only this file, could
|
||||
they handle my Monday morning? If the answer is "mostly, yes," it is
|
||||
good enough. You will improve it every week.
|
||||
|
||||
**Pattern from Example 05:** This file is loaded at every session start.
|
||||
Everything you write here shapes every interaction.
|
||||
This is the same architecture as Example 10, adapted to your work.
|
||||
You are not inventing something new. You are configuring a proven
|
||||
pattern for your specific context.
|
||||
|
||||
---
|
||||
|
||||
## Step 2: Write your first real skill (15 minutes)
|
||||
## Phase 2: Build your operating manual (15 minutes)
|
||||
|
||||
Not the demo skill. Not a copy. A skill that solves a problem you
|
||||
have this week.
|
||||
Open your project directory in Claude Code. Your first step is not
|
||||
writing a CLAUDE.md yourself. It is asking Claude to interview you
|
||||
and write one.
|
||||
|
||||
Think about what you do repeatedly that follows a pattern:
|
||||
- A report you write every Monday
|
||||
- Research you do before every meeting
|
||||
- A summary you prepare for your manager
|
||||
- A check-in you run on a project
|
||||
### The prompt
|
||||
|
||||
Create `.claude/skills/[your-skill-name].md`:
|
||||
```
|
||||
Interview me to create a CLAUDE.md for this project. Ask me questions
|
||||
about:
|
||||
1. Who I am and what I do
|
||||
2. How I like to communicate (format, tone, length)
|
||||
3. What I am working on right now (top 3-5 priorities)
|
||||
4. What tools and services I use
|
||||
5. What you should never do without my permission
|
||||
|
||||
```markdown
|
||||
---
|
||||
name: [your-skill-name]
|
||||
description: [one sentence that explains when to use this]
|
||||
---
|
||||
|
||||
# [What This Does]
|
||||
|
||||
[Clear instructions for Claude. Be specific about:]
|
||||
|
||||
## Steps
|
||||
1. [Where to get the input: files, web, memory]
|
||||
2. [What to do with it: research, draft, analyze, compare]
|
||||
3. [How to format the output: bullets, paragraphs, table]
|
||||
4. [Where to save it: file path, message channel, both]
|
||||
|
||||
## Quality criteria
|
||||
- [What makes this output good vs. mediocre]
|
||||
- [What to never include or assume]
|
||||
|
||||
## Output format
|
||||
[Exact structure you want every time]
|
||||
Ask one category at a time. After all five, write a CLAUDE.md that
|
||||
would let you handle my Monday morning without additional context.
|
||||
```
|
||||
|
||||
### How to know the skill works
|
||||
Claude will ask you questions, listen to your answers, and produce
|
||||
a CLAUDE.md tailored to what you said. This is better than writing
|
||||
it yourself because Claude knows what information it needs to be
|
||||
effective.
|
||||
|
||||
Run it: `/[your-skill-name]`
|
||||
### What makes a great CLAUDE.md
|
||||
|
||||
Does the output match what you would have produced manually? If yes,
|
||||
you just automated a recurring task. If not, refine the instructions
|
||||
and run it again. Most skills take 2-3 iterations to get right.
|
||||
After Claude writes it, check:
|
||||
|
||||
**Pattern from Example 06:** If the skill involves research and writing,
|
||||
consider using the researcher/writer/reviewer agent pattern inside it.
|
||||
Multi-agent review catches errors that a single pass misses.
|
||||
| Criterion | Test |
|
||||
|----------|------|
|
||||
| Specific enough | Could someone handle your Monday with only this file? |
|
||||
| Priorities current | Does it reflect this week's work, not last month's? |
|
||||
| Boundaries clear | Are the "never do" rules unambiguous? |
|
||||
| Tools documented | Does it list what MCP servers and channels are available? |
|
||||
| Format preferences stated | Will Claude produce output in the shape you actually want? |
|
||||
|
||||
If any answer is "no," tell Claude what to fix. Iterate until it
|
||||
passes all five. This is the foundation of everything that follows.
|
||||
|
||||
**Pattern from Example 05:** CLAUDE.md loads at every session start.
|
||||
Every agent you create in Phase 3 inherits this context automatically.
|
||||
|
||||
---
|
||||
|
||||
## Step 3: Connect your phone (10 minutes)
|
||||
## Phase 3: Create your agent team (20 minutes)
|
||||
|
||||
Pick one channel. You can always add more later.
|
||||
Now build the 2-3 agents you identified in Phase 1. You do not write
|
||||
agent files by hand. You tell Claude what each agent should do and
|
||||
let it create the files.
|
||||
|
||||
**Telegram** (works on any phone, recommended for first setup):
|
||||
```bash
|
||||
### Create your first agent
|
||||
|
||||
Pick the role you need most. Tell Claude:
|
||||
|
||||
```
|
||||
Create an agent called "[role-name]" that [what it does].
|
||||
|
||||
It should:
|
||||
- [Specific capability 1]
|
||||
- [Specific capability 2]
|
||||
- [Quality standard it must meet]
|
||||
|
||||
It should NOT:
|
||||
- [Boundary 1]
|
||||
- [Boundary 2]
|
||||
|
||||
Save it to .claude/agents/[role-name].md
|
||||
```
|
||||
|
||||
**Concrete examples for three different people:**
|
||||
|
||||
A marketing manager might say:
|
||||
```
|
||||
Create an agent called "market-scanner" that monitors competitor
|
||||
activity. It should search the web for news about [competitor names],
|
||||
extract pricing changes, new feature launches, and partnership
|
||||
announcements. It should produce a structured report with source URLs
|
||||
for every claim. It should not include rumors or unverified information.
|
||||
Save it to .claude/agents/market-scanner.md
|
||||
```
|
||||
|
||||
An engineering lead might say:
|
||||
```
|
||||
Create an agent called "tech-researcher" that evaluates technologies
|
||||
for our stack. It should search documentation and GitHub repos, compare
|
||||
features, check community health (stars, issues, release frequency),
|
||||
and produce a recommendation with pros, cons, and risks. It should not
|
||||
recommend anything without checking the last 3 months of GitHub issues
|
||||
for dealbreakers. Save it to .claude/agents/tech-researcher.md
|
||||
```
|
||||
|
||||
A consultant might say:
|
||||
```
|
||||
Create an agent called "client-briefer" that prepares meeting briefs.
|
||||
It should research the client's company (recent news, financials,
|
||||
leadership changes), review my notes in memory/, and produce a
|
||||
one-page briefing with talking points and potential questions they
|
||||
will ask. It should not include anything I cannot verify in the
|
||||
meeting. Save it to .claude/agents/client-briefer.md
|
||||
```
|
||||
|
||||
### Create your second agent
|
||||
|
||||
Every good system needs a quality gate. Create a reviewer:
|
||||
|
||||
```
|
||||
Create an agent called "[domain]-reviewer" that reviews output from
|
||||
my other agents. It should check for:
|
||||
- [Your specific accuracy requirements]
|
||||
- [Your formatting standards]
|
||||
- [Common mistakes in your domain]
|
||||
|
||||
When it finds issues, it should list them with specific fix
|
||||
instructions, not vague feedback. Save it to .claude/agents/[name].md
|
||||
```
|
||||
|
||||
### Create your third agent (optional but powerful)
|
||||
|
||||
A writer/formatter that takes raw output and shapes it for your
|
||||
audience:
|
||||
|
||||
```
|
||||
Create an agent called "[output]-writer" that transforms research
|
||||
and analysis into [format your audience expects]. It should follow
|
||||
[your organization's style/tone]. It should be concise: [your word
|
||||
limit]. Save it to .claude/agents/[name].md
|
||||
```
|
||||
|
||||
### Test each agent individually
|
||||
|
||||
Before combining them, verify each one works alone:
|
||||
|
||||
```
|
||||
Use the [agent-name] agent to [a specific task from your real work].
|
||||
Show me the full output.
|
||||
```
|
||||
|
||||
Does the output meet your standards? If not, tell Claude what to fix
|
||||
in the agent file. Iterate until each agent produces work you would
|
||||
use. This is worth the time. A weak agent in a pipeline produces
|
||||
weak pipeline output.
|
||||
|
||||
**Pattern from Example 06:** The researcher-writer-reviewer pattern
|
||||
works for almost any domain. Your agents are the same pattern with
|
||||
your domain expertise built in.
|
||||
|
||||
---
|
||||
|
||||
## Phase 4: Build your pipeline (20 minutes)
|
||||
|
||||
Now the payoff. You combine your agents into a pipeline skill that
|
||||
handles a complete workflow.
|
||||
|
||||
### Create the pipeline skill
|
||||
|
||||
Tell Claude to create a skill that orchestrates your agents:
|
||||
|
||||
```
|
||||
Create a skill called "weekly-[your-workflow]" that runs my complete
|
||||
[task name] workflow.
|
||||
|
||||
Steps:
|
||||
1. Read CLAUDE.md for current priorities and context
|
||||
2. Read memory/[your-state-file].md for what happened last time
|
||||
3. Use the [agent-1] agent to [gather/research/analyze]
|
||||
4. Use the [agent-2] agent to [draft/format/write]
|
||||
5. Use the [agent-3] agent to [review/verify/check]
|
||||
6. If the reviewer finds issues, send back to [agent-2] for fixes
|
||||
7. Save the final output to pipeline-output/[your-output]-[date].md
|
||||
8. Update memory/[your-state-file].md with what was done and when
|
||||
9. Show me the first 10 lines of the output to confirm
|
||||
|
||||
Save it to .claude/skills/weekly-[your-workflow].md
|
||||
```
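
The control flow in steps 3-6 can be sketched in a few lines. This is only an illustration of the gather, draft, review, fix loop, not how Claude runs agents internally: the three callables are hypothetical stand-ins for your agents, and the `max_rounds` cap is an assumption added here so a picky reviewer cannot loop forever.

```python
from typing import Callable, List

# Hypothetical stand-ins: each "agent" is a plain function in this sketch.
Gather = Callable[[], str]
Draft = Callable[[str, List[str]], str]
Review = Callable[[str], List[str]]


def run_pipeline(gather: Gather, draft: Draft, review: Review,
                 max_rounds: int = 3) -> str:
    """Gather once, then draft and review until the reviewer has no issues."""
    raw = gather()
    issues: List[str] = []
    text = ""
    for _ in range(max_rounds):
        text = draft(raw, issues)  # the writer sees the reviewer's last issues
        issues = review(text)
        if not issues:
            break                  # reviewer is satisfied, stop iterating
    return text


# Toy agents, for illustration only.
def demo_gather() -> str:
    return "pricing changed"


def demo_draft(raw: str, issues: List[str]) -> str:
    text = f"Summary: {raw}"
    if "missing source" in issues:
        text += " (source: example.com)"
    return text


def demo_review(text: str) -> List[str]:
    return [] if "source:" in text else ["missing source"]
```

With the toy agents, the first draft fails review for a missing source and the second draft passes. If you want the same cap in your real pipeline, say so in the skill prompt, for example "at most three review rounds."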
|
||||
|
||||
**Concrete example for a marketing manager:**
|
||||
|
||||
```
|
||||
Create a skill called "weekly-competitive-intel" that runs my
|
||||
Monday morning competitive intelligence workflow.
|
||||
|
||||
Steps:
|
||||
1. Read CLAUDE.md for the list of competitors I track
|
||||
2. Read memory/competitive-intel-state.md for what was covered last week
|
||||
3. Use the market-scanner agent to research each competitor's
|
||||
activity this week (news, product changes, pricing, partnerships)
|
||||
4. Use the report-writer agent to draft a structured briefing:
|
||||
- Executive summary (3 bullets)
|
||||
- Per-competitor section with changes and source URLs
|
||||
- "So what" section: what this means for our strategy
|
||||
5. Use the quality-reviewer agent to verify all claims have sources
|
||||
and no speculation is presented as fact
|
||||
6. If the reviewer finds issues, send back to report-writer for fixes
|
||||
7. Save to pipeline-output/competitive-intel-[date].md
|
||||
8. Update memory/competitive-intel-state.md with today's date
|
||||
and what was found
|
||||
9. Show me the executive summary
|
||||
|
||||
Save it to .claude/skills/weekly-competitive-intel.md
|
||||
```
|
||||
|
||||
### Test the pipeline
|
||||
|
||||
Run it on a real task:
|
||||
|
||||
```
|
||||
/weekly-[your-workflow]
|
||||
```
|
||||
|
||||
Watch the agents hand off to each other. This is the same orchestration
|
||||
you saw in Example 10, but configured for your work. The pipeline
|
||||
should take 2-5 minutes depending on how much research is involved.
|
||||
|
||||
### Evaluate the output
|
||||
|
||||
Read the output as if your manager, client, or colleague sent it to you.
|
||||
|
||||
| Question | If no |
|
||||
|----------|-------|
|
||||
| Is it accurate? | Tighten the reviewer agent's instructions |
|
||||
| Is it the right format? | Adjust the writer agent's output template |
|
||||
| Is it the right depth? | Change the researcher agent's scope |
|
||||
| Would you send this as-is? | Identify what's missing and update the pipeline |
|
||||
|
||||
Iterate. Run the pipeline again after each adjustment. Most people
|
||||
need 2-3 rounds to get output they are genuinely satisfied with.
|
||||
|
||||
**Pattern from Example 10:** The pipeline is a recipe with
|
||||
interchangeable ingredients. Swap agents, change topics, adjust the
|
||||
output format. The orchestration stays the same.
|
||||
|
||||
---
|
||||
|
||||
## Phase 5: Add your safety layer (10 minutes)
|
||||
|
||||
Your pipeline will soon run automatically. Before that happens,
|
||||
set up the guardrails you identified in Phase 1.
|
||||
|
||||
### Create your custom hooks
|
||||
|
||||
Tell Claude to create hooks for your specific risks:
|
||||
|
||||
```
|
||||
Create a PreToolUse hook that blocks Bash commands containing any of:
|
||||
- [pattern 1: e.g., "rm -rf", "DROP TABLE"]
|
||||
- [pattern 2: e.g., commands targeting your production directories]
|
||||
- [pattern 3: e.g., curl commands with API tokens in arguments]
|
||||
|
||||
Save to hooks/[your-hook-name].sh and register it in
|
||||
.claude/settings.json
|
||||
```
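
For orientation, here is roughly what the blocking logic can look like, so you can read what Claude generates with a critical eye. It is a minimal sketch in Python rather than the `.sh` script the prompt asks for. It assumes the hook receives a JSON event on stdin with `tool_name` and `tool_input.command`, and that exit code 2 signals a block; check both against the current Claude Code hooks documentation. `DENY_PATTERNS` and `decide` are hypothetical names.

```python
import json
import sys
from typing import Tuple

# Hypothetical deny list; replace with the patterns you wrote down in Phase 1.
DENY_PATTERNS = ["rm -rf", "DROP TABLE"]


def decide(event: dict, patterns=DENY_PATTERNS) -> Tuple[int, str]:
    """Return (exit_code, message) for one hook event.

    Assumption: exit code 2 means "block this tool call".
    """
    if event.get("tool_name") != "Bash":
        return 0, ""  # only Bash commands are inspected
    command = event.get("tool_input", {}).get("command", "")
    lowered = command.lower()
    for pattern in patterns:
        if pattern.lower() in lowered:  # case-insensitive substring match
            return 2, f"Blocked: command contains '{pattern}'"
    return 0, ""


def main() -> int:
    """Read one event from stdin and report the decision on stderr."""
    code, message = decide(json.load(sys.stdin))
    if message:
        print(message, file=sys.stderr)
    return code

# As a hook script, the file would end with: sys.exit(main())
```

Substring matching is deliberately crude. It will block a harmless command that merely mentions a pattern, which is usually the right trade-off for a system that runs unattended.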
|
||||
|
||||
```
|
||||
Create a PostToolUse hook that logs every tool call to
|
||||
hooks/audit.log with timestamp, tool name, and a summary of what
|
||||
was done. Save to hooks/audit-logger.sh and register it.
|
||||
```
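
The audit logger is mostly a formatting decision: what goes on each line. The sketch below shows one possible layout (timestamp, tool name, truncated summary, tab-separated). It assumes the same JSON event shape as a PreToolUse hook, with `tool_name` and `tool_input` fields; the function names and the 200-character limit are illustrative choices, not part of any API.

```python
import json
from datetime import datetime, timezone

MAX_SUMMARY = 200  # hypothetical limit so one call cannot flood the log


def format_audit_line(event: dict, now: datetime) -> str:
    """Build one tab-separated log line: timestamp, tool name, input summary."""
    tool = event.get("tool_name", "unknown")
    summary = json.dumps(event.get("tool_input", {}), sort_keys=True)
    if len(summary) > MAX_SUMMARY:
        summary = summary[:MAX_SUMMARY - 3] + "..."
    return f"{now.isoformat()}\t{tool}\t{summary}"


def append_audit_line(path: str, event: dict) -> None:
    """Append one line for this event, stamped with the current UTC time."""
    line = format_audit_line(event, datetime.now(timezone.utc))
    with open(path, "a", encoding="utf-8") as log:
        log.write(line + "\n")
```

Keeping the formatting separate from the file write makes the layout easy to test and easy to change once you know what you actually look for in the log.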
|
||||
|
||||
### Verify the hooks work
|
||||
|
||||
```
|
||||
Try running: [a command your hook should block]
|
||||
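Then confirm it was blocked and check hooks/audit.log. If the blocked
attempt is missing (a PostToolUse hook only sees calls that actually
ran), add logging to the blocking hook itself.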
|
||||
```
|
||||
|
||||
**Pattern from Example 09:** Hooks run on every tool call, including
|
||||
inside agent invocations and automated pipeline runs. What you
|
||||
configure here protects every automated run from Phase 6 onward.
|
||||
|
||||
---
|
||||
|
||||
## Phase 6: Connect and automate (10 minutes)
|
||||
|
||||
### Set up your phone channel
|
||||
|
||||
```
|
||||
claude --channels
|
||||
```
|
||||
Follow the Telegram setup in `messaging/telegram-channels-setup.md`.
|
||||
|
||||
**iMessage** (if you live in Apple's ecosystem):
|
||||
```bash
|
||||
/install @anthropic-ai/claude-code-imessage
|
||||
claude --channels
|
||||
Follow the setup for your chosen channel (Telegram recommended
|
||||
for first setup). See `messaging/` in this repo for detailed guides.
|
||||
|
||||
Test: send "hello" from your phone. If Claude responds, you are connected.
|
||||
|
||||
### Schedule your pipeline
|
||||
|
||||
Tell Claude:
|
||||
|
||||
```
|
||||
Create a cron job that runs /weekly-[your-workflow] every
|
||||
[day] at [time]. Show me the cron entry before creating it.
|
||||
```
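
When Claude shows you the cron entry, it helps to know how to read it. A crontab line has five time fields (minute, hour, day of month, month, day of week) followed by the command. The sketch below builds a weekly entry from a day and a time; the wrapper script path in the usage line is a hypothetical placeholder, not a file in this repo.

```python
WEEKDAYS = {"sun": 0, "mon": 1, "tue": 2, "wed": 3, "thu": 4, "fri": 5, "sat": 6}


def cron_entry(day: str, time_hhmm: str, command: str) -> str:
    """Build a weekly crontab line: minute hour * * day-of-week command."""
    hour_text, minute_text = time_hhmm.split(":")
    hour, minute = int(hour_text), int(minute_text)
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError(f"invalid time: {time_hhmm}")
    key = day.strip().lower()[:3]
    if key not in WEEKDAYS:
        raise ValueError(f"unknown day: {day}")
    return f"{minute} {hour} * * {WEEKDAYS[key]} {command}"


# Hypothetical wrapper script that starts the pipeline.
print(cron_entry("Monday", "07:00", "/path/to/run-pipeline.sh"))
# prints "0 7 * * 1 /path/to/run-pipeline.sh"
```

The two asterisks mean "every day of the month" and "every month", so the entry fires once a week on the chosen weekday.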
|
||||
Follow `messaging/imessage-setup.md`.
|
||||
|
||||
### Test it
|
||||
Or for remote scheduling:
|
||||
|
||||
From your phone, send: "Run /[your-skill-name]"
|
||||
```
|
||||
/schedule "Run /weekly-[your-workflow] and send me the executive
|
||||
summary via Telegram" at [date]T[time]:00
|
||||
```
|
||||
|
||||
If the result comes back to your phone, your agent is connected.
|
||||
### Test the full flow
|
||||
|
||||
**Pattern from Example 07:** Channels turn a desktop tool into
|
||||
a personal assistant you can reach from anywhere.
|
||||
The real test. From your phone, send:
|
||||
|
||||
```
|
||||
Run /weekly-[your-workflow]
|
||||
```
|
||||
|
||||
Wait for the result. When the pipeline output arrives on your
|
||||
phone, your system is working end-to-end: phone trigger, agent
|
||||
orchestration, quality review, file output, delivery.
|
||||
|
||||
**Pattern from Example 08:** /loop for testing, CronCreate for
|
||||
production, /schedule for remote triggers.
|
||||
|
||||
---
|
||||
|
||||
## Step 4: Automate it (10 minutes)
|
||||
## Phase 7: Run it on real work (15 minutes)
|
||||
|
||||
Your skill works. Your phone is connected. Now make it run without you.
|
||||
This is the moment that matters. Everything up to here was setup.
|
||||
Now you test it on an actual task from your actual week.
|
||||
|
||||
**For a daily task:**
|
||||
```
|
||||
Create a cron job that runs /[your-skill-name] every [weekday] at [time].
|
||||
Use automation/daily-briefing.sh as a template. Show me the cron entry
|
||||
before creating it.
|
||||
```
|
||||
### Pick a real task
|
||||
|
||||
**For a weekly task:**
|
||||
```
|
||||
/schedule "Run /[your-skill-name] and send me the result via Telegram"
|
||||
at [next Monday]T07:00:00
|
||||
```
|
||||
Not a test. Not a hypothetical. Something you need to do this week
|
||||
that your pipeline is designed to handle. A real report, a real
|
||||
briefing, a real analysis.
|
||||
|
||||
### Test it
|
||||
|
||||
Create a `/loop` test first to verify the flow works before committing
|
||||
to a cron job:
|
||||
### Run it
|
||||
|
||||
```
|
||||
/loop interval=120
|
||||
|
||||
Run /[your-skill-name]. Send the result to [your channel].
|
||||
Then wait for the next interval.
|
||||
/weekly-[your-workflow]
|
||||
```
|
||||
|
||||
If the output arrives on your phone every 2 minutes, the automation works.
|
||||
Replace with a real schedule.
|
||||
### Evaluate honestly
|
||||
|
||||
**Pattern from Example 08:** /loop for testing, CronCreate for daily
|
||||
drivers, /schedule for remote triggers.
|
||||
Read the output. Compare it to what you would have produced manually.
|
||||
|
||||
**The scoring rubric:**
|
||||
|
||||
| Score | Meaning | What to do |
|
||||
|-------|---------|------------|
|
||||
| "I would send this as-is" | Your system works. Ship it. | Minor tweaks over time |
|
||||
| "Close, needs 10 min of editing" | Your system saves 80% of the work. | Refine agent instructions for the gaps |
|
||||
| "The structure is right but content is off" | Agents need better domain context | Add more detail to CLAUDE.md and agent files |
|
||||
| "This is not usable" | Pipeline design mismatch | Go back to Phase 1 and re-map the workflow |
|
||||
|
||||
Most people land on "close, needs 10 minutes of editing" on the
|
||||
first real run. By the third run, the agents have been refined
|
||||
enough that the output is send-ready.
|
||||
|
||||
---
|
||||
|
||||
## Step 5: Test the full flow (5 minutes)
|
||||
## What you built
|
||||
|
||||
The real test. Put your phone down. Walk away from the computer.
|
||||
| Component | What it does | How to change it |
|
||||
|-----------|-------------|-----------------|
|
||||
| CLAUDE.md | Operating manual for your domain | Update weekly with priorities |
|
||||
| 2-3 agents | Specialists for your recurring tasks | Refine when output misses the mark |
|
||||
| Pipeline skill | Chains agents into a complete workflow | Add steps, swap agents, change output |
|
||||
| Custom hooks | Protects automated runs | Add patterns as you automate more |
|
||||
| Scheduling | Runs the pipeline on autopilot | Adjust frequency based on need |
|
||||
| Phone channel | Access from anywhere | Add more channels over time |
|
||||
| Memory | Tracks what was done and when | Grows automatically with each run |
|
||||
|
||||
From your phone, send:
|
||||
This is not a configured tool. This is a system.
|
||||
|
||||
```
|
||||
Run /[your-skill-name] and tell me when it is done.
|
||||
```
|
||||
|
||||
Wait.
|
||||
|
||||
When the result arrives, you have a working personal agent. Not a demo.
|
||||
Not an exercise. A system that does real work on your behalf, triggered
|
||||
from your phone, using your context, delivering to where you need it.
|
||||
A GETTING-STARTED reader has a Claude Code setup with preferences.
|
||||
You have a multi-agent pipeline that handles a complete recurring
|
||||
workflow, runs on schedule, reviews its own output for quality,
|
||||
logs every action for safety, and delivers results to your phone.
|
||||
|
||||
---
|
||||
|
||||
## What you have now
|
||||
## Your first week
|
||||
|
||||
| Component | What it does | File |
|
||||
|-----------|-------------|------|
|
||||
| CLAUDE.md | Your context, always loaded | `CLAUDE.md` |
|
||||
| Skill | Your recurring task, automated | `.claude/skills/[name].md` |
|
||||
| Channel | Your phone connection | Telegram/iMessage/Discord |
|
||||
| Schedule | Your automation trigger | cron or /schedule |
|
||||
| Memory | Your persistent state | `memory/MEMORY.md` |
|
||||
| Hooks | Your safety guardrails | `hooks/` |
|
||||
**Day 1:** Run the pipeline manually. Fix what is wrong with the
|
||||
output. Refine agent instructions.
|
||||
|
||||
This is the same architecture, at a smaller scale, that runs
|
||||
production content pipelines. The pieces are the same. The
|
||||
difference is what you point them at.
|
||||
**Day 2:** Run it again. Compare to yesterday. Is it better?
|
||||
Add domain knowledge to CLAUDE.md that would have prevented
|
||||
yesterday's mistakes.
|
||||
|
||||
---
|
||||
**Day 3:** Let the cron job run it. Check the output when it arrives
|
||||
on your phone. The moment it produces something useful without you
|
||||
touching anything: that is the moment it clicks.
|
||||
|
||||
## What to do in your first week
|
||||
**Day 4-5:** Add a second pipeline skill for another recurring task.
|
||||
You already have agents. Creating a new pipeline that reuses them
|
||||
takes 10 minutes.
|
||||
|
||||
**Day 1-2:** Use the skill manually a few times. Notice what the output
|
||||
gets wrong or could improve. Edit the skill file.
|
||||
**Day 6-7:** Review your hooks and audit log. Are the right things
|
||||
being blocked? Is the log useful? Tighten security now that you
|
||||
trust the system to run autonomously.
|
||||
|
||||
**Day 3-4:** Add a second skill for something else you do often. Start
|
||||
texting tasks from your phone as a habit.
|
||||
|
||||
**Day 5-7:** Check your CLAUDE.md. Is it still accurate? Add what you
|
||||
learned this week. Remove what turned out to be irrelevant.
|
||||
|
||||
**After one week:** You will know whether this is a novelty or a genuine
|
||||
tool. Most people who get this far keep going. The ones who do not usually
|
||||
stopped at the CLAUDE.md and never made it personal enough to be useful.
|
||||
**After one week:** You have 1-2 automated pipelines, 2-3 specialized
|
||||
agents, and a growing memory of your work context. Routine Monday
|
||||
tasks take one phone message instead of two hours. You will know
|
||||
whether this changed how you work.
|
||||
|
||||
---
|
||||
|
||||
## Growing from here
|
||||
|
||||
Your agent gets more useful the more you invest in it:
|
||||
**More agents.** As you notice new recurring patterns, create agents
|
||||
for them. Each agent you add is a building block that any pipeline
|
||||
skill can use.
|
||||
|
||||
**Add more skills.** Every task you do more than twice a week is a
|
||||
candidate. A good personal setup has 3-5 skills after a month.
|
||||
**More pipelines.** A pipeline is just a skill that calls agents in
|
||||
sequence. Once you have 3-4 agents, creating a new pipeline for a
|
||||
different workflow takes minutes.
|
||||
|
||||
**Add more tools.** MCP servers connect Claude to your services.
|
||||
Slack, Google Drive, calendar, databases. Each one extends what
|
||||
your agent can do autonomously.
|
||||
**More integrations.** MCP servers connect your agents to external
|
||||
services: Slack, Google Drive, databases, APIs. Each integration
|
||||
multiplies what your pipelines can do without manual data gathering.
|
||||
|
||||
**Add more agents.** The `.claude/agents/` directory can hold
|
||||
specialists for your domain. A "compliance checker," a "meeting prep"
|
||||
agent, a "customer research" agent. Pattern: give each agent a role,
|
||||
a scope, and clear instructions.
|
||||
**Tighter security.** As automation increases, so should guardrails.
|
||||
Add domain-specific hook patterns. Review audit logs. Set up alerts
|
||||
for blocked actions that might indicate a prompt going wrong.
|
||||
|
||||
**Tune the security.** As you automate more, tighten the hooks. Add
|
||||
patterns to the deny list. Review the audit log weekly. The more
|
||||
autonomous your agent is, the more important the guardrails.
|
||||
**Shared agents.** The agents and pipelines you build are markdown
|
||||
files. You can share them with colleagues, version control them, or
|
||||
use them as templates for other projects. The system is portable.
|
||||
|
||||
---
|
||||
|
||||
## Honest assessment
|
||||
|
||||
This setup will not replace your judgment, your relationships, or
|
||||
your taste. It replaces the scaffolding: the research, the formatting,
|
||||
the status updates, the routine decisions, the cognitive overhead of
|
||||
remembering where you left off.
|
||||
This system will not replace your expertise, your relationships, or
|
||||
your judgment. It replaces the scaffolding: gathering, formatting,
|
||||
cross-checking, delivering, and remembering. The work that has to
|
||||
happen before and after the real thinking.
|
||||
|
||||
The people who get the most from it are the ones who are specific about
|
||||
what they need. A vague CLAUDE.md produces vague results. A precise one
|
||||
produces surprisingly useful results from day one.
|
||||
The people who get the most from it share three traits:
|
||||
1. They are specific about their domain in CLAUDE.md
|
||||
2. They iterate on agent instructions instead of accepting first-draft output
|
||||
3. They actually use it for real work, not just demos
|
||||
|
||||
The time investment is real: one hour to set up, five minutes a week
|
||||
to maintain. The return depends entirely on how well you describe your
|
||||
work and how consistently you use it.
|
||||
The time investment is real: 90 minutes to build, 15 minutes a week
|
||||
to maintain and improve. The return scales with how well you defined
|
||||
your workflow in Phase 1. A system designed around vague goals
|
||||
produces vague results. A system designed around "every Monday I need
|
||||
X from sources Y in format Z" produces something you actually use.
|
||||
|
||||
Start with one skill. Make it genuinely useful. Everything else follows.
|
||||
If you followed the examples and built this system, you understand
|
||||
more about personal AI agent architecture than most people in the
|
||||
industry. Not because the technology is hard. Because the thinking
|
||||
in Phase 1, designing your work as a system of agents, is the part
|
||||
most people skip.