diff --git a/commands/evaluate.md b/commands/evaluate.md
new file mode 100644
index 0000000..02015de
--- /dev/null
+++ b/commands/evaluate.md
@@ -0,0 +1,63 @@
+---
+description: Evaluate your agent system against the 22 agent capabilities. Shows coverage, gaps, and recommendations.
+argument-hint: "Optional: focus area (security, deployment, memory, autonomy)"
+allowed-tools: ["Read", "Glob", "Grep", "Bash"]
+---
+
+You are running `/agent-factory:evaluate` — a capability assessment for your agent system.
+
+## Step 1: Scan project components
+
+Scan for all agent system components:
+- Agents: Glob for `.claude/agents/*.md`
+- Pipeline skills: Glob for `.claude/skills/*/SKILL.md`
+- Knowledge skills: Glob for `.claude/skills/*.md`
+- Hooks: Glob for `.claude/hooks/*.sh` and `hooks/*.sh`
+- Settings: Read `.claude/settings.json` if it exists
+- Context: Read `CLAUDE.md` if it exists
+- Automation: Glob for `automation/*`, `scripts/*.sh`
+- Memory: look for `memory/MEMORY.md`, `memory/SESSION-STATE.md`, `data/run-state.json`
+- Heartbeat: look for `HEARTBEAT.md`
+- Goals: look for `GOALS.md`
+- Governance: look for `GOVERNANCE.md`
+- Org chart: look for `ORG-CHART.md`
+- Budget: look for `budget/BUDGET.md`, `budget/cost-events.jsonl`
+- Docker: look for `Dockerfile`, `docker-compose.yml`
+
+## Step 2: Score against 22 capabilities
+
+Read the feature map at `${CLAUDE_PLUGIN_ROOT}/skills/agent-system-design/references/feature-map.md`.
+
+For each of the 22 capabilities, check whether the user's project has the corresponding component. Score as:
+- **OK** — component exists and is properly configured
+- **Partial** — component exists but is incomplete or misconfigured
+- **Missing** — component does not exist
+
+## Step 3: Output capability matrix
+
+```
+AGENT SYSTEM EVALUATION
+=======================
+
+| # | Capability | Status | What exists | What's needed |
+|---|-----------|--------|-------------|---------------|
+| 1 | Agent Runtime | OK | CLAUDE.md + settings.json | — |
+| 2 | Shell Execution | Missing | — | hooks/pre-tool-use.sh + deny list |
+...
+
+Score: X/22 OK | Y/22 Partial | Z/22 Missing
+```
+
+## Step 4: Recommendations
+
+Provide specific recommendations for filling gaps, ordered by impact:
+1. Security gaps first (hooks, permissions)
+2. Core functionality gaps (missing agents, skills)
+3. Operational gaps (memory, automation, deployment)
+4. Advanced gaps (governance, budget, self-learning)
+
+If $ARGUMENTS specifies a focus area, expand that section with detailed guidance and link to relevant templates from `${CLAUDE_PLUGIN_ROOT}/scripts/templates/`.
+
+## Step 5: Next steps
+
+Suggest: "Run `/agent-factory:build` to fill gaps interactively, or `/agent-factory:deploy` to configure deployment."
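
Note on the patch: the command file describes the Step 1 scan and the Step 3 score line in prose for the agent to carry out with its Glob and Read tools. As a rough illustration of what those two steps amount to, here is a minimal Python sketch. It is not part of the patch and not how the plugin runs; the glob patterns are copied from Step 1, while the dictionary keys and the function names `scan_components` and `summary_line` are hypothetical labels chosen for this example.

```python
from collections import Counter
from pathlib import Path

# Glob patterns copied from Step 1 of the command file.
# The dictionary keys are illustrative labels, not names used by the plugin.
COMPONENT_PATTERNS = {
    "agents": [".claude/agents/*.md"],
    "pipeline_skills": [".claude/skills/*/SKILL.md"],
    "knowledge_skills": [".claude/skills/*.md"],
    "hooks": [".claude/hooks/*.sh", "hooks/*.sh"],
    "settings": [".claude/settings.json"],
    "context": ["CLAUDE.md"],
    "automation": ["automation/*", "scripts/*.sh"],
    "memory": ["memory/MEMORY.md", "memory/SESSION-STATE.md", "data/run-state.json"],
    "heartbeat": ["HEARTBEAT.md"],
    "goals": ["GOALS.md"],
    "governance": ["GOVERNANCE.md"],
    "org_chart": ["ORG-CHART.md"],
    "budget": ["budget/BUDGET.md", "budget/cost-events.jsonl"],
    "docker": ["Dockerfile", "docker-compose.yml"],
}


def scan_components(root):
    """Return {component label: sorted relative paths found under root}."""
    root = Path(root)
    found = {}
    for name, patterns in COMPONENT_PATTERNS.items():
        matches = set()
        for pattern in patterns:
            # Path.glob yields nothing when a directory or file is absent.
            matches.update(p.relative_to(root).as_posix() for p in root.glob(pattern))
        found[name] = sorted(matches)
    return found


def summary_line(statuses):
    """Format the Step 3 score line from a list of OK/Partial/Missing statuses."""
    counts = Counter(statuses)
    total = len(statuses)
    return (
        f"Score: {counts['OK']}/{total} OK | "
        f"{counts['Partial']}/{total} Partial | "
        f"{counts['Missing']}/{total} Missing"
    )
```

The sketch stops short of Step 2 on purpose: the mapping from components to the 22 capabilities lives in `feature-map.md`, which this patch does not include, so any scoring logic here would be guesswork. Calling `scan_components(".")` from a project root and passing the resulting per-capability statuses to `summary_line` would reproduce the shape of the report, assuming the statuses are assigned by whatever reads the feature map.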