agent-builder/commands/evaluate.md at 9d24dc5c4173241835f2c0cc21ab6f654b0fbb72

Kjell Tore Guttormsen dce550a2cb feat(commands): add /agent-factory:evaluate command

2026-04-12 06:45:35 +02:00

2.5 KiB
description: Evaluate your agent system against the 22 agent capabilities. Shows coverage, gaps, and recommendations.
argument-hint: Optional: focus area (security, deployment, memory, autonomy)
allowed-tools: Read, Glob, Grep, Bash

You are running /agent-factory:evaluate — a capability assessment for your agent system.

Step 1: Scan project components

Scan for all agent system components:

Agents: Glob for .claude/agents/*.md
Pipeline skills: Glob for .claude/skills/*/SKILL.md
Knowledge skills: Glob for .claude/skills/*.md
Hooks: Glob for .claude/hooks/*.sh and hooks/*.sh
Settings: Read .claude/settings.json if it exists
Context: Read CLAUDE.md if it exists
Automation: Glob for automation/*, scripts/*.sh
Memory: look for memory/MEMORY.md, memory/SESSION-STATE.md, data/run-state.json
Heartbeat: look for HEARTBEAT.md
Goals: look for GOALS.md
Governance: look for GOVERNANCE.md
Org chart: look for ORG-CHART.md
Budget: look for budget/BUDGET.md, budget/cost-events.jsonl
Docker: look for Dockerfile, docker-compose.yml
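
The scan above can also be pictured as a small script. This is only a sketch: the patterns are copied from the list, while the dictionary keys and the function name `scan_components` are hypothetical labels chosen for illustration, not part of the plugin.

```python
from pathlib import Path

# Patterns come from the Step 1 list; the keys are hypothetical labels.
COMPONENT_PATTERNS = {
    "agents": [".claude/agents/*.md"],
    "pipeline_skills": [".claude/skills/*/SKILL.md"],
    "knowledge_skills": [".claude/skills/*.md"],
    "hooks": [".claude/hooks/*.sh", "hooks/*.sh"],
    "settings": [".claude/settings.json"],
    "context": ["CLAUDE.md"],
    "automation": ["automation/*", "scripts/*.sh"],
    "memory": [
        "memory/MEMORY.md",
        "memory/SESSION-STATE.md",
        "data/run-state.json",
    ],
    "heartbeat": ["HEARTBEAT.md"],
    "goals": ["GOALS.md"],
    "governance": ["GOVERNANCE.md"],
    "org_chart": ["ORG-CHART.md"],
    "budget": ["budget/BUDGET.md", "budget/cost-events.jsonl"],
    "docker": ["Dockerfile", "docker-compose.yml"],
}


def scan_components(root):
    """Return {label: sorted list of matching paths, relative to root}."""
    root = Path(root)
    found = {}
    for label, patterns in COMPONENT_PATTERNS.items():
        matches = set()
        for pattern in patterns:
            # Path.glob also accepts literal names such as "CLAUDE.md".
            for path in root.glob(pattern):
                matches.add(path.relative_to(root).as_posix())
        found[label] = sorted(matches)
    return found
```

An empty list for a label means no file matched, which feeds directly into the Missing status in Step 2.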

Step 2: Score against 22 capabilities

Read the feature map at ${CLAUDE_PLUGIN_ROOT}/skills/agent-system-design/references/feature-map.md.

For each of the 22 capabilities, check whether the user's project has the corresponding component. Score as:

OK — component exists and is properly configured
Partial — component exists but is incomplete or misconfigured
Missing — component does not exist
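
The three-way scoring rule can be written down as a tiny function. This is a minimal sketch: `Status` and `score_capability` are hypothetical names, and deciding whether a component is "properly configured" remains a judgment made while reading the project.

```python
from enum import Enum


class Status(Enum):
    OK = "OK"
    PARTIAL = "Partial"
    MISSING = "Missing"


def score_capability(exists, configured):
    """Map two observations about a component onto the three statuses."""
    if not exists:
        return Status.MISSING
    # The component exists; it is OK only if it is also properly configured.
    return Status.OK if configured else Status.PARTIAL
```

For example, a hook script that exists but lacks a deny list would be scored with `score_capability(True, False)`, giving Partial.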

Step 3: Output capability matrix

AGENT SYSTEM EVALUATION
=======================

| # | Capability | Status | What exists | What's needed |
|---|-----------|--------|-------------|---------------|
| 1 | Agent Runtime | OK | CLAUDE.md + settings.json | — |
| 2 | Shell Execution | Missing | — | hooks/pre-tool-use.sh + deny list |
...

Score: X/22 OK | Y/22 Partial | Z/22 Missing
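
As an illustration of the layout above, the following sketch builds the same table and score line from a list of rows. The function name `render_matrix` and the row tuple shape are assumptions made for this example; the denominator is simply the number of rows passed in, which is 22 for a full evaluation.

```python
from collections import Counter

# Placeholder for empty cells, matching the dash used in the sample table.
EMPTY = "\u2014"


def render_matrix(rows):
    """rows: list of (capability, status, what_exists, what_is_needed)."""
    lines = [
        "AGENT SYSTEM EVALUATION",
        "=======================",
        "",
        "| # | Capability | Status | What exists | What's needed |",
        "|---|-----------|--------|-------------|---------------|",
    ]
    for number, (capability, status, exists, needed) in enumerate(rows, 1):
        lines.append(
            f"| {number} | {capability} | {status} "
            f"| {exists or EMPTY} | {needed or EMPTY} |"
        )
    counts = Counter(status for _, status, _, _ in rows)
    total = len(rows)
    lines.append("")
    lines.append(
        f"Score: {counts['OK']}/{total} OK | "
        f"{counts['Partial']}/{total} Partial | "
        f"{counts['Missing']}/{total} Missing"
    )
    return "\n".join(lines)
```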

Step 4: Recommendations

Provide specific recommendations for filling gaps, ordered by impact:

Security gaps first (hooks, permissions)
Core functionality gaps (missing agents, skills)
Operational gaps (memory, automation, deployment)
Advanced gaps (governance, budget, self-learning)
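
The ordering above amounts to a stable sort by category. In this sketch the category labels (`security`, `core`, `operational`, `advanced`) and the function name are hypothetical shorthand for the four groups listed.

```python
# Hypothetical short labels for the four groups, in order of impact.
PRIORITY = ["security", "core", "operational", "advanced"]


def order_recommendations(gaps):
    """gaps: list of (category, recommendation) pairs.

    Returns the pairs sorted by category priority. sorted() is stable,
    so recommendations within one category keep their original order.
    """
    rank = {name: index for index, name in enumerate(PRIORITY)}
    return sorted(gaps, key=lambda gap: rank[gap[0]])
```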

If $ARGUMENTS specifies a focus area, expand that section with detailed guidance and link to relevant templates from ${CLAUDE_PLUGIN_ROOT}/scripts/templates/.
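
One way to read a focus area out of the argument string is sketched below. The four accepted values come from the argument hint; the function name `parse_focus` and the choice to take the first matching word are assumptions for illustration.

```python
# Focus areas named in the argument hint.
FOCUS_AREAS = ("security", "deployment", "memory", "autonomy")


def parse_focus(arguments):
    """Return the first recognized focus area in the text, or None."""
    for token in arguments.lower().split():
        if token in FOCUS_AREAS:
            return token
    return None
```

When this returns None, no section is expanded and all four recommendation groups are reported at the same level of detail.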

Step 5: Next steps

Suggest: "Run /agent-factory:build to fill gaps interactively, or /agent-factory:deploy to configure deployment."