# Feedback Loop Systematic feedback collection and performance scoring for agent pipelines. ## How it works 1. After each pipeline run, a reviewer agent (or human) assigns a score (0–100) and categorizes any issues with a pattern tag. 2. `feedback-collector.sh` runs as a PostToolUse hook on `review_pipeline` or `score_output` tool calls. It appends a row to `FEEDBACK.md`. 3. When 3+ rows share the same pattern tag, a recurring-pattern alert fires. 4. `performance-scorer.sh` reads `FEEDBACK.md` and `budget/cost-events.jsonl` to compute per-agent metrics: average score, error rate, cost per run, improvement trend (last 10 vs. previous 10 runs). 5. Agents scoring below the threshold (default 60/100) are flagged for review. ## Pattern tags Consistent tags are required for pattern detection to work. Use the tags defined in `FEEDBACK.md`. Add project-specific tags as needed — but be consistent. Inconsistent tagging produces false negatives. ## Scoring → self-improvement connection Feedback scores are the input to VFM (Value-for-Money) pre-scoring defined in `scripts/templates/proactive/VFM-SCORING.md` (Step 11). A low-scoring agent gets a lower VFM pre-score for future pipeline tasks, making it less likely to be selected until its performance improves. The feedback loop closes the improvement cycle: 1. Pipeline runs → reviewer assigns score + pattern tag 2. `feedback-collector.sh` appends to FEEDBACK.md 3. `performance-scorer.sh` flags underperforming agents 4. Developer reviews top patterns → iterates on agent prompt 5. New runs produce new feedback → trend shows improvement 6. VFM scores update automatically on next pipeline selection ## Example: prompt iteration driven by feedback Suppose `agent-writer` repeatedly scores 45/100 with pattern `quality-low`: ``` | 2025-01-10 | doc-pipeline | agent-writer | 45/100 | Output too brief | Added detail requirement | quality-low | | 2025-01-11 | doc-pipeline | agent-writer | 42/100 | Still too brief | Repeated instruction | quality-low | | 2025-01-12 | doc-pipeline | agent-writer | 48/100 | Slightly better | — | quality-low | ``` After 3 rows: feedback-collector.sh fires the recurring-pattern alert. performance-scorer.sh shows avg 45/100, error rate 100%. Action: update agent-writer's system prompt with explicit length and depth requirements. Next 10 runs show trend "improving (+18.3)". ## Integration Add feedback-collector.sh as a PostToolUse hook in `.claude/settings.json`: ```json { "hooks": { "PostToolUse": [{ "matcher": "review_pipeline", "hooks": [{"type": "command", "command": "bash feedback/feedback-collector.sh"}] }] } } ``` Run performance-scorer.sh on demand or as a scheduled report: ```bash ./feedback/performance-scorer.sh ./feedback/performance-scorer.sh --agent agent-writer --threshold 70 ```