feat(humanizer): update agent system prompts [skip-docs]

Wave 5 Step 16 — final wave step. Threads humanizer-aware rendering
rules through the three agent prompts that produce user-facing output,
and adds a shape test that locks the structure.

- agents/analyzer-agent.md: documents the humanizer envelope shape
  (userImpactCategory, userActionLanguage, relevanceContext) in the
  Input section; new "Humanizer-aware rendering rules" subsection
  instructs the agent to: render humanized title/description/
  recommendation verbatim, group findings by userImpactCategory, lead
  each line with userActionLanguage, surface relevanceContext when
  not affects-everyone, and skip jargon-translation subroutines.
  --raw fallback documented (v5.0.0 verbatim severity prefiks).
- agents/planner-agent.md: documents the same vocabulary; instructs
  the planner to consume humanized fields from the analysis report,
  preserve titles verbatim, and order actions by both dependencies
  AND userActionLanguage urgency. Translation duties explicitly
  removed from the plan.
- agents/feature-gap-agent.md: replaces the inline t1/t2/t3/t4
  tier-to-prose section ladder with userActionLanguage-driven
  groupings ("Fix soon" → High Impact, "Fix when convenient" →
  Worth Considering, "Optional cleanup"/"FYI" → Explore When Ready);
  instructs skipping findings whose relevanceContext is
  test-fixture-no-impact; --raw fallback documented.

tests/agents/agent-prompt-shape.test.mjs (new, +6 tests, 786 → 792):
  - structural: humanized field reference + frontmatter preserved
  - per-agent anchors: analyzer groups by userImpactCategory; planner
    orders by userActionLanguage; feature-gap references
    test-fixture-no-impact
  - global: no "explain what {jargon} means" / "translate jargon" /
    "jargon-translation duty" prose anywhere

Self-audit: Grade A unchanged (config 97/100, plugin 100/100).
This commit is contained in:
Kjell Tore Guttormsen 2026-05-01 19:53:59 +02:00
commit ec4ac3e6d1
4 changed files with 136 additions and 27 deletions

View file

@ -19,10 +19,17 @@ You receive posture assessment data (JSON) containing:
- `areas` — per-scanner grades (10 quality areas incl. Token Efficiency, Plugin Hygiene, + Feature Coverage)
- `overallGrade` — health grade (quality areas only)
- `opportunityCount` — number of unused features detected
- `scannerEnvelope` — full scanner results including GAP findings
- `scannerEnvelope` — full scanner results. In default mode each GAP finding carries humanizer fields: `userImpactCategory` ("Missed opportunity"), `userActionLanguage` ("Fix soon", "Fix when convenient", "Optional cleanup", "FYI"), and `relevanceContext`. The humanizer also replaced `title`/`description`/`recommendation` strings with plain-language equivalents.
You also receive project context: language, file count, existing configuration.
## Humanizer-aware rendering rules
- **Render the humanizer's `title`/`description`/`recommendation` verbatim.** Do not paraphrase. The humanizer owns the plain-language vocabulary.
- **Drive prioritization with `userActionLanguage`, not raw category tiers.** "Fix soon" → "Fix when convenient" → "Optional cleanup" → "FYI" replaces the t1/t2/t3/t4 tier ladder for output ordering.
- **Skip findings with `relevanceContext === "test-fixture-no-impact"`** unless the user explicitly asked to include fixtures.
- **Do not include "explain what X means" subroutines.** The category labels ("Missed opportunity") are pre-translated.
## Knowledge Files
Read **at most 3** of these files from the plugin's `knowledge/` directory:
@ -36,6 +43,8 @@ Write `feature-gap-report.md` to the session directory. Max 200 lines.
### Report Structure
Group findings by `userActionLanguage` rather than by raw category tier. Render the humanizer's `title` and `recommendation` verbatim — the humanizer has already produced plain-language equivalents.
```markdown
# Feature Opportunities
@ -47,38 +56,34 @@ Write `feature-gap-report.md` to the session directory. Max 200 lines.
## High Impact
These address correctness or security — consider them seriously.
[Findings where userActionLanguage is "Fix soon" — these address correctness or security; consider them seriously.]
**[feature name]**
Why: [evidence-backed reason, cite Anthropic docs or proven issues]
How: [2-3 concrete steps]
[Repeat for each T1 finding]
**[humanized title verbatim]**
Why: [humanized description verbatim, plus "relevant because your project has X" context]
How: [humanized recommendation verbatim, broken into 2-3 concrete steps from gap-closure-templates.md]
## Worth Considering
These improve workflow efficiency for projects like yours.
[Findings where userActionLanguage is "Fix when convenient" — these improve workflow efficiency for projects like yours.]
**[feature name]**
Why: [reason, with "relevant because your project has X"]
How: [2-3 concrete steps]
[Repeat for each T2 finding]
**[humanized title verbatim]**
Why: [humanized description verbatim, plus relevance context]
How: [humanized recommendation verbatim, broken into 2-3 concrete steps]
## Explore When Ready
Nice-to-have features. Skip these if your current setup works well.
[Findings where userActionLanguage is "Optional cleanup" or "FYI" — nice-to-have, skip if current setup works well.]
**[feature name]**
Why: [brief reason]
[Repeat for T3/T4 findings, keep brief]
**[humanized title verbatim]**
Why: [humanized description verbatim, brief]
## When You Might Skip These
[Honest qualification: which recommendations are genuinely optional and why. A minimal setup can be the right choice.]
[Honest qualification: which recommendations are genuinely optional and why. A minimal setup can be the right choice. Mention any findings whose `relevanceContext` is `affects-this-machine-only` so the user knows the change won't propagate to teammates.]
```
In `--raw` mode (humanizer fields absent), fall back to grouping by raw category tier (t1/t2/t3/t4) and render scanner-emitted titles verbatim — flag in the report header that output is unhumanized.
## Guidelines
- Frame everything as opportunities, never as failures or gaps