feat(humanizer): update audit/analysis command templates group A [skip-docs]

Wave 5 Step 13. Threads the humanizer vocabulary through five audit/
analysis command templates and adds a shape test that locks the
structure in place.

- commands/posture.md, tokens.md, feature-gap.md (findings-renderers):
  reference userImpactCategory/userActionLanguage/relevanceContext;
  remove hardcoded A/B/C/D/F-to-prose tables (humanizer owns the
  grade-context vocabulary now via the stderr scorecard headline).
- commands/manifest.md, whats-active.md (inventory CLIs): add --raw
  pass-through for CLI-surface consistency. --raw is a no-op in these
  CLIs, but the flag is threaded through so users get uniform behaviour.
- All five files: --raw flag parsed from $ARGUMENTS and passed verbatim
  to the underlying scanner CLI when present.

tests/commands/group-a-shape.test.mjs (new, +5 tests, 767 → 772):
  - structural: every file has a bash invocation block, Read tool
    reference, and --raw/$ARGUMENTS plumbing
  - findings-renderers only: at least one humanized field referenced;
    no hardcoded "[grade] grade is..." prose tables
This commit is contained in:
Kjell Tore Guttormsen 2026-05-01 19:41:08 +02:00
commit 79b6e29073
6 changed files with 174 additions and 62 deletions

View file

@ -19,9 +19,13 @@ Quick, deterministic configuration health scorecard. No agents needed — runs a
## Implementation
### Step 1: Determine target
### Step 1: Determine target and flags
Parse `$ARGUMENTS` for a path (default: current working directory). Resolve relative paths.
Split `$ARGUMENTS` into a path and flags. Path is the first non-flag argument (default: current working directory). Resolve relative paths. Recognized flags:
- `--raw` — pass-through to the scanner; produces v5.0.0 verbatim output (bypasses the humanizer). Power-user mode for byte-stable diffs and machine consumption.
- `--drift` — append a "Configuration Drift" section (see Step 5).
- `--plugin-health` — append a "Plugin Health" section (see Step 5).
Tell the user:
@ -33,32 +37,34 @@ Running quick assessment{if path != cwd: " on `{path}`"}...
### Step 2: Run posture scanner
Run silently — all output goes to a file:
Run silently — JSON goes to a file, the humanized scorecard prints to stderr (default mode). The humanized stderr scorecard already includes the grade headline and area-score lines in plain language, so render those directly rather than re-deriving prose tables.
```bash
node ${CLAUDE_PLUGIN_ROOT}/scanners/posture.mjs <target-path> --json --output-file /tmp/config-audit-posture-$$.json 2>/dev/null; echo $?
RAW_FLAG=""
if echo "$ARGUMENTS" | grep -q -- "--raw"; then RAW_FLAG="--raw"; fi
node ${CLAUDE_PLUGIN_ROOT}/scanners/posture.mjs <target-path> --output-file /tmp/config-audit-posture-$$.json $RAW_FLAG 2>/tmp/config-audit-posture-stderr-$$.txt; echo $?
```
If exit code is non-zero, tell the user: "Assessment couldn't complete. Check that the path exists and contains Claude Code configuration files."
If `--raw` was passed, treat the captured stderr as v5.0.0-shape verbatim text and present it as-is in a code block; skip the humanized rendering steps below.
### Step 3: Read and interpret results
Read the JSON output file using the Read tool. Extract:
- `overallGrade`, `opportunityCount`
- `areas[]` — each with `name`, `grade`, `score`, `findingCount`
- `scannerEnvelope.scanners[].findings[]` — when surfacing individual findings, prefer the humanizer-provided fields: `userImpactCategory` (e.g., "Configuration mistake", "Wasted tokens"), `userActionLanguage` (e.g., "Fix this now", "Fix soon", "Optional cleanup"), and `relevanceContext` ("affects-everyone", "affects-this-machine-only", "test-fixture-no-impact"). These let you group and prioritize without hardcoded severity-to-prose mappings.
Also Read the captured stderr file — its body is the humanized scorecard (grade headline, area-score block, opportunity hint). You can present it verbatim or interleave its lines with the JSON-driven table.
### Step 4: Present the scorecard
```markdown
**Health: {overallGrade}** | {qualityAreaCount} areas scanned
{grade-based context — pick ONE:}
- A: "Your configuration is correct and well-maintained."
- B: "Solid configuration with minor improvements available."
- C: "Working configuration with some issues worth addressing."
- D: "Configuration needs attention in several areas."
- F: "Significant issues found — addressing these will improve your experience."
{Use the headline line from the humanized stderr scorecard — it carries grade-context prose already (e.g., " Health: A (97/100) — Healthy setup, only minor polish needed"). Do not re-derive an A/B/C/D prose table here; the humanizer owns that vocabulary.}
### Area Scores
@ -73,22 +79,13 @@ Read the JSON output file using the Read tool. Extract:
### What's next
```
**Grade A or B:**
```
Your configuration health is strong. Re-run after major changes to catch regressions.
For feature recommendations: `/config-audit feature-gap`
```
Group "what's next" suggestions by `userActionLanguage` from the humanized findings:
**Grade C:**
```
Run `/config-audit fix` to auto-fix what's possible, then `/config-audit plan` for a prioritized improvement path.
```
- Findings tagged "Fix this now" / "Fix soon" → suggest `/config-audit fix` first, then `/config-audit plan`.
- Findings tagged "Fix when convenient" / "Optional cleanup" → suggest `/config-audit feature-gap` and routine maintenance.
- No high-urgency findings → suggest `/config-audit feature-gap` for opportunities and re-running posture after major config changes.
**Grade D or F:**
```
Start with `/config-audit fix` — it handles the most impactful issues automatically with backup and rollback.
Then run `/config-audit plan` for a step-by-step path to a better configuration.
```
Avoid hardcoded grade-to-prose ladders here — the humanized scorecard headline already supplies grade context, and `userActionLanguage` supplies finding-level urgency.
### Step 5: Optional sections