feat(humanizer): scenario read-test corpus + runner (SC-4) [skip-docs]

Step 9 of v5.1.0 humanizer Wave 4. Adds the tests/scenario-read-test.mjs
runner, the tests/scenario-read-test.test.mjs wrapper, and 5 scenario
fixtures in tests/scenarios/. The runner feeds each fixture's
deterministic raw finding through humanizeFinding and asserts that the
humanized title/description/recommendation match the
brief-owner-approved regex patterns encoding the ground-truth
what/why/whatNext answers.

Corpus selection (per brief criteria):
- 01-tok-cascade.json - TOK/CPS category (token efficiency)
- 02-cps-volatile.json - TOK/CPS category (cache prefix stability)
- 03-cnf-conflict.json - CNF category (conflicts)
- 04-gap-no-claude-md.json - GAP category (feature gap)
- 05-set-invalid-json.json - SET category; its v5.0.0 title and
  description also carry the tier1 forbidden word 'invalid' (the brief
  criterion 'one finding whose v5.0.0 description uses a forbidden word').

Runner mechanics:
- Loads scenarios matching ^\d{2}-[a-z0-9-]+\.json$ in sorted order.
- Calls humanizeFinding(scannerInput) and matches each humanized field
against its declared pattern (case-insensitive regex).
- Verifies humanizer-added structural fields (userImpactCategory,
userActionLanguage, relevanceContext) are non-empty strings.
- Per session decision (1a), acceptance is deterministic regex matching
  with no runtime human-approval gate.
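
The runner mechanics above can be sketched roughly as follows. This is an illustration, not the code in tests/scenario-read-test.mjs: the helper names selectScenarioFiles and checkScenario are hypothetical, humanizeFinding is passed in as a parameter so that no module path is assumed, and file names are supplied by the caller instead of being read from disk.

```javascript
// Sketch only. selectScenarioFiles and checkScenario are hypothetical names.
const SCENARIO_NAME = /^\d{2}-[a-z0-9-]+\.json$/;
const STRUCTURAL_FIELDS = ['userImpactCategory', 'userActionLanguage', 'relevanceContext'];

// Keep only well-formed scenario file names, in sorted order.
function selectScenarioFiles(fileNames) {
  return fileNames.filter((name) => SCENARIO_NAME.test(name)).sort();
}

// Run one scenario; returns a list of failure messages (empty means pass).
function checkScenario(scenario, humanizeFinding) {
  const humanized = humanizeFinding(scenario.scannerInput);
  const failures = [];

  // Each humanized text field must match its declared pattern, case-insensitively.
  for (const field of ['title', 'description', 'recommendation']) {
    const pattern = new RegExp(scenario.expectedHumanized[`${field}Pattern`], 'i');
    if (!pattern.test(String(humanized[field] ?? ''))) {
      failures.push(`${scenario.findingId}: ${field} does not match ${pattern}`);
    }
  }

  // Humanizer-added structural fields must be non-empty strings.
  for (const field of STRUCTURAL_FIELDS) {
    const value = humanized[field];
    if (typeof value !== 'string' || value.length === 0) {
      failures.push(`${scenario.findingId}: ${field} must be a non-empty string`);
    }
  }
  return failures;
}
```

Returning a list of failure messages, instead of throwing on the first mismatch, lets a wrapper report every failing field of a scenario in one run.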
Wrapper adds 3 tests: scenario-match (binds runner to node --test),
category-coverage (TOK/CPS, CNF, GAP, SET all present), and
tier1-presence (at least one v5.0.0 title or description contains a
tier1 forbidden word).
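
The category-coverage and tier1-presence checks could look roughly like the sketch below. It rests on assumptions: the function names are hypothetical, the category is read from scannerInput.scanner, and the tier1 word list is supplied by the caller (only 'invalid' is confirmed by this message). The real wrapper registers its checks with node --test, which is omitted here.

```javascript
// Sketch only. Assumes scannerInput.scanner carries the category code.
const REQUIRED_GROUPS = [['TOK', 'CPS'], ['CNF'], ['GAP'], ['SET']];

// Returns the labels of category groups that no scenario covers.
function missingCategoryGroups(scenarios) {
  const seen = new Set(scenarios.map((s) => s.scannerInput.scanner));
  return REQUIRED_GROUPS
    .filter((group) => !group.some((code) => seen.has(code)))
    .map((group) => group.join('/'));
}

// True if at least one v5.0.0 title or description contains a tier1 word.
// tier1Words is caller-supplied; the full list is not given in this commit.
function hasTier1Word(scenarios, tier1Words) {
  return scenarios.some(({ scannerInput }) => {
    const text = `${scannerInput.title} ${scannerInput.description}`.toLowerCase();
    return tier1Words.some((word) => text.includes(word.toLowerCase()));
  });
}
```
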
Tests: 736 to 739 (+3 SC-4 tests). Full suite passes.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
parent c5c937e94e
commit 8b146bf489
7 changed files with 373 additions and 0 deletions
29
plugins/config-audit/tests/scenarios/01-tok-cascade.json
Normal file
@@ -0,0 +1,29 @@
{
  "_meta": {
    "comment": "Scenario 01: TOK CLAUDE.md cascade exceeds 10k tokens. Covers the TOK/CPS (token-efficiency) category. v5.0.0 title contains tier3 'CLAUDE.md' — humanizer rewrites to non-jargon prose."
  },
  "findingId": "CA-TOK-001",
  "scannerInput": {
    "id": "CA-TOK-001",
    "scanner": "TOK",
    "severity": "high",
    "title": "CLAUDE.md cascade exceeds 10k tokens per turn",
    "description": "Total CLAUDE.md cascade is 12450 tokens across 4 files.",
    "file": ".claude/CLAUDE.md",
    "line": null,
    "evidence": "tokens=12450; files=4",
    "recommendation": "Reduce CLAUDE.md cascade size. Move content into modular skill files or trim verbose sections.",
    "category": null,
    "autoFixable": false
  },
  "expectedHumanized": {
    "titlePattern": "instruction files take a lot of space on every turn",
    "descriptionPattern": "10,000 tokens|every turn carries that weight",
    "recommendationPattern": "Trim or split the largest files"
  },
  "groundTruth": {
    "what": "The instruction files Claude reads on every turn are large enough that they slow each response.",
    "why": "The combined size has gone above 10,000 tokens. That weight loads on every turn and leaves less room for the conversation itself.",
    "whatNext": "Trim or split the largest files. The details show which file contributes most."
  }
}