Changelog
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
[5.1.0] - 2026-05-01
Summary
Plain-language UX humanizer release. Default output of all 18 commands now leads with prose; technical IDs surface at end-of-line as references rather than headlines. Non-expert users — the bulk of the OSS audience — now read findings like "Fix soon: The same automation is set up more than once" instead of "[high] CA-CNF-001: Hook duplicate event registration". Scanner internals are unchanged; humanization is a pure output-time transform applied at the rendering layer. The --raw flag preserves v5.0.0 verbatim output for tooling that scrapes stderr; --json is unchanged from v5.0.0 and remains byte-stable for programmatic consumption.
Delivered across 6 waves (Wave 0 baseline → Wave 1 humanizer module → Wave 2 test re-anchoring → Wave 3 CLI wiring → Wave 4 contract tests → Wave 5 templates/agents → Wave 6 release).
Added
- scanners/lib/humanizer.mjs: pure-function output translator exporting humanizeFinding, humanizeFindings, humanizeEnvelope, computeRelevanceContext. Never mutates inputs. Adds three additive fields per finding (userImpactCategory, userActionLanguage, relevanceContext) and replaces title/description/recommendation when a translation is available; falls through to originals otherwise.
- scanners/lib/humanizer-data.mjs: TRANSLATIONS table for 13 scanner prefixes (CML, SET, HKV, RUL, MCP, IMP, CNF, COL, TOK, CPS, DIS, GAP, PLH). Three-step lookup per finding: exact title → regex pattern → _default → fall through to scanner original.
- --raw flag threaded through every CLI: posture.mjs, scan-orchestrator.mjs, token-hotspots-cli.mjs, manifest.mjs, whats-active.mjs, fix-cli.mjs, drift-cli.mjs, self-audit.mjs. Bypasses the humanizer; emits byte-stable v5.0.0 verbatim output.
- User-impact categories (5 labels): Configuration mistake, Conflict, Wasted tokens, Missed opportunity, Dead config. Mapped from scanner prefix.
- Action-language phrases (5 labels): Fix this now, Fix soon, Fix when convenient, Optional cleanup, FYI. Mapped from severity.
- Relevance context (3 values): test-fixture-no-impact, affects-this-machine-only, affects-everyone. Computed from the finding's file path: basenames matching *.local.* and paths containing /tests/fixtures/ are recognized.
- Self-audit terminal humanization: formatSelfAudit() routes through humanizeEnvelope. The JSON path (--json) is unchanged; humanization applies only to the prose terminal render.
- Forbidden-words lint (tests/lint-forbidden-words.json + runner): 3-tier vocabulary blocklist enforced over default-mode output, ensuring humanized prose stays in plain language.
- Scenario read-test (tests/scenario-read-test.mjs + 5 scenarios): corpus-driven readability check covering broken hook, duplicate keys, stale @import, dead tool, oversized cascade.
- tests/snapshots/v5.0.0/ + tests/snapshots/v5.0.0-stderr/: frozen byte-equal references for SC-6 (--json) and SC-7 (--raw) backwards-compatibility tests across 8 CLIs.
- tests/snapshots/default-output/: humanized-prose snapshots for SC-5 default-output stability.
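The lookup order and the path-based relevance context described above can be sketched roughly as follows. This is an illustration, not the shipped humanizer: the table entries, the finding.file field name, and the full severity-to-phrase mapping are assumptions (only high → "Fix soon" and the CNF title pair are quoted in the summary). The userImpactCategory field is left out because its prefix mapping is not spelled out here.

```javascript
// Hypothetical sketch of the humanizer lookup order; not the shipped code.
// Assumed mapping: only high -> "Fix soon" is confirmed by the release summary.
const ACTION_LANGUAGE = {
  critical: 'Fix this now',
  high: 'Fix soon',
  medium: 'Fix when convenient',
  low: 'Optional cleanup',
  info: 'FYI',
};

// Illustrative table for one prefix. The regex and _default entries are made up.
const TRANSLATIONS = {
  CNF: {
    exact: {
      'Hook duplicate event registration': 'The same automation is set up more than once',
    },
    patterns: [
      { regex: /^Conflicting .+ value$/, title: 'Two settings disagree with each other' },
    ],
    _default: 'Two parts of your configuration conflict',
  },
};

// Fixture paths win over *.local.* basenames; everything else affects everyone.
function computeRelevanceContext(filePath) {
  const normalized = String(filePath || '').replace(/\\/g, '/');
  const basename = normalized.split('/').pop();
  if (normalized.includes('/tests/fixtures/')) return 'test-fixture-no-impact';
  if (/.+\.local\..+/.test(basename)) return 'affects-this-machine-only';
  return 'affects-everyone';
}

function translateTitle(prefix, title) {
  const table = TRANSLATIONS[prefix];
  if (!table) return title; // unknown prefix: keep the scanner original
  if (table.exact && table.exact[title]) return table.exact[title]; // 1. exact title
  const hit = (table.patterns || []).find((p) => p.regex.test(title)); // 2. regex pattern
  if (hit) return hit.title;
  return table._default || title; // 3. _default, else the scanner original
}

// Returns a new object; the input finding is never mutated.
function humanizeFinding(finding) {
  const prefix = finding.id.split('-')[1] || '';
  return {
    ...finding,
    title: translateTitle(prefix, finding.title),
    userActionLanguage: ACTION_LANGUAGE[finding.severity] || 'FYI',
    relevanceContext: computeRelevanceContext(finding.file),
  };
}
```

Because the transform only builds a new object at render time, a --raw or --json path can skip the call entirely and emit the scanner's original finding.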
Changed
- Default output of all 18 commands now uses plain-language descriptions. Findings group by user-impact category; titles lead with prose; technical IDs (CA-CML-001, CA-TOK-005, …) surface at end-of-line as references.
- All 21 command and agent templates updated to render humanized output by default and pass --raw through when the user requests v5.0.0 verbatim mode.
- CLI flag inventory: every CLI now accepts --raw (new) in addition to --json (existing, unchanged). --output-file <path> still writes raw v5.0.0-shape JSON regardless of mode (humanizer-bypassed, posture-specific).
Migration
- No action required for existing automation that consumes --json: the JSON envelope shape is byte-stable with v5.0.0 and humanizer fields are bypassed in --json and --raw paths.
- Tooling that scrapes stderr from default mode (e.g., the posture.mjs scorecard) needs review: default stderr now uses prose vocabulary. Pass --raw for byte-stable v5.0.0 verbatim stderr.
- No scanner-internal changes. Finding IDs, severity ladders, scoring weights, and area scorecards are unchanged. Upgrades are presentation-layer only.
Test count
- 635 → 792 tests across 52 test files (+157 humanizer-tester through Waves 0–5).
- New top-level tests: json-backcompat.test.mjs, raw-backcompat.test.mjs, scenario-read-test.test.mjs, snapshot-default-output.test.mjs.
- New lib tests: humanizer.test.mjs, humanizer-data.test.mjs, scoring-humanizer.test.mjs.
- New scanner tests: posture-humanizer.test.mjs, scan-orchestrator-humanizer.test.mjs, cli-humanizer.test.mjs.
Out of scope (deferred to v5.1.1+)
- Posture --output-file humanization: posture.mjs does not call humanizeEnvelope, so files written via --output-file are raw v5.0.0-shape JSON. Future revision: drop --output-file from command templates or add a --humanized-json flag.
- Knowledge cross-references (Step 17 of plan): not delivered per user decision (2a).
- Scoring scorecard JSON headline emission: currently rendered prose-side only; command templates that want to skip stderr parsing would benefit.
Verification
- 792/792 tests pass (node --test 'tests/**/*.test.mjs')
- node scanners/self-audit.mjs --json --check-readme returns configGrade: A (97), pluginGrade: A (100), readmeCheck.passed: true
- README badge updated: tests-635+ → tests-792+
[5.0.0] - 2026-05-01
Summary
Reality-based token-optimization release. v4.0.0 shipped Opus-4.7 token surfaces aligned to a Sonnet-era cost model; v5.0.0 rebuilds the foundations against verified Opus-4.7 cost dynamics. Three pillars: honest token estimation (severity-weighted scoring, MCP estimates 15 → 500+, optional --accurate-tokens API calibration), new structural scanners (cache-prefix stability, dead tool grants, plugin collisions), and new diagnostic surfaces (/config-audit manifest, /config-audit tokens extended, knowledge-base cleanup aligned to Opus 4.7 cache dynamics).
Consolidated from 5.0.0-alpha.1 (F1-F5 token-economy round), 5.0.0-alpha.2 (M1, M2, M4-M6, F6, F7 structural gaps + README self-audit), 5.0.0-beta.1 (N1-N4, N6 new scanners + manifest CLI), and 5.0.0-rc.1 (M7, M8 knowledge-base cleanup + N5 tokenizer calibration).
Added
- 3 new scanners (9 → 12 deterministic):
  - CPS (Cache-Prefix Stability, CA-CPS-NNN): volatile content in lines 31–150 of the CLAUDE.md cascade, beyond TOK Pattern A's top-30 window. Volatile-pattern set extends Pattern A with shell-exec lines (! prefix) and ${VAR} substitutions.
  - DIS (Disabled-In-Schema, CA-DIS-NNN): tools listed in BOTH permissions.deny AND permissions.allow. Tool identity uses bare name (Bash(npm:*) and Bash are the same tool). Severity low.
  - COL (Cross-Plugin Skill Collision, CA-COL-001): plugin-vs-plugin same skill name → low; user-vs-plugin → medium. details.namespaces payload identifies conflicting sources.
- TOK extensions:
  - CA-TOK-005 MCP tool-schema budget: per-server tiered finding (< 20 none, 20–49 low, 50–99 medium, 100+ high; null low + "tool count unknown"). Scoped to project-local .mcp.json.
  - Pattern E (oversized cascade): medium when activeConfig.claudeMd.estimatedTokens > 10_000.
  - Pattern F (bloated SKILL.md description): low when frontmatter description > 500 chars (loads every turn). Scoped to discovery.files.
- /config-audit manifest + scanners/manifest.mjs CLI: single ranked table of every system-prompt token source (CLAUDE.md cascade, plugins, skills, MCP servers, hooks) sorted DESC by estimated_tokens. CLAUDE.md per-file tokens distributed proportional to bytes.
- --accurate-tokens flag on token-hotspots-cli.mjs (N5): when ANTHROPIC_API_KEY is set, calls Anthropic's count_tokens for the top 3 hotspots and populates output.calibration = { actual_tokens, source: 'count_tokens_api', sampled_hotspots: 3 }. When absent: calibration = { skipped: 'no-api-key' } plus stderr warning.
- scanners/lib/tokenizer-api.mjs: count_tokens wrapper. 5s AbortController timeout. Exponential backoff on 429 (3 retries: 1s/2s/4s). API key masked to ${key.slice(0,8)}... in every error; HTTP body never included in errors (it may echo the key on auth failures). maskKey() exported.
- --with-telemetry-recipe flag on the same CLI (M7): emits telemetry_recipe_path field pointing to knowledge/cache-telemetry-recipe.md.
- knowledge/cache-telemetry-recipe.md (M7): manual jq recipe summing cache_read_input_tokens + cache_creation_input_tokens per turn from session transcripts. Hit-rate interpretation table.
- 'mcp' kind on estimateTokens (F2): active MCP servers estimate ≥ 500 tokens (base + schema overhead) instead of v4's flat 15. Optional {toolCount} raises to 500 + toolCount × 200.
- MCP tool-count detection (M1): readActiveMcpServers resolves count via cache → node_modules/<pkg>/package.json → {toolCount: null, toolCountUnknown: true} fallback.
- additionalDirectories settings key (M6): added to KNOWN_KEYS; new low-severity finding when length > 2.
- HKV verbose hook output (M5): low-severity finding when referenced hook script contains > 50 console.log/process.stdout.write lines (static, no execution).
- self-audit --check-readme flag (F6): filesystem counts compared against README badges. Helper checkReadmeBadges(pluginDir). Step 28 of v5 plan reconciled all badges.
- scoringVersion: 'v5' field on scoreByArea output for cross-version drift detection.
- WEIGHTS named export from scanners/lib/severity.mjs (frozen).
- details field on findings (output.mjs:finding()): optional structured payload for scanner-specific data (used by COL).
- Plugin Hygiene as 10th quality area (from COL). Posture JSON now reports 10 areas.
- TOK-readActiveConfig integration (F1): one hotspot per active MCP server; result.activeConfig summary (claudeMd cascade tokens, mcpServerCount, pluginCount, skillCount); try/catch fallback when scope-limited.
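As a worked illustration of the MCP numbers above, the sketch below applies the 500 + toolCount × 200 estimate and the CA-TOK-005 tier thresholds. The function names are hypothetical, and treating an unknown tool count as the 500-token base is an assumption.

```javascript
// Estimate quoted in the changelog: base 500 plus 200 per known tool.
// Assumption: an unknown tool count falls back to the 500-token base.
function estimateMcpTokens(toolCount) {
  if (toolCount == null) return 500;
  return 500 + toolCount * 200;
}

// CA-TOK-005 tiers: < 20 none, 20-49 low, 50-99 medium, 100+ high.
// A null count is reported as low with a "tool count unknown" message.
function mcpBudgetSeverity(toolCount) {
  if (toolCount == null) return 'low';
  if (toolCount >= 100) return 'high';
  if (toolCount >= 50) return 'medium';
  if (toolCount >= 20) return 'low';
  return null; // under 20 tools: no finding
}
```

With these definitions a 25-tool server estimates at 5,500 tokens and lands in the low tier, which lines up with the mcp-budget/{14,25,60,120,unknown}-tools fixtures covering one case per tier.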
Changed
- F3: scoreByArea is severity-weighted. Penalty = Σ count[s] × WEIGHTS[s]; passRate = max(0, 100 − penalty / max(10, findingCount × 4) × 100). Lows no longer crater an area's grade; criticals/highs do. baseline-all-a fixture remains all-A (no critical/high present).
- F7: TOK pattern severities recalibrated for tokens-per-turn impact: Pattern A medium → high, Pattern B low → medium, Pattern C medium → low. Each finding carries a calibration_note evidence field documenting the heuristic basis.
- scoreByArea deduplicates by area name (N3 prep): TOK + CPS share "Token Efficiency"; SET + DIS share "Settings". Combined row with merged finding counts.
- M8 (knowledge-base cleanup): replaced "Keep CLAUDE.md under 200 lines" in knowledge/configuration-best-practices.md with cache-stability guidance (first 30 lines stable, volatile content below the cache threshold). Footnote explains the 200-line rule was a Sonnet-era adherence heuristic; Opus 4.7 uses prompt-cache structure as the dominant cost lever. Cross-references knowledge/opus-4.7-patterns.md.
- commands/tokens.md next-steps: documents --with-telemetry-recipe as the cache-verification path.
- Scanner count: 9 → 12. Command count: 17 → 18. Knowledge: 7 → 8. Quality areas: 8 → 10.
- .gitignore: unignore rules for tests/fixtures/**/node_modules/ so the mcp-tool-heavy fixture stays under version control.
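The F3 pass-rate formula can be tried out with a small sketch. The weight values below are hypothetical placeholders (the real ones are the frozen WEIGHTS export in scanners/lib/severity.mjs and are not listed in this changelog); only the shape of the formula is taken from the entry above.

```javascript
// Hypothetical weights, chosen only to make the arithmetic visible.
const WEIGHTS = Object.freeze({ critical: 10, high: 5, medium: 2, low: 0.5, info: 0 });

// counts maps severity -> number of findings in one area.
function passRate(counts) {
  const severities = Object.keys(counts);
  const penalty = severities.reduce((sum, s) => sum + counts[s] * (WEIGHTS[s] || 0), 0);
  const findingCount = severities.reduce((sum, s) => sum + counts[s], 0);
  // passRate = max(0, 100 - penalty / max(10, findingCount * 4) * 100)
  return Math.max(0, 100 - (penalty / Math.max(10, findingCount * 4)) * 100);
}
```

Under these assumed weights, five lows give a penalty of 2.5 over a budget of 20 (pass rate 87.5), while a single high gives a penalty of 5 over the minimum budget of 10 (pass rate 50). That matches the stated intent that lows barely move a grade and highs do.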
Removed
- F4: TOK hotspot padding loop and take dead-code. Hotspots may now contain fewer than 3 entries for tiny projects (the honest answer); contract still bounds at ≤ 10.
- F5: Pattern D / CA-TOK-004 (sonnet-era signature). Catalogue entry removed from knowledge/opus-4.7-patterns.md and commands/tokens.md. Suppression entries for CA-TOK-004 are now no-ops.
Breaking changes
- F2: MCP token estimates jump from flat 15 to ≥ 500. Token Efficiency grades for projects with MCP servers may shift. whats-active totals report higher numbers. Documented in commands/posture.md next-steps.
- F3: scoreByArea is severity-weighted. Posture JSON consumers reading areas[*].score will see different values for non-clean configs. Use result.scoringVersion === 'v5' to detect the change. Drift comparisons across v4↔v5 baselines may show artificial deltas; re-baseline after upgrade.
- F5: Pattern D / CA-TOK-004 no longer emitted. Existing exact CA-TOK-004 suppression entries are harmless but obsolete.
- N1 suppression backward-compat: CA-TOK-* glob now also matches CA-TOK-005. To preserve prior behavior of suppressing only patterns A/B/C, replace the glob with explicit IDs:
  CA-TOK-001
  CA-TOK-002
  CA-TOK-003
  A one-time runtime warning for this case is a v5.0.1 candidate.
- Posture areas count: 9 → 10 (Plugin Hygiene from COL). Consumers hard-coding 9 must update.
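To see why the CA-TOK-* glob silently picks up CA-TOK-005, here is a minimal matcher sketch. It assumes the ignore syntax treats only * as a wildcard; the function names are hypothetical and the real suppression code may differ.

```javascript
// Convert a glob to an anchored RegExp. Assumption: '*' is the only wildcard.
function globToRegExp(glob) {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
  return new RegExp('^' + escaped.replace(/\*/g, '.*') + '$');
}

// True when any ignore entry (glob or exact ID) matches the finding ID.
function isSuppressed(findingId, ignoreEntries) {
  return ignoreEntries.some((entry) => globToRegExp(entry).test(findingId));
}
```

With the glob, isSuppressed('CA-TOK-005', ['CA-TOK-*']) is true; with the explicit list CA-TOK-001, CA-TOK-002, CA-TOK-003 it is false, so the MCP budget finding stays visible.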
Migration notes
- CA-TOK-* glob suppressions: explicit-ID list recommended if CA-TOK-005 should not be suppressed.
- CA-TOK-004 exact-ID suppression entries: safe to remove.
- Drift baselines created against v4 should be re-saved post-upgrade to avoid artificial F3 weighting deltas.
- Posture JSON consumers must update any hardcoded areas.length === 8 or === 9 assertions to >= 10.
Tests
- 543 → 635 (+92): F1-F7 (alpha rounds = +43), N1-N4 + N6 (beta = +39), M7 + M8 + N5 (rc = +10). 36 test files (12 lib + 23 scanner + 1 hook).
- New fixtures: tok-active-config/, additional-dirs-many/, additional-dirs-ok/, large-cascade/, small-cascade/, skill-bloated/, skill-tight/, mcp-tool-heavy/ (with mocked node_modules/), hooks-verbose/, hooks-quiet/, readme-desynced/, mcp-budget/{14,25,60,120,unknown}-tools/, volatile-mid-section/{volatile-line-60,volatile-line-200}/, denied-tools-in-schema/, collision-plugins/fake-home/ (plugin-a + plugin-b + plugin-c + user-level review skill).
- New test files: tests/scanners/manifest.test.mjs, tests/scanners/cache-prefix.test.mjs, tests/scanners/disabled-in-schema.test.mjs, tests/scanners/collision.test.mjs, tests/scanners/accurate-tokens.test.mjs.
Notes
- mock.method against ESM module exports does not work (Node 18+ ESM read-only export bindings). v5 tests use globalThis.fetch mocking for --accurate-tokens instead, which gives equivalent coverage at the actual external-dependency boundary.
- Plugin-vs-built-in collision detection is intentionally not implemented. Step 22a research spike (docs/v5-namespace-research.md, gitignored) could not verify Claude Code's resolution behavior when a plugin command shares a name with a built-in. Treated as info-only; v5.0.1 candidate.
- README/CLAUDE.md badge reconciliation done in Step 28 (this release). self-audit --check-readme PASSES against the filesystem. Test count counter switched from file-count to test-case count via subprocess node --test parse.
- hotspot.path exposed on file-backed hotspots (Step 30 fix). The rc.1 --accurate-tokens implementation looked up hotspot.path but the scanner only emitted source. File-backed hotspots now carry path (absolute path); MCP-server hotspots leave it unset (they are virtual entries representing runtime tool-schema cost, not file content).
SC-6b release-gate result (verified 2026-05-01)
- PASS: 0.85% under-estimation against real count_tokens API.
- Fixture: tests/fixtures/marketplace-large/. Top-3 hotspots = 1 file-backed (CLAUDE.md) + 2 MCP virtuals. MCP entries skipped per design (no readable content; their tokens are formula-based at 500 + toolCount × 200).
- CLAUDE.md actual: 589 tokens (Anthropic count_tokens, claude-opus-4-7). Estimated: 594 tokens (byte heuristic at 4 bytes/token via estimateTokens). Delta: −5 tokens, −0.85%, well within the ±5% gate.
- No tuning of estimateTokens heuristic required for v5.0.0.
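The gate arithmetic can be reproduced with a short sketch. The helper names, the rounding in the byte heuristic, and the choice of (actual − estimated) / actual as the delta definition are assumptions made for illustration; only the 4 bytes/token heuristic, the 589 and 594 token figures, and the ±5% tolerance come from the notes above.

```javascript
// 4 bytes/token heuristic; rounding up is an assumption.
function estimateTokensFromBytes(byteLength) {
  return Math.ceil(byteLength / 4);
}

// Signed relative delta, assumed here to be (actual - estimated) / actual.
function relativeDelta(actual, estimated) {
  return (actual - estimated) / actual;
}

// Gate check: absolute relative delta must stay within the tolerance (default 5%).
function withinGate(actual, estimated, tolerance = 0.05) {
  return Math.abs(relativeDelta(actual, estimated)) <= tolerance;
}
```

For actual 589 and estimated 594 the delta is −5 tokens, and −5 / 589 rounds to −0.85%, comfortably inside the gate.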
[5.0.0-rc.1] - 2026-05-01
Summary
Release candidate for v5.0.0: knowledge-base cleanup and tokenizer calibration. Three deliverables: M8 (Sonnet-era → Opus 4.7 best-practices rewrite), M7 (cache-telemetry recipe in knowledge/ plus an opt-in CLI flag), and N5 (--accurate-tokens API calibration via Anthropic's count_tokens endpoint).
Added
- N5: --accurate-tokens flag on scanners/token-hotspots-cli.mjs. When ANTHROPIC_API_KEY is set, the CLI calls Anthropic's count_tokens endpoint for the top 3 hotspots and populates output.calibration = { actual_tokens, source: 'count_tokens_api', sampled_hotspots: 3 }. When the key is absent, calibration = { skipped: 'no-api-key' } and a stderr warning is emitted. Designed for the manual SC-6b release-gate verification, not routine use.
- scanners/lib/tokenizer-api.mjs: wrapper around count_tokens with a 5-second AbortController timeout, exponential-backoff retry on HTTP 429 (max 3 retries: 1s, 2s, 4s), and required headers (x-api-key, anthropic-version: 2023-06-01, content-type). API key is masked to ${key.slice(0,8)}... in every error message and every thrown error; non-429 HTTP errors throw status code only, and the response body is never included (it may echo the key on auth failures). maskKey() is exported for callers that need safe logging.
- M7: knowledge/cache-telemetry-recipe.md (new). Manual jq recipe for verifying prompt-cache hit rate from Claude Code session transcripts (~/.claude/projects/<slug>/*.jsonl). Sums cache_read_input_tokens and cache_creation_input_tokens per turn and reports a hit-rate ratio. Recipe-form (not bundled scanner) keeps the project's "no transcript-parsing as core feature" non-goal intact while giving users a runtime escape hatch.
- M7: --with-telemetry-recipe flag on the same CLI. When passed, emits telemetry_recipe_path in the JSON output pointing to the recipe file. Without the flag, output is unchanged. Committed as a default deliverable, opt-in at invocation time.
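A wrapper with the behavior described above (5-second timeout, three retries on 429 at 1s/2s/4s, masked key, status-only errors) might look roughly like the sketch below. It is not the contents of tokenizer-api.mjs: the countTokens and backoffDelays names, the injectable fetchImpl and sleep parameters, and the request body shape are assumptions made so the example stays testable.

```javascript
// Show only the first 8 characters of the key, as the changelog describes.
function maskKey(key) {
  return `${String(key).slice(0, 8)}...`;
}

// Exponential backoff schedule: 1s, 2s, 4s for the default three retries.
function backoffDelays(maxRetries = 3, baseMs = 1000) {
  return Array.from({ length: maxRetries }, (_, i) => baseMs * 2 ** i);
}

// Hypothetical wrapper; fetchImpl and sleep are injectable for tests.
async function countTokens({ apiKey, model, text, fetchImpl = globalThis.fetch, sleep }) {
  const wait = sleep || ((ms) => new Promise((resolve) => setTimeout(resolve, ms)));
  const delays = backoffDelays();
  for (let attempt = 0; ; attempt++) {
    const controller = new AbortController();
    const timer = setTimeout(() => controller.abort(), 5000); // 5-second timeout
    let res;
    try {
      res = await fetchImpl('https://api.anthropic.com/v1/messages/count_tokens', {
        method: 'POST',
        signal: controller.signal,
        headers: {
          'x-api-key': apiKey,
          'anthropic-version': '2023-06-01',
          'content-type': 'application/json',
        },
        body: JSON.stringify({ model, messages: [{ role: 'user', content: text }] }),
      });
    } finally {
      clearTimeout(timer);
    }
    if (res.ok) return (await res.json()).input_tokens;
    if (res.status === 429 && attempt < delays.length) {
      await wait(delays[attempt]); // retry after 1s, then 2s, then 4s
      continue;
    }
    // Status code only: the body is never read, since it could echo the key.
    throw new Error(`count_tokens failed with HTTP ${res.status} (key ${maskKey(apiKey)})`);
  }
}
```

Passing fetchImpl explicitly mirrors the testing note in this release: mocking happens at the fetch boundary, so no module export needs to be rebound.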
Changed
- M8 (knowledge-base cleanup): replaced the "Keep CLAUDE.md under 200 lines" rule in knowledge/configuration-best-practices.md with cache-stability guidance (first 30 lines stable, volatile content below the cache threshold). Added a footnote that the 200-line rule was a Sonnet-era adherence heuristic; Opus 4.7 uses prompt-cache structure as the dominant cost lever. Cross-references knowledge/opus-4.7-patterns.md.
- commands/tokens.md next-steps: documents --with-telemetry-recipe as the cache-verification path after a structural fix.
Tests
- 625 → 635 (+10): --with-telemetry-recipe (×2), tokenizer-api unit tests (×6: masking, body-leak protection, AbortController signal, 429 retry, header set, fetch mock happy path), --accurate-tokens no-key subprocess test (×1), absent-flag negative test (×1).
- New file: tests/scanners/accurate-tokens.test.mjs. No new fixtures (re-uses marketplace-large).
Notes
- SC-6b release gate is NOT closed by these commits. Step 26's tests use mocked globalThis.fetch to verify the integration contract; ±5% accuracy against real count_tokens requires a live API key and must be verified manually before tagging v5.0.0 in Session 5.
- The plan's specified mock.method(tokenizerApi, 'callCountTokensApi', ...) pattern collides with ESM read-only export bindings in Node 18+. Tests mock at the globalThis.fetch boundary instead: equivalent coverage, no module-export rebinding required.
- README/CLAUDE.md badge counts and plugin.json version still target v4.0.0; Step 28+29 will sync those during the release wrap.
- [skip-docs] tag on the N5 feat commit; M7 and M8 are docs(...) commits and don't need it.
[5.0.0-beta.1] - 2026-05-01
Summary
First v5.0.0 beta — new scanners. Five new finding sources land: MCP tool-schema budget (CA-TOK-005), system-prompt manifest CLI/command (/config-audit manifest), cache-prefix stability (CPS), disabled-tools-still-in-schema (DIS), and cross-plugin/user-vs-plugin skill collision (COL/CA-COL-001). Plugin Hygiene becomes a 10th area-scorecard column.
Added
- N1: CA-TOK-005 MCP tool-schema budget: per-server tiered finding inside the TOK scanner. Thresholds: < 20 no finding, 20–49 low, 50–99 medium, 100+ high; null (manifest unparseable) low + "tool count unknown" message. Scoped to project-local .mcp.json to keep /config-audit <path> actionable. Recommendation links to the Step 25 cache-telemetry recipe.
- N2: /config-audit manifest: new slash command + scanners/manifest.mjs CLI. Renders a single ranked table of every token source (CLAUDE.md cascade, plugins, skills, MCP servers, hooks) sorted DESC by estimated_tokens. Reuses readActiveConfig; CLAUDE.md per-file tokens are distributed proportional to bytes.
- N3: CPS scanner (CA-CPS-NNN): Cache-Prefix Stability Analyzer. Walks the CLAUDE.md cascade and flags volatile content between lines 31 and 150, beyond TOK Pattern A's top-30 territory. Volatile-pattern set extends Pattern A with shell-exec lines (! prefix) and ${VAR} substitutions. Severity medium per finding. Skips lines 1–30 (Pattern A's range).
- N4: DIS scanner (CA-DIS-NNN): Disabled-In-Schema Detector. Detects tools that appear in BOTH permissions.deny and permissions.allow within the same settings.json. The deny list wins, so allow entries are dead config but still load every turn. Tool identity is the bare name (everything before the opening parenthesis); Bash(npm:*) and Bash are treated as the same tool. Severity low.
- N6: COL scanner (CA-COL-001): Cross-Plugin Skill Collision detector. Plugin-vs-plugin same skill name → low. User-vs-plugin same skill name → medium. Findings carry details.namespaces array with {source, name, path} for every conflicting source.
- details field on findings: output.mjs:finding() helper now passes through optional details for scanner-specific structured payloads (used by COL).
- "Plugin Hygiene" area (10th in scorecard): COL contributes here. Posture JSON now reports 10 areas instead of 9.
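The DIS identity rule (compare tools by bare name, i.e. everything before the opening parenthesis) is easy to sketch. The function names below are hypothetical and the real scanner may report more detail per finding.

```javascript
// Bare tool name: everything before the first '(' (or the whole entry if none).
function bareToolName(entry) {
  const idx = entry.indexOf('(');
  return (idx === -1 ? entry : entry.slice(0, idx)).trim();
}

// Return each tool name that appears in both permissions.deny and permissions.allow.
function findDeniedButAllowed(permissions) {
  const denied = new Set((permissions.deny || []).map(bareToolName));
  const seen = new Set();
  const overlaps = [];
  for (const entry of permissions.allow || []) {
    const name = bareToolName(entry);
    if (denied.has(name) && !seen.has(name)) {
      seen.add(name); // report each tool once, even with several allow entries
      overlaps.push(name);
    }
  }
  return overlaps;
}
```

For deny: ['Bash'] and allow: ['Bash(npm:*)', 'Read', 'Bash(git:*)'], the sketch returns ['Bash']: both allow entries collapse to the same denied tool.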
Changed
- scoreByArea deduplicates by area name: when multiple scanners share an area (TOK + CPS → "Token Efficiency", SET + DIS → "Settings"), they produce one combined row with merged finding counts. Existing 9-area contract preserved for non-Plugin-Hygiene areas.
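A minimal sketch of merging scanner rows by area name follows. The row shape ({ scanner, area, findingCount }) and the mergeByArea name are assumptions; the actual scoreByArea output carries more fields.

```javascript
// Merge per-scanner rows into one row per area, summing finding counts.
// Map preserves first-seen order, so the area order stays stable.
function mergeByArea(scannerRows) {
  const merged = new Map();
  for (const row of scannerRows) {
    const current = merged.get(row.area) || { area: row.area, scanners: [], findingCount: 0 };
    current.scanners.push(row.scanner);
    current.findingCount += row.findingCount;
    merged.set(row.area, current);
  }
  return [...merged.values()];
}
```

Feeding in TOK and CPS rows that both name "Token Efficiency" yields a single row listing both scanners with their counts added together.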
Known breaking changes
- Suppression backward-compat: CA-TOK-* glob now also matches CA-TOK-005. Existing .config-audit-ignore entries that suppress TOK findings via the CA-TOK-* glob will silently include CA-TOK-005 (MCP budget). To preserve the prior behavior of suppressing only patterns A/B/C, replace the glob with explicit IDs:
  CA-TOK-001
  CA-TOK-002
  CA-TOK-003
  A one-time runtime warning for this case is out of scope for v5.0.0; it is a candidate for v5.0.1.
- Plugin-vs-built-in collision is intentionally not implemented. The Step 22a research spike could not verify Claude Code's resolution behavior when a plugin command shares a name with a built-in (/help, /clear, /init, /review, /config, /cost, /security-review). Treated as info-only in this release; a follow-up v5.0.1 ticket may add an opt-in check.
Tests
- 586 → 625 (+39): N1 (×7), N2 (×11), N3 (×7), N4 (×6), N6 (×8).
- New fixtures: mcp-budget/{14,25,60,120,unknown}-tools/, volatile-mid-section/{volatile-line-60,volatile-line-200}/, denied-tools-in-schema/, collision-plugins/fake-home/ (plugin-a + plugin-b + plugin-c + user-level review skill).
Notes
- [skip-docs] tag used on every feat commit: README/CLAUDE.md badge counts (scanner count, command count, test count) and the architecture sections are intentionally fenced off until Session 5 (Step 28). This keeps the v5 plan's session boundaries clean even when the Forgejo pre-commit-docs-gate hook would otherwise block these commits.
[5.0.0-alpha.2] - 2026-05-01
Summary
Second v5.0.0 alpha — structural gaps + README self-audit. TOK pattern severities recalibrated for tokens/turn impact (F7), three new findings cover settings/skills/cascade structure (M2, M4, M6), MCP tool-count detection wired (M1), HKV gains a verbose-output check (M5), and self-audit grows a --check-readme flag (F6).
Added
- F7: TOK severity recalibration: Pattern A (cache-breaking volatile top) medium → high, Pattern B (redundant permissions) low → medium, Pattern C (deep imports) medium → low. Each finding now carries a calibration_note evidence field documenting the heuristic basis.
- M6: additionalDirectories settings key: added to KNOWN_KEYS so it no longer trips "unknown settings key". New low-severity finding when additionalDirectories.length > 2.
- M4: TOK Pattern E: medium-severity finding when activeConfig.claudeMd.estimatedTokens > 10_000, which flags cascades that bleed budget every turn.
- M2: TOK Pattern F: low-severity finding for project-local SKILL.md whose frontmatter description exceeds 500 characters (description loads on every turn even when the body does not). Scoped to discovery.files; user/plugin skills out of project scope are not flagged.
- M1: MCP tool-count detection: readActiveMcpServers now resolves tool count via cache → node_modules/<pkg>/package.json → {toolCount: null, toolCountUnknown: true} fallback. Tool count drives estimateTokens per server.
- M5: HKV verbose hook output: new low-severity finding when a referenced hook script contains > 50 console.log/process.stdout.write lines (static heuristic, no execution).
- F6: self-audit --check-readme flag: filesystem counts (scanners, commands, agents, hooks, tests, knowledge) compared against README badge values. Helper export: checkReadmeBadges(pluginDir).
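The M5 heuristic is purely static, so it can be approximated by counting source lines. The sketch below is an assumption about how such a count might work (it also counts matches inside comments); the function names are hypothetical.

```javascript
// Count lines that call console.log or process.stdout.write (no execution).
function countOutputLines(scriptSource) {
  return scriptSource
    .split(/\r?\n/)
    .filter((line) => /console\.log\s*\(|process\.stdout\.write\s*\(/.test(line)).length;
}

// Verbose when the count is strictly greater than the threshold (default 50).
function isVerboseHook(scriptSource, threshold = 50) {
  return countOutputLines(scriptSource) > threshold;
}
```

A script with exactly 50 output lines stays under the bar; the 51st tips it into a low-severity finding.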
Changed
- TOK severities (F7): see Added. Posture aggregates that depended on Pattern A being medium will now reflect the higher-impact rating.
- .gitignore: added unignore rules so tests/fixtures/**/node_modules/ are tracked. Required by the mcp-tool-heavy fixture.
Tests
- 563 → 586 (+23): F7 table-driven (×6), M6 (×3), M4 (×2), M2 (×2), M1 (×4), M5 (×2), F6 (×4).
- New fixtures: additional-dirs-many/, additional-dirs-ok/, large-cascade/, small-cascade/, skill-bloated/, skill-tight/, mcp-tool-heavy/ (with mocked node_modules/), hooks-verbose/, hooks-quiet/, readme-desynced/.
Notes
- result.readmeCheck.passed === true is not required during alpha/beta phases. The real plugin's own check is currently red (scanners 10 vs README 9, tests 31 vs README 543); reconciliation deferred to Session 5 Step 28 (README sync).
- [skip-docs] tag used on every commit: README/CLAUDE.md badge counts and architecture text are intentionally fenced off until Session 5.
[5.0.0-alpha.1] - 2026-05-01
Summary
First v5.0.0 alpha — token-economy round, F1-F5. The TOK scanner now consumes readActiveConfig (per-MCP-server hotspots, claudeMd cascade tokens), severity weighting replaces flat finding counts in scoreByArea, and MCP servers no longer estimate at a flat 15 tokens. Pattern D (CA-TOK-004 sonnet-era signature) removed — too noisy, not actionable.
Added
- 'mcp' kind for estimateTokens (F2): an active MCP server now estimates ≥ 500 tokens (base protocol + schema overhead) instead of the v4 flat 15. Optional {toolCount} raises the estimate to 500 + toolCount * 200 once Step 14 wires tool-count detection.
- TOK ↔ readActiveConfig integration (F1): the TOK scanner emits one hotspot per active MCP server, sums their tokens into total_estimated_tokens, and exposes result.activeConfig (claudeMd cascade tokens, mcpServerCount, pluginCount, skillCount).
- scoringVersion: 'v5' field on scoreByArea output for cross-version drift detection.
- WEIGHTS named export from scanners/lib/severity.mjs (Object.freeze).
Changed
- BREAKING (intentional, F3): scoreByArea is now severity-weighted. Penalty = Σ count[s] * WEIGHTS[s]; passRate = max(0, 100 - penalty / max(10, findingCount * 4) * 100). Lows no longer crater an area's grade; a single high or critical consumes a large fraction of budget. baseline-all-a fixture remains all-A (no critical/high on that fixture).
- BREAKING (intentional, F2): MCP server token estimates jump from a flat 15 to ≥ 500. whats-active totals and TOK hotspots will report higher numbers for any project with active MCP servers.
- BREAKING (intentional, F5): Pattern D / CA-TOK-004 (sonnet-era signature) is no longer emitted. Suppression entries for CA-TOK-004 are now no-ops; downstream tools that filter on the ID should drop it. The catalogue entry was removed from knowledge/opus-4.7-patterns.md and commands/tokens.md.
- Hotspots contract (F4): the v4 padding loop and take dead-code are gone. Hotspots may now contain fewer than 3 entries for tiny projects (the honest answer); contract still bounds at ≤ 10.
Migration notes
- CA-TOK-* glob suppression entries continue to suppress 001-003. Existing exact CA-TOK-004 entries are harmless but obsolete; remove them at convenience.
- Posture/JSON consumers reading areas[*].score will see different values for non-clean configs. Use result.scoringVersion === 'v5' to detect.
Tests
- 543 → 563 across the alpha.1 commits (+9 severity-weighting/scoring, +4 estimateTokens 'mcp', +1 MCP caller migration, +3 readActiveConfig integration, +2 hotspots-uniqueness, +2 sonnet-era zero-finding).
- New fixture tests/fixtures/tok-active-config/: minimal repo with .mcp.json (2 servers), CLAUDE.md, plugin skeleton.
[4.0.0] - 2026-04-19
Summary
Opus 4.7 era upgrade. New TOK scanner detects token-efficiency anti-patterns (cache-breaking volatile content, redundant tool permissions, deep import chains, sonnet-era minimal setups). Token Efficiency joins the quality scorecard as the 8th area. Scanner-agent and verifier-agent migrate from haiku → sonnet per global no-haiku policy.
Added
- token-hotspots.mjs scanner (CA-TOK-001..004): 4 patterns aligned with Opus 4.7 token-cost dynamics:
  - CA-TOK-001 cache-breaking volatile content (timestamps/UUIDs in top 30 lines of CLAUDE.md)
  - CA-TOK-002 redundant tool permissions (duplicate or subset overlaps)
  - CA-TOK-003 deep @import chains (>2 hops on the load path)
  - CA-TOK-004 sonnet-era minimal setup (no skills/MCP/hooks/managed/plugins)
- /config-audit tokens [path] [--global]: ranked hotspot table + per-pattern findings.
- scanners/token-hotspots-cli.mjs: standalone CLI emitting total_estimated_tokens, hotspots, and per-finding output.
- Token Efficiency as the 8th quality area in the posture scorecard (now 9 scanners total: CML/SET/HKV/RUL/MCP/IMP/CNF/GAP/TOK).
- id field on every area in the scorecard payload (token_efficiency, instruction_clarity, etc.) for stable downstream lookup.
- 13 new TOK scanner tests + 3 CLI tests + posture grade-stability test for token_efficiency.
- Knowledge refresh: knowledge/opus-4.7-patterns.md, plus 2026-04 deltas (v2.1.83–v2.1.111) added to feature-evolution.md, claude-code-capabilities.md, and hook-events-reference.md from research/03-claude-code-changes-config-surfaces.md.
Changed
- BREAKING (additive surface): Quality areas count 7 → 8. Posture JSON consumers that hard-coded 7 areas must update.
- BREAKING (model migration): scanner-agent and verifier-agent migrated haiku → sonnet. Latency and cost trade-offs accepted; deterministic scanner CLIs preferred over agent invocations.
- Scanner count: 8 → 9 (TOK added).
- Command count: 16 → 17 (/config-audit tokens added).
- Version bump: 3.1.0 → 4.0.0.
[3.1.0] - 2026-04-14
Summary
New read-only command /config-audit whats-active — shows exactly what Claude Code loads for a given repo, with token estimates.
Added
- /config-audit whats-active [path]: inventory of active plugins, skills, MCP servers, hooks, and CLAUDE.md cascade for a repo, with source attribution (user/project/plugin) and rough token estimates. Read-only, <2s.
- scanners/lib/active-config-reader.mjs: pure async helper exposing readActiveConfig(), detectGitRoot(), walkClaudeMdCascade(), readClaudeJsonProjectSlice() (longest-prefix matching), enumeratePlugins(), enumerateSkills(), readActiveHooks(), readActiveMcpServers(), estimateTokens().
- scanners/whats-active.mjs: thin CLI shim supporting --json, --output-file, --verbose, --suggest-disables.
- Optional --suggest-disables flag surfaces deterministic disable candidates (disabled MCP servers, zero-item plugins, unreferenced plugins, orphan skills) and invites an LLM judgment pass in the command.
- 36 new tests in tests/lib/active-config-reader.test.mjs, plus a rich-repo tmpdir fixture helper.
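The longest-prefix matching mentioned for readClaudeJsonProjectSlice() can be sketched as follows. This is not the plugin's implementation: the longestPrefixKey helper, the shape of the projects object (keyed by absolute path, no trailing slash), and the POSIX-style separators are all assumptions made for illustration.

```javascript
// Hypothetical helper: pick the most specific project key that contains targetPath.
// Assumes keys are absolute POSIX-style paths without a trailing slash.
function longestPrefixKey(projects, targetPath) {
  let best = null;
  for (const key of Object.keys(projects)) {
    // Match on whole path segments so "/home/u/repo" does not claim "/home/u/repository".
    const matches = targetPath === key || targetPath.startsWith(key + "/");
    if (matches && (best === null || key.length > best.length)) {
      best = key;
    }
  }
  return best;
}

const projects = {
  "/home/u": {},
  "/home/u/repo": {},
  "/home/u/repo/packages/app": {},
};

console.log(longestPrefixKey(projects, "/home/u/repo/packages/app/src"));
```

Comparing on key + "/" rather than on the bare key is what keeps sibling directories with a shared name prefix from matching each other.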
Changed
- Version bump: 3.0.1 → 3.1.0 (minor, additive feature, no breaking changes).
- Command count: 15 → 16.
[3.0.1] - 2026-04-04
Summary
Cross-platform fix — scanners, hooks, and lib now work correctly on Windows.
Fixed
- file-discovery.mjs: depth calculation, agent/command/plugin path matching now use path.sep
- scan-orchestrator.mjs: fixture-path filtering now uses path.sep
- post-edit-verify.mjs: rules-dir regex handles both / and \ separators
- auto-backup-config.mjs: rules-dir detection now uses path.sep
- import-resolver.mjs: circular import display uses basename(), /tmp fallback replaced with os.tmpdir()
- string-utils.mjs: normalizePath trailing separator regex handles both / and \
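For readers unfamiliar with this class of bug, here is a minimal sketch of separator-agnostic matching in the spirit of the fixes above. The pattern and both helper names are hypothetical; the actual regexes in post-edit-verify.mjs and string-utils.mjs are not shown in this changelog and may differ.

```javascript
// Hypothetical pattern: matches a .claude/rules/ directory with either separator.
const RULES_DIR_PATTERN = /[\\/]\.claude[\\/]rules[\\/]/;

function isRulesPath(filePath) {
  return RULES_DIR_PATTERN.test(filePath);
}

// Hypothetical helper: strip any run of trailing "/" or "\" characters.
function stripTrailingSeparators(filePath) {
  return filePath.replace(/[\\/]+$/, "");
}

console.log(isRulesPath("C:\\repo\\.claude\\rules\\style.md"));
console.log(stripTrailingSeparators("C:\\repo\\"));
```

A character class such as [\\/] accepts both separators in one pattern, which avoids building the regex from path.sep at runtime.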
Added
- 4 cross-platform path tests (total 486 tests)
[3.0.0] - 2026-04-04
Summary
Health redesign — configuration health is now quality-only. Feature utilization removed from grades entirely.
Changed
- Health = quality only. 7 deterministic scanners (CML, SET, HKV, RUL, MCP, IMP, CNF) determine your grade. Feature Coverage is no longer a graded area.
- Feature recommendations are opt-in. Unused features shown as "opportunities" via /config-audit feature-gap, grouped by impact (high/medium/explore), backed by Anthropic docs. No more "Feature Coverage: F" for correct minimal setups.
- Posture output redesigned. Shows Health: {grade} ({score}/100) with 7 quality areas. Removed utilization %, maturity level, segment label.
- Feature-gap is interactive. Users select recommendations to implement directly, with no manual file editing required. Backup created automatically.
- avgScore bug fixed. Grade letter and displayed score now computed from the same population (quality areas only).
Added
- generateHealthScorecard() in scoring.mjs: quality-only scorecard
- opportunitySummary() in feature-gap-scanner.mjs: groups findings by impact tier
- opportunityCount field in posture JSON output
- "Official Configuration Guidance" section in knowledge base (Anthropic docs, proven impacts)
- 21 new tests (total 482 across 27 test files)
Removed
- S2-PROMPT.md and V2-ANNOUNCEMENT.md: v2 development artifacts
- Utilization %, maturity level, segment label from posture terminal output and reports
- Feature Coverage row from area breakdown tables
- "Top Actions" sourced from GAP findings (replaced by opportunities pointer)
Backward Compatibility
- JSON output preserves all legacy fields (utilization, maturity, segment) for programmatic consumers
- Drift baselines unaffected — GAP findings still present in envelopes
- All existing exports maintained (calculateUtilization, determineMaturityLevel, etc.)
[2.2.0] - 2026-04-04
Summary
UX quality fix — fixture filtering, session path migration, output polish.
Added
- Automatic test-fixture filtering in scan-orchestrator: findings from tests/, examples/, __tests__/ excluded from grades, stored in env.fixture_findings
- --include-fixtures CLI flag for scan-orchestrator and posture to override filtering
- scan-orchestrator.test.mjs: 20 new tests for fixture filtering and isFixturePath
- Legacy session path detection in cleanup command
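The fixture filtering described above might look roughly like the sketch below. It is an illustration only: looksLikeFixturePath and splitFixtureFindings are hypothetical names, the file field on a finding is assumed, and the real isFixturePath may apply different rules.

```javascript
// Directory names taken from the changelog entry; the matching rule is an assumption.
const FIXTURE_DIRS = new Set(["tests", "examples", "__tests__"]);

// Treat a path as a fixture if any of its segments is a known fixture directory.
function looksLikeFixturePath(filePath) {
  return filePath.split(/[\\/]/).some((segment) => FIXTURE_DIRS.has(segment));
}

// Separate graded findings from fixture findings; includeFixtures mirrors the
// intent of the --include-fixtures flag by disabling the filter.
function splitFixtureFindings(findings, includeFixtures = false) {
  const graded = [];
  const fixtureFindings = [];
  for (const finding of findings) {
    if (!includeFixtures && looksLikeFixturePath(finding.file)) {
      fixtureFindings.push(finding);
    } else {
      graded.push(finding);
    }
  }
  return { graded, fixtureFindings };
}

const sample = [
  { id: "CA-SET-001", file: "tests/fixtures/broken/settings.json" },
  { id: "CA-CML-001", file: "CLAUDE.md" },
];
console.log(splitFixtureFindings(sample).graded.map((f) => f.id));
```

Keeping the filtered findings in a separate list, rather than discarding them, matches the entry's note that they are stored instead of dropped.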
Changed
- Session storage moved from ~/.config-audit/ to ~/.claude/config-audit/ (pathguard compatible)
- Self-audit grade: F → A (98) after fixture filtering
- Combined scanner + posture into single Bash call in default audit command
- Removed "F grade is misleading" disclaimer — grades are now accurate
- All CLI banners and envelope metadata updated to v2.2.0
- 461 tests (up from 441), 27 test files (up from 26)
Removed
- Manual fixture counting instruction in config-audit.md (orchestrator handles it)
- Redundant isFixtureOrExample filter in self-audit.mjs (promoted to orchestrator)
[2.1.0] - 2026-04-03
Summary
UX redesign — auto-scope detection, zero questions, simplified command surface.
Changed
- /config-audit now runs full audit automatically (auto-detects scope from git context)
- Removed mode selection prompts; scope override via /config-audit full|repo|home|current
- Simplified from 17 to 15 commands (removed quick, report, watch; added help)
- All CLI banners and envelope metadata updated to v2.1.0
Added
- /config-audit help command with categorized command reference
- Auto-scope detection from git context (repo vs home vs full-machine)
Removed
- /config-audit:quick (merged into default /config-audit)
- /config-audit:report (merged into analyze output)
- /config-audit:watch (use /config-audit drift instead)
[2.0.0] - 2026-04-03 (v2.0 Complete)
Summary
Complete rewrite from LLM-only prototype to deterministic scanner-backed configuration intelligence. 7 development sessions (S1-S7), ~15,000 lines of code, 408+ tests.
Highlights
- 8 deterministic scanners (CML, SET, HKV, RUL, MCP, IMP, CNF, GAP) + PLH standalone
- Feature gap analysis with 25 dimensions across 4 tiers
- Auto-fix engine with 9 fix types + backup/rollback
- Drift detection with baseline comparison
- Suppression engine (.config-audit-ignore)
- Self-audit CLI
- 17 commands, 6 agents, 4 hooks
- 408+ tests (zero external dependencies)
Added (S7)
- Example projects: examples/minimal-setup/ and examples/optimal-setup/
- Demo script: examples/run-demo.sh
- .config-audit-ignore for self-audit suppressions
- V2-ANNOUNCEMENT.md
- DEPRECATED.md for capability-auditor skill
Fixed (S7)
- hooks.json: SessionStart and Stop timeout 5ms → 5000ms
- self-audit.mjs: Suppression now enabled (was hardcoded to suppress: false)
Changed (S7)
- README.md: Complete rewrite for public release
- CLAUDE.md: Added Suppressions section
- .gitignore: Added node_modules/ and S*-PROMPT.md
[1.6.0] - 2026-04-03 (v2.0 S6: Unified Reports + Self-Audit + Suppressions)
Added
- Report generator (scanners/lib/report-generator.mjs): unified markdown reports via generatePostureReport(), generateDriftReport(), generatePluginHealthReport(), generateFullReport()
- Suppression engine (scanners/lib/suppression.mjs): .config-audit-ignore file support with exact IDs and glob patterns (CA-SET-*), audit trail via suppressed_findings in envelope
- Self-audit CLI (scanners/self-audit.mjs): runs all scanners + plugin health on this plugin: node self-audit.mjs [--json] [--fix], exit codes 0/1/2
- PostToolUse hook (post-edit-verify.mjs): verifies config files after Edit/Write, blocks if new critical/high findings introduced
- New command: /config-audit:report - generate unified report (posture + optional drift/plugin-health)
- Test fixture .config-audit-ignore in fixable-project
- 54 new tests (total 408 across 25 test files)
Changed
- scan-orchestrator.mjs: suppression integration; applies .config-audit-ignore after all scanners run, --no-suppress flag to disable
- hooks.json: added PostToolUse event with post-edit-verify
[1.5.0] - 2026-04-03 (v2.0 S5: Drift + Watch + Plugin Health)
Added
- Diff engine (scanners/lib/diff-engine.mjs): diffEnvelopes() comparing baseline vs current, formatDiffReport() for terminal output
- Baseline manager (scanners/lib/baseline.mjs): save/load/list/delete named baselines in ~/.claude/config-audit/baselines/
- Drift CLI (scanners/drift-cli.mjs): standalone: node drift-cli.mjs <path> [--save] [--baseline name] [--json] [--list]
- Plugin health scanner (scanners/plugin-health-scanner.mjs, PLH): validates plugin structure, frontmatter, cross-plugin conflicts (runs independently, not in scan-orchestrator)
- 3 new commands:
  - /config-audit:drift - compare current config against saved baseline
  - /config-audit:watch - on-demand drift check with baseline monitoring
  - /config-audit:plugin-health - audit plugin structure and cross-plugin coherence
- Test fixtures test-plugin/ (valid) and broken-plugin/ (invalid) for plugin health tests
- 48 new tests (total 354 across 21 test files)
[1.4.0] - 2026-04-03 (v2.0 S4: Fix + Rollback Action Pillar)
Added
- Fix engine (scanners/fix-engine.mjs): deterministic auto-fix for 9 fix types:
  - json-key-add (missing $schema)
  - json-key-remove (deprecated keys)
  - json-key-type-fix (type mismatches, invalid effortLevel)
  - json-restructure (hooks array→object, matcher object→string)
  - frontmatter-rename (globs→paths)
  - file-rename (non-.md→.md)
- Rollback engine (scanners/rollback-engine.mjs): listBackups(), restoreBackup(), deleteBackup() with checksum verification
- Fix CLI (scanners/fix-cli.mjs): standalone: node fix-cli.mjs <path> [--apply] [--json] [--global], dry-run by default
- Backup lib (scanners/lib/backup.mjs): shared backup module with checksums and manifests
- 2 new commands:
  - /config-audit:fix - scan, plan, backup, apply, verify in one flow
  - /config-audit:rollback - list or restore from backups
- PreToolUse hook (auto-backup-config.mjs): auto-backup config files before Edit/Write
- Test fixture fixable-project/: fixture with all 9 fixable issue types
- 38 new tests (total 306 across 17 test files)
Changed
- file-discovery.mjs: walkRulesDir now discovers all files (not just .md) for non-.md validation
- backup-before-change.mjs: refactored to use shared lib/backup.mjs (no logic duplication)
- hooks.json: added PreToolUse event with auto-backup
[1.3.0] - 2026-04-03 (v2.0 S3: Posture + Feature Gap Commands)
Added
- Scoring module (scanners/lib/scoring.mjs): utilization, maturity (5 levels), segments, area scoring, scorecard generation
- Posture CLI (scanners/posture.mjs): standalone Node.js tool: node posture.mjs <path> [--json] [--global]
- 2 new commands:
  - /config-audit:posture - quick scorecard with A-F grades, utilization %, maturity level
  - /config-audit:feature-gap - deep gap analysis with prioritized next-best-actions
- feature-gap-agent: Opus agent for deep analysis, report generation (max 200 lines)
- Knowledge file gap-closure-templates.md: 11 templates with effort/gain estimates
- HTML report template (templates/feature-gap-report.html): visual report with progress bars, grade badges
- 64 new tests (total 268 across 14 test files)
Changed
- Tier weighting: T1 gaps count 3x, T2 count 2x, T3/T4 count 1x in utilization score
- Maturity is threshold-based: highest level where ALL requirements are met
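The two rules above can be read as the following sketch. The weights (3/2/1/1) and the "highest level where all requirements are met" rule come from this changelog; everything else, including the function names, the present and requirements fields, the requirement labels, and the choice to express utilization as a weighted fraction, is an assumption and may not match scoring.mjs.

```javascript
// Tier weights as stated in the changelog entry.
const TIER_WEIGHT = { T1: 3, T2: 2, T3: 1, T4: 1 };

// Assumed formula: weighted share of features in use, between 0 and 1.
function weightedUtilization(features) {
  let total = 0;
  let used = 0;
  for (const feature of features) {
    const weight = TIER_WEIGHT[feature.tier];
    total += weight;
    if (feature.present) used += weight;
  }
  return total === 0 ? 0 : used / total;
}

// Threshold rule: return the highest level whose requirements are all met,
// or 0 when no level qualifies. `levels` is assumed sorted from lowest to highest.
function maturityLevel(levels, met) {
  let result = 0;
  for (const { level, requirements } of levels) {
    if (requirements.every((req) => met.has(req))) result = level;
  }
  return result;
}

const features = [
  { tier: "T1", present: true },
  { tier: "T2", present: false },
  { tier: "T3", present: true },
];
const levels = [
  { level: 1, requirements: ["claude_md"] },
  { level: 2, requirements: ["claude_md", "hooks"] },
  { level: 3, requirements: ["claude_md", "hooks", "mcp"] },
];
console.log(weightedUtilization(features), maturityLevel(levels, new Set(["claude_md", "hooks"])));
```

Under these assumptions a missing T1 feature costs three times as much utilization as a missing T3 feature, and one unmet requirement is enough to hold a setup below a level.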
[1.2.0] - 2026-04-03 (v2.0 S2: Advanced Scanners + Knowledge Base)
Added
- 4 advanced scanners (zero external deps):
  - mcp-config-validator.mjs (MCP): server types, trust levels, env vars, unknown fields
  - import-resolver.mjs (IMP): broken @imports, circular refs, deep chains, tilde paths
  - conflict-detector.mjs (CNF): settings conflicts, permission contradictions, hook duplicates
  - feature-gap-scanner.mjs (GAP): 25 feature gaps across 4 tiers (Foundation/Depth/Advanced/Enterprise)
- Knowledge base — 5 reference documents: capabilities, best practices, anti-patterns, hook events, feature evolution
- New test fixtures: .mcp.json files, @import chains, conflict-project/ fixture
- 75 new tests (total 204 across 12 test files)
Changed
- Scan orchestrator runs 8 scanners (was 4)
- Analyzer agent cross-references scanner findings with knowledge base
[1.1.0] - 2026-04-03 (v2.0 S1: Scanner Foundation)
Added
- Deterministic scanner infrastructure: 4 Node.js scanners (zero external deps):
  - claude-md-linter.mjs (CML): CLAUDE.md structure, length, sections, @imports, duplicates
  - settings-validator.mjs (SET): settings.json schema, unknown/deprecated keys, type checks
  - hook-validator.mjs (HKV): hooks.json format, script existence, event validity, timeouts
  - rules-validator.mjs (RUL): .claude/rules/ glob matching, orphan detection, deprecated fields
- Scanner lib — 5 shared modules: severity, output, file-discovery, yaml-parser, string-utils
- Scan orchestrator: scan-orchestrator.mjs runs all scanners, outputs JSON envelope
- Test infrastructure: 129 tests across 8 test files using node:test (zero deps)
- Test fixtures — 4 fixture projects (healthy, broken, empty, minimal)
- Finding ID format: CA-{SCANNER}-{NNN} (e.g. CA-CML-001)
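A small parser for the finding ID format might look like this. It assumes a three-letter uppercase scanner code and a three-digit number, which matches every ID shown in this changelog but is not stated as a rule; parseFindingId is a hypothetical helper, not part of the plugin.

```javascript
// Assumed grammar: "CA-" + three uppercase letters + "-" + three digits.
const FINDING_ID = /^CA-([A-Z]{3})-(\d{3})$/;

// Return the scanner code and numeric part, or null if the ID does not fit.
function parseFindingId(id) {
  const match = FINDING_ID.exec(id);
  if (!match) return null;
  return { scanner: match[1], number: Number(match[2]) };
}

console.log(parseFindingId("CA-CML-001"));
```

Returning null for malformed input, instead of throwing, lets a caller skip unknown IDs while iterating over a mixed list.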
Fixed
- Agent model mismatches: scanner→haiku, analyzer→sonnet, planner→opus, implementer→sonnet, verifier→haiku
Changed
- CLAUDE.md rewritten in English for public release readiness
[1.0.0] - 2026-02-11
Added
- Cross-platform support (macOS, Linux, Windows)
Fixed
- stop-session-reminder.mjs: Use path.basename/path.dirname instead of hardcoded / split
- backup-before-change.mjs: Handle both / and \ path separators in safe filename generation
Removed
- "Windows: hooks are 100% bash" from known gaps (was incorrect — all hooks are Node.js)
[0.7.0] - 2026-02-07
Note
Version reset from 1.2.0 to reflect actual maturity. Previous version was inflated — this plugin has never been externally tested.
What exists today
- 6 specialized agents (scanner, analyzer, interviewer, planner, implementer, verifier)
- Full machine-wide Claude Code configuration discovery
- Scope selection (current project, repo, home, full machine)
- Inheritance hierarchy mapping and conflict detection
- Mandatory backups before any changes
- Rollback support
- Syntax validation for all configuration files
- Quick audit-only mode
- Full optimization workflow with HITL checkpoints
Known gaps
- Testing: no automated tests
- Onboarding: never verified that a new user can install and use from scratch
- External verification: nobody else has ever used this