# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [5.0.0] - 2026-05-01

### Summary

Reality-based token-optimization release. v4.0.0 shipped Opus-4.7 token surfaces aligned to a Sonnet-era cost model; v5.0.0 rebuilds the foundations against verified Opus-4.7 cost dynamics. Three pillars: honest token estimation (severity-weighted scoring, MCP estimates 15 → 500+, optional `--accurate-tokens` API calibration), new structural scanners (cache-prefix stability, dead tool grants, plugin collisions), and new diagnostic surfaces (`/config-audit manifest`, `/config-audit tokens` extended, knowledge-base cleanup aligned to Opus 4.7 cache dynamics).

Consolidated from `5.0.0-alpha.1` (F1-F5 token-economy round), `5.0.0-alpha.2` (M1, M2, M4-M6, F6, F7 structural gaps + README self-audit), `5.0.0-beta.1` (N1-N4, N6 new scanners + manifest CLI), and `5.0.0-rc.1` (M7, M8 knowledge cleanup + N5 tokenizer calibration).

### Added

- **3 new scanners (9 → 12 deterministic):**
  - **CPS — Cache-Prefix Stability** (`CA-CPS-NNN`): volatile content in lines 31–150 of the CLAUDE.md cascade, beyond TOK Pattern A's top-30 window. Volatile-pattern set extends Pattern A with shell-exec lines (`!` prefix) and `${VAR}` substitutions.
  - **DIS — Disabled-In-Schema** (`CA-DIS-NNN`): tools listed in BOTH `permissions.deny` AND `permissions.allow`. Tool identity uses bare name (`Bash(npm:*)` and `Bash` are the same tool). Severity low.
  - **COL — Cross-Plugin Skill Collision** (`CA-COL-001`): plugin-vs-plugin same skill name → low; user-vs-plugin → medium. `details.namespaces` payload identifies conflicting sources.
- **TOK extensions:**
  - **CA-TOK-005 MCP tool-schema budget:** per-server tiered finding (< 20 none, 20–49 low, 50–99 medium, 100+ high; null low + "tool count unknown"). Scoped to project-local `.mcp.json`.
  - **Pattern E — Oversized cascade:** medium when `activeConfig.claudeMd.estimatedTokens > 10_000`.
  - **Pattern F — Bloated SKILL.md description:** low when frontmatter `description > 500 chars` (loads every turn). Scoped to `discovery.files`.
- **`/config-audit manifest`** + `scanners/manifest.mjs` CLI — single ranked table of every system-prompt token source (CLAUDE.md cascade, plugins, skills, MCP servers, hooks) sorted DESC by `estimated_tokens`. CLAUDE.md per-file tokens distributed proportional to bytes.
- **`--accurate-tokens` flag** on `token-hotspots-cli.mjs` (N5): when `ANTHROPIC_API_KEY` is set, calls Anthropic's `count_tokens` for the top 3 hotspots and populates `output.calibration = { actual_tokens, source: 'count_tokens_api', sampled_hotspots: 3 }`. When absent: `calibration = { skipped: 'no-api-key' }` plus stderr warning.
- **`scanners/lib/tokenizer-api.mjs`** — `count_tokens` wrapper. 5s AbortController timeout. Exponential backoff on 429 (3 retries: 1s/2s/4s). API key masked to `${key.slice(0,8)}...` in every error; HTTP body never included in errors (it may echo the key on auth failures). `maskKey()` exported.
- **`--with-telemetry-recipe` flag** on the same CLI (M7): emits `telemetry_recipe_path` field pointing to `knowledge/cache-telemetry-recipe.md`.
- **`knowledge/cache-telemetry-recipe.md`** (M7): manual `jq` recipe summing `cache_read_input_tokens` + `cache_creation_input_tokens` per turn from session transcripts. Hit-rate interpretation table.
- **`'mcp'` kind on `estimateTokens`** (F2): active MCP servers estimate ≥ 500 tokens (base + schema overhead) instead of v4's flat 15. Optional `{toolCount}` raises to `500 + toolCount × 200`.
- **MCP tool-count detection** (M1): `readActiveMcpServers` resolves count via cache → `node_modules//package.json` → `{toolCount: null, toolCountUnknown: true}` fallback.
- **`additionalDirectories` settings key** (M6): added to `KNOWN_KEYS`; new low-severity finding when length > 2.
- **HKV verbose hook output** (M5): low-severity finding when referenced hook script contains > 50 `console.log`/`process.stdout.write` lines (static, no execution).
- **`self-audit --check-readme` flag** (F6): filesystem counts compared against README badges. Helper `checkReadmeBadges(pluginDir)`. Step 28 of v5 plan reconciled all badges.
- **`scoringVersion: 'v5'`** field on `scoreByArea` output for cross-version drift detection.
- **`WEIGHTS`** named export from `scanners/lib/severity.mjs` (frozen).
- **`details` field on findings** (`output.mjs:finding()`): optional structured payload for scanner-specific data (used by COL).
- **Plugin Hygiene** as 10th quality area (from COL). Posture JSON now reports 10 areas.
- **TOK-readActiveConfig integration** (F1): one hotspot per active MCP server; `result.activeConfig` summary (claudeMd cascade tokens, mcpServerCount, pluginCount, skillCount); try/catch fallback when scope-limited.

### Changed

- **F3 — `scoreByArea` is severity-weighted.** Penalty = `Σ count[s] × WEIGHTS[s]`; `passRate = max(0, 100 − penalty / max(10, findingCount × 4) × 100)`. Lows no longer crater an area's grade; criticals/highs do. `baseline-all-a` fixture remains all-A (no critical/high present).
- **F7 — TOK pattern severities recalibrated** for tokens-per-turn impact: Pattern A `medium → high`, Pattern B `low → medium`, Pattern C `medium → low`. Each finding carries a `calibration_note` evidence field documenting the heuristic basis.
- **`scoreByArea` deduplicates by area name** (N3 prep): TOK + CPS share "Token Efficiency"; SET + DIS share "Settings". Combined row with merged finding counts.
- **M8 — knowledge cleanup:** replaced "Keep CLAUDE.md under 200 lines" in `knowledge/configuration-best-practices.md` with cache-stability guidance (first 30 lines stable, volatile content below the cache threshold). Footnote explains the 200-line rule was a Sonnet-era adherence heuristic; Opus 4.7 uses prompt-cache structure as the dominant cost lever. Cross-references `knowledge/opus-4.7-patterns.md`.
- **`commands/tokens.md` next-steps:** documents `--with-telemetry-recipe` as the cache-verification path.
- **Scanner count: 9 → 12.** Command count: 17 → 18. Knowledge: 7 → 8. Quality areas: 8 → 10.
- **`.gitignore`** — unignore rules for `tests/fixtures/**/node_modules/` so the `mcp-tool-heavy` fixture stays under version control.

### Removed

- **F4 — TOK hotspot padding loop and `take` dead-code.** Hotspots may now contain fewer than 3 entries for tiny projects (the honest answer); contract still bounds at ≤ 10.
- **F5 — Pattern D / `CA-TOK-004` (sonnet-era signature).** Catalogue entry removed from `knowledge/opus-4.7-patterns.md` and `commands/tokens.md`. Suppression entries for `CA-TOK-004` are now no-ops.

### Breaking changes

- **F2 — MCP token estimates jump from flat 15 to ≥ 500.** Token Efficiency grades for projects with MCP servers may shift. `whats-active` totals report higher numbers. Documented in `commands/posture.md` next-steps.
- **F3 — `scoreByArea` is severity-weighted.** Posture JSON consumers reading `areas[*].score` will see different values for non-clean configs. Use `result.scoringVersion === 'v5'` to detect the change. Drift comparisons across v4↔v5 baselines may show artificial deltas — re-baseline after upgrade.
- **F5 — Pattern D / `CA-TOK-004` no longer emitted.** Existing exact `CA-TOK-004` suppression entries are harmless but obsolete.
- **N1 suppression backward-compat — `CA-TOK-*` glob now also matches `CA-TOK-005`.** To preserve prior behavior of suppressing only patterns A/B/C, replace the glob with explicit IDs:

  ```
  CA-TOK-001
  CA-TOK-002
  CA-TOK-003
  ```

  A one-time runtime warning for this case is a v5.0.1 candidate.
- **Posture areas count: 9 → 10** (Plugin Hygiene from COL). Consumers hard-coding 9 must update.
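
The severity-weighted formula from F3 can be worked through with a small sketch. The changelog does not list the actual `WEIGHTS` values, so `EXAMPLE_WEIGHTS` below is a hypothetical placeholder, and `passRate` is an illustrative reimplementation of the stated formula rather than the code in `scanners/lib/severity.mjs`.

```javascript
// Hypothetical weights for illustration only; the real frozen WEIGHTS export
// in scanners/lib/severity.mjs is not reproduced in this changelog.
const EXAMPLE_WEIGHTS = Object.freeze({ critical: 10, high: 5, medium: 2, low: 0.5 });

// counts: finding counts per severity for one area, e.g. { high: 1, low: 3 }.
function passRate(counts, weights = EXAMPLE_WEIGHTS) {
  let penalty = 0;
  let findingCount = 0;
  for (const [severity, count] of Object.entries(counts)) {
    penalty += count * (weights[severity] ?? 0); // unknown severities add no penalty
    findingCount += count;
  }
  // passRate = max(0, 100 - penalty / max(10, findingCount * 4) * 100)
  const budget = Math.max(10, findingCount * 4);
  return Math.max(0, 100 - (penalty / budget) * 100);
}

console.log(passRate({ low: 4 }));      // 87.5 with the example weights
console.log(passRate({ high: 1 }));     // 50
console.log(passRate({ critical: 3 })); // 0 (clamped at the floor)
```

With these example weights, four lows cost 12.5 points while a single high costs 50, which matches the stated intent that lows no longer crater a grade. Consumers comparing posture JSON across versions can branch on `result.scoringVersion === 'v5'` before interpreting `areas[*].score`.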
### Migration notes

- `CA-TOK-*` glob suppressions: explicit-ID list recommended if CA-TOK-005 should not be suppressed.
- `CA-TOK-004` exact-ID suppression entries: safe to remove.
- Drift baselines created against v4 should be re-saved post-upgrade to avoid artificial F3 weighting deltas.
- Posture JSON consumers must update any hardcoded `areas.length === 8` or `=== 9` assertions to `>= 10`.

### Tests

- 543 → 635 (+92): F1-F7 (alpha rounds = +43), N1-N4 + N6 (beta = +39), M7 + M8 + N5 (rc = +10). 36 test files (12 lib + 23 scanner + 1 hook).
- New fixtures: `tok-active-config/`, `additional-dirs-many/`, `additional-dirs-ok/`, `large-cascade/`, `small-cascade/`, `skill-bloated/`, `skill-tight/`, `mcp-tool-heavy/` (with mocked `node_modules/`), `hooks-verbose/`, `hooks-quiet/`, `readme-desynced/`, `mcp-budget/{14,25,60,120,unknown}-tools/`, `volatile-mid-section/{volatile-line-60,volatile-line-200}/`, `denied-tools-in-schema/`, `collision-plugins/fake-home/` (plugin-a + plugin-b + plugin-c + user-level review skill).
- New test files: `tests/scanners/manifest.test.mjs`, `tests/scanners/cache-prefix.test.mjs`, `tests/scanners/disabled-in-schema.test.mjs`, `tests/scanners/collision.test.mjs`, `tests/scanners/accurate-tokens.test.mjs`.

### Notes

- **`mock.method` against ESM module exports does not work** (Node 18+ ESM read-only export bindings). v5 tests use `globalThis.fetch` mocking for `--accurate-tokens` instead — equivalent coverage at the actual external-dependency boundary.
- **Plugin-vs-built-in collision detection is intentionally not implemented.** Step 22a research spike (`docs/v5-namespace-research.md`, gitignored) could not verify Claude Code's resolution behavior when a plugin command shares a name with a built-in. Treated as info-only; v5.0.1 candidate.
- **README/CLAUDE.md badge reconciliation** done in Step 28 (this release). `self-audit --check-readme` PASSES against the filesystem. The test counter switched from counting files to counting test cases, parsed from a `node --test` subprocess.
- **`hotspot.path` exposed on file-backed hotspots** (Step 30 fix). The rc.1 `--accurate-tokens` implementation looked up `hotspot.path` but the scanner only emitted `source`. File-backed hotspots now carry `path` (absolute path); MCP-server hotspots leave it unset (they are virtual entries representing runtime tool-schema cost, not file content).

### SC-6b release-gate result (verified 2026-05-01)

- **PASS — 0.85% over-estimation against real `count_tokens` API.**
- Fixture: `tests/fixtures/marketplace-large/`. Top-3 hotspots = 1 file-backed (`CLAUDE.md`) + 2 MCP virtuals. MCP entries skipped per design (no readable content; their tokens are formula-based at 500 + toolCount × 200).
- `CLAUDE.md` actual: 589 tokens (Anthropic `count_tokens`, `claude-opus-4-7`). Estimated: 594 tokens (byte heuristic at 4 bytes/token via `estimateTokens`). Delta (actual − estimated): **−5 tokens, −0.85%** — well within the ±5% gate.
- No tuning of `estimateTokens` heuristic required for v5.0.0.

## [5.0.0-rc.1] - 2026-05-01

### Summary

Release candidate for v5.0.0 — knowledge cleanup and tokenizer calibration. Three deliverables: M8 (Sonnet-era → Opus 4.7 best-practices rewrite), M7 (cache-telemetry recipe in `knowledge/` plus an opt-in CLI flag), and N5 (`--accurate-tokens` API calibration via Anthropic's `count_tokens` endpoint).

### Added

- **N5 — `--accurate-tokens` flag** on `scanners/token-hotspots-cli.mjs`. When `ANTHROPIC_API_KEY` is set, the CLI calls Anthropic's `count_tokens` endpoint for the top 3 hotspots and populates `output.calibration = { actual_tokens, source: 'count_tokens_api', sampled_hotspots: 3 }`. When the key is absent, `calibration = { skipped: 'no-api-key' }` and a stderr warning is emitted. Designed for the manual SC-6b release-gate verification, not routine use.
- **`scanners/lib/tokenizer-api.mjs`** — wrapper around `count_tokens` with a 5-second AbortController timeout, exponential-backoff retry on HTTP 429 (max 3 retries: 1s, 2s, 4s), and required headers (`x-api-key`, `anthropic-version: 2023-06-01`, `content-type`). API key is masked to `${key.slice(0,8)}...` in every error message and every thrown error; non-429 HTTP errors throw status code only — response body is never included (it may echo the key on auth failures). `maskKey()` is exported for callers that need safe logging.
- **M7 — `knowledge/cache-telemetry-recipe.md`** (new). Manual `jq` recipe for verifying prompt-cache hit rate from Claude Code session transcripts (`~/.claude/projects//*.jsonl`). Sums `cache_read_input_tokens` and `cache_creation_input_tokens` per turn and reports a hit-rate ratio. Recipe-form (not bundled scanner) keeps the project's "no transcript-parsing as core feature" non-goal intact while giving users a runtime escape hatch.
- **M7 — `--with-telemetry-recipe` flag** on the same CLI. When passed, emits `telemetry_recipe_path` in the JSON output pointing to the recipe file. Without the flag, output is unchanged. Committed as a default deliverable, opt-in at invocation time.

### Changed

- **M8 — knowledge-base cleanup:** replaced the "Keep CLAUDE.md under 200 lines" rule in `knowledge/configuration-best-practices.md` with cache-stability guidance (first 30 lines stable, volatile content below the cache threshold). Added a footnote that the 200-line rule was a Sonnet-era adherence heuristic; Opus 4.7 uses prompt-cache structure as the dominant cost lever. Cross-references `knowledge/opus-4.7-patterns.md`.
- **`commands/tokens.md` next-steps:** documents `--with-telemetry-recipe` as the cache-verification path after a structural fix.
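
The masking and retry rules described for `scanners/lib/tokenizer-api.mjs` can be sketched as pure helpers. This is an illustration under assumptions, not the module's code: `maskKey` here only mirrors the documented `${key.slice(0,8)}...` format, and `backoffDelaysMs` and `safeErrorMessage` are hypothetical names that do not appear in the changelog.

```javascript
// Mask an API key as described above: first 8 characters, then "...".
// Sketch only; the exported maskKey() may treat edge cases differently.
function maskKey(key) {
  return `${String(key).slice(0, 8)}...`;
}

// Hypothetical helper: delay schedule for exponential backoff on HTTP 429.
// With 3 retries and a 1000 ms base this yields 1s, 2s, 4s.
function backoffDelaysMs(retries = 3, baseMs = 1000) {
  return Array.from({ length: retries }, (_, attempt) => baseMs * 2 ** attempt);
}

// Hypothetical helper: build an error message from the status code and the
// masked key only. The response body is deliberately never included.
function safeErrorMessage(status, key) {
  return `count_tokens failed with HTTP ${status} (key ${maskKey(key)})`;
}

const fakeKey = 'sk-ant-EXAMPLE-not-a-real-key'; // placeholder, not a credential

console.log(maskKey(fakeKey));               // sk-ant-E...
console.log(backoffDelaysMs());              // [ 1000, 2000, 4000 ]
console.log(safeErrorMessage(401, fakeKey));
```

Keeping these helpers free of I/O makes the masking and schedule easy to unit-test without touching `globalThis.fetch`.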
### Tests

- 625 → 635 (+10): `--with-telemetry-recipe` (×2), tokenizer-api unit tests (×6 — masking, body-leak protection, AbortController signal, 429 retry, header set, fetch mock happy path), `--accurate-tokens` no-key subprocess test (×1), absent-flag negative test (×1).
- New file: `tests/scanners/accurate-tokens.test.mjs`. No new fixtures (re-uses `marketplace-large`).

### Notes

- **SC-6b release gate is NOT closed by these commits.** Step 26's tests use mocked `globalThis.fetch` to verify the integration contract; ±5% accuracy against real `count_tokens` requires a live API key and must be verified manually before tagging v5.0.0 in Session 5.
- The plan's specified `mock.method(tokenizerApi, 'callCountTokensApi', ...)` pattern collides with ESM read-only export bindings in Node 18+. Tests mock at the `globalThis.fetch` boundary instead — equivalent coverage, no module-export rebinding required.
- README/CLAUDE.md badge counts and `plugin.json` version still target v4.0.0; Step 28+29 will sync those during the release wrap.
- `[skip-docs]` tag on the N5 feat commit; M7 and M8 are `docs(...)` commits and don't need it.

## [5.0.0-beta.1] - 2026-05-01

### Summary

First v5.0.0 beta — new scanners. Five new finding sources land: MCP tool-schema budget (CA-TOK-005), system-prompt manifest CLI/command (`/config-audit manifest`), cache-prefix stability (CPS), disabled-tools-still-in-schema (DIS), and cross-plugin/user-vs-plugin skill collision (COL/CA-COL-001). Plugin Hygiene becomes a 10th area-scorecard column.

### Added

- **N1 — `CA-TOK-005` MCP tool-schema budget:** per-server tiered finding inside the TOK scanner. Thresholds — `< 20` no finding, `20–49` low, `50–99` medium, `100+` high; `null` (manifest unparseable) low + "tool count unknown" message. Scoped to project-local `.mcp.json` to keep `/config-audit ` actionable. Recommendation links to the Step 25 cache-telemetry recipe.
- **N2 — `/config-audit manifest`:** new slash command + `scanners/manifest.mjs` CLI. Renders a single ranked table of every token source (CLAUDE.md cascade, plugins, skills, MCP servers, hooks) sorted DESC by `estimated_tokens`. Reuses `readActiveConfig`; CLAUDE.md per-file tokens are distributed proportional to bytes.
- **N3 — CPS scanner (`CA-CPS-NNN`):** Cache-Prefix Stability Analyzer. Walks the CLAUDE.md cascade and flags volatile content between lines 31 and 150 — beyond TOK Pattern A's top-30 territory. Volatile-pattern set extends Pattern A with shell-exec lines (`!` prefix) and `${VAR}` substitutions. Severity medium per finding. Skips lines 1–30 (Pattern A's range).
- **N4 — DIS scanner (`CA-DIS-NNN`):** Disabled-In-Schema Detector. Detects tools that appear in BOTH `permissions.deny` and `permissions.allow` within the same `settings.json`. The deny list wins, so allow entries are dead config but still load every turn. Tool identity is the bare name (everything before `(`); `Bash(npm:*)` and `Bash` are treated as the same tool. Severity low.
- **N6 — COL scanner (`CA-COL-001`):** Cross-Plugin Skill Collision detector. Plugin-vs-plugin same skill name → low. User-vs-plugin same skill name → medium. Findings carry `details.namespaces` array with `{source, name, path}` for every conflicting source.
- **`details` field on findings:** `output.mjs:finding()` helper now passes through optional `details` for scanner-specific structured payloads (used by COL).
- **"Plugin Hygiene" area** (10th in scorecard): COL contributes here. Posture JSON now reports 10 areas instead of 9.

### Changed

- **`scoreByArea` deduplicates by area name:** when multiple scanners share an area (TOK + CPS → "Token Efficiency", SET + DIS → "Settings"), they produce one combined row with merged finding counts. Existing 9-area contract preserved for non-Plugin-Hygiene areas.
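
The bare-name identity rule used by the DIS scanner (N4) can be sketched as follows. `bareToolName` and `findDeniedButAllowed` are hypothetical names chosen for illustration, and the sample settings are made up; the scanner's real implementation and finding format may differ.

```javascript
// Tool identity is everything before the first "(":
// "Bash(npm:*)" and "Bash" both map to "Bash".
function bareToolName(entry) {
  const paren = entry.indexOf('(');
  return (paren === -1 ? entry : entry.slice(0, paren)).trim();
}

// Return the bare names of tools present in BOTH permissions.deny and
// permissions.allow of one settings object (sorted for stable output).
function findDeniedButAllowed(settings) {
  const allow = settings?.permissions?.allow ?? [];
  const deny = settings?.permissions?.deny ?? [];
  const denied = new Set(deny.map(bareToolName));
  const overlap = new Set(allow.map(bareToolName).filter((name) => denied.has(name)));
  return [...overlap].sort();
}

// Made-up example settings, not taken from any fixture in this repository.
const exampleSettings = {
  permissions: {
    allow: ['Bash(npm:*)', 'Read', 'WebFetch'],
    deny: ['Bash', 'WebFetch(domain:example.com)'],
  },
};

console.log(findDeniedButAllowed(exampleSettings)); // [ 'Bash', 'WebFetch' ]
```

Because the deny list wins, each returned name marks an allow entry that is dead config.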
### Known breaking changes

- **Suppression backward-compat — `CA-TOK-*` glob now also matches `CA-TOK-005`.** Existing `.config-audit-ignore` entries that suppress TOK findings via the `CA-TOK-*` glob will silently include CA-TOK-005 (MCP budget). To preserve the prior behavior of suppressing only patterns A/B/C, replace the glob with explicit IDs:

  ```
  CA-TOK-001
  CA-TOK-002
  CA-TOK-003
  ```

  A one-time runtime warning for this case is out of scope for v5.0.0 — it is a candidate for v5.0.1.
- **Plugin-vs-built-in collision is intentionally not implemented.** The Step 22a research spike could not verify Claude Code's resolution behavior when a plugin command shares a name with a built-in (`/help`, `/clear`, `/init`, `/review`, `/config`, `/cost`, `/security-review`). Treated as info-only in this release; a follow-up v5.0.1 ticket may add an opt-in check.

### Tests

- 586 → 625 (+39): N1 (×7), N2 (×11), N3 (×7), N4 (×6), N6 (×8).
- New fixtures: `mcp-budget/{14,25,60,120,unknown}-tools/`, `volatile-mid-section/{volatile-line-60,volatile-line-200}/`, `denied-tools-in-schema/`, `collision-plugins/fake-home/` (plugin-a + plugin-b + plugin-c + user-level review skill).

### Notes

- `[skip-docs]` tag used on every feat commit — README/CLAUDE.md badge counts (scanner count, command count, test count) and the architecture sections are intentionally fenced off until Session 5 (Step 28). This keeps the v5 plan's session boundaries clean even when the Forgejo `pre-commit-docs-gate` hook would otherwise block these commits.

## [5.0.0-alpha.2] - 2026-05-01

### Summary

Second v5.0.0 alpha — structural gaps + README self-audit. TOK pattern severities recalibrated for tokens/turn impact (F7), three new findings cover settings/skills/cascade structure (M2, M4, M6), MCP tool-count detection wired (M1), HKV gains a verbose-output check (M5), and self-audit grows a `--check-readme` flag (F6).
### Added

- **F7 — TOK severity recalibration:** Pattern A (cache-breaking volatile top) `medium → high`, Pattern B (redundant permissions) `low → medium`, Pattern C (deep imports) `medium → low`. Each finding now carries a `calibration_note` evidence field documenting the heuristic basis.
- **M6 — `additionalDirectories` settings key:** added to `KNOWN_KEYS` so it no longer trips "unknown settings key". New low-severity finding when `additionalDirectories.length > 2`.
- **M4 — TOK Pattern E:** medium-severity finding when `activeConfig.claudeMd.estimatedTokens > 10_000` — flags cascades that bleed budget every turn.
- **M2 — TOK Pattern F:** low-severity finding for project-local `SKILL.md` whose frontmatter `description` exceeds 500 characters (description loads on every turn even when the body does not). Scoped to `discovery.files`; user/plugin skills out of project scope are not flagged.
- **M1 — MCP tool-count detection:** `readActiveMcpServers` now resolves tool count via cache → `node_modules//package.json` → `{toolCount: null, toolCountUnknown: true}` fallback. Tool count drives `estimateTokens` per server.
- **M5 — HKV verbose hook output:** new low-severity finding when a referenced hook script contains > 50 `console.log` / `process.stdout.write` lines (static heuristic, no execution).
- **F6 — `self-audit --check-readme` flag:** filesystem counts (scanners, commands, agents, hooks, tests, knowledge) compared against README badge values. Helper export: `checkReadmeBadges(pluginDir)`.

### Changed

- **TOK severities** (F7) — see Added. Posture aggregates that depended on Pattern A being `medium` will now reflect the higher-impact rating.
- **`.gitignore`** — added unignore rules so `tests/fixtures/**/node_modules/` are tracked. Required by the `mcp-tool-heavy` fixture.

### Tests

- 563 → 586 (+23): F7 table-driven (×6), M6 (×3), M4 (×2), M2 (×2), M1 (×4), M5 (×2), F6 (×4).
- New fixtures: `additional-dirs-many/`, `additional-dirs-ok/`, `large-cascade/`, `small-cascade/`, `skill-bloated/`, `skill-tight/`, `mcp-tool-heavy/` (with mocked `node_modules/`), `hooks-verbose/`, `hooks-quiet/`, `readme-desynced/`.

### Notes

- `result.readmeCheck.passed === true` is **not** required during alpha/beta phases. The real plugin's own check is currently red (`scanners` 10 vs README 9, `tests` 31 vs README 543) — reconciliation deferred to Session 5 Step 28 (README sync).
- `[skip-docs]` tag used on every commit — README/CLAUDE.md badge counts and architecture text are intentionally fenced off until Session 5.

## [5.0.0-alpha.1] - 2026-05-01

### Summary

First v5.0.0 alpha — token-economy round, F1-F5. The TOK scanner now consumes `readActiveConfig` (per-MCP-server hotspots, claudeMd cascade tokens), severity weighting replaces flat finding counts in `scoreByArea`, and MCP servers no longer estimate at a flat 15 tokens. Pattern D (CA-TOK-004 sonnet-era signature) removed — too noisy, not actionable.

### Added

- **`'mcp'` kind for `estimateTokens`** (F2): an active MCP server now estimates ≥ 500 tokens (base protocol + schema overhead) instead of the v4 flat 15. Optional `{toolCount}` raises the estimate to `500 + toolCount * 200` once Step 14 wires tool-count detection.
- **TOK ↔ readActiveConfig integration** (F1): the TOK scanner emits one hotspot per active MCP server, sums their tokens into `total_estimated_tokens`, and exposes `result.activeConfig` (claudeMd cascade tokens, mcpServerCount, pluginCount, skillCount).
- **`scoringVersion: 'v5'`** field on `scoreByArea` output for cross-version drift detection.
- **`WEIGHTS`** named export from `scanners/lib/severity.mjs` (`Object.freeze`).

### Changed

- **BREAKING (intentional, F3):** `scoreByArea` is now severity-weighted. Penalty = `Σ count[s] * WEIGHTS[s]`; `passRate = max(0, 100 - penalty / max(10, findingCount * 4) * 100)`. Lows no longer crater an area's grade; a single high or critical consumes a large fraction of budget. `baseline-all-a` fixture remains all-A (no critical/high on that fixture).
- **BREAKING (intentional, F2):** MCP server token estimates jump from a flat 15 to ≥ 500. `whats-active` totals and TOK hotspots will report higher numbers for any project with active MCP servers.
- **BREAKING (intentional, F5):** Pattern D / `CA-TOK-004` (sonnet-era signature) is no longer emitted. Suppression entries for `CA-TOK-004` are now no-ops; downstream tools that filter on the ID should drop it. The catalogue entry was removed from `knowledge/opus-4.7-patterns.md` and `commands/tokens.md`.
- **Hotspots contract (F4):** the v4 padding loop and `take` dead-code are gone. Hotspots may now contain fewer than 3 entries for tiny projects (the honest answer); contract still bounds at ≤ 10.

### Migration notes

- `CA-TOK-*` glob suppression entries continue to suppress 001-003. Existing exact `CA-TOK-004` entries are harmless but obsolete — remove them at convenience.
- Posture/JSON consumers reading `areas[*].score` will see different values for non-clean configs. Use `result.scoringVersion === 'v5'` to detect.

### Tests

- 543 → 563 across the alpha.1 commits (+9 severity-weighting/scoring, +4 estimateTokens 'mcp', +1 MCP caller migration, +3 readActiveConfig integration, +2 hotspots-uniqueness, +2 sonnet-era zero-finding).
- New fixture `tests/fixtures/tok-active-config/` — minimal repo with `.mcp.json` (2 servers), `CLAUDE.md`, plugin skeleton.

## [4.0.0] - 2026-04-19

### Summary

Opus 4.7 era upgrade. New TOK scanner detects token-efficiency anti-patterns (cache-breaking volatile content, redundant tool permissions, deep import chains, sonnet-era minimal setups). Token Efficiency joins the quality scorecard as the 8th area. Scanner-agent and verifier-agent migrate from haiku → sonnet per global no-haiku policy.
### Added

- **`token-hotspots.mjs`** scanner (CA-TOK-001..004) — 4 patterns aligned with Opus 4.7 token-cost dynamics:
  - CA-TOK-001 cache-breaking volatile content (timestamps/UUIDs in top 30 lines of CLAUDE.md)
  - CA-TOK-002 redundant tool permissions (duplicate or subset overlaps)
  - CA-TOK-003 deep @import chains (>2 hops on the load path)
  - CA-TOK-004 sonnet-era minimal setup (no skills/MCP/hooks/managed/plugins)
- **`/config-audit tokens [path] [--global]`** — ranked hotspot table + per-pattern findings.
- **`scanners/token-hotspots-cli.mjs`** — standalone CLI emitting `total_estimated_tokens`, `hotspots`, and per-finding output.
- **Token Efficiency** as the 8th quality area in the posture scorecard (now 9 scanners total: CML/SET/HKV/RUL/MCP/IMP/CNF/GAP/TOK).
- `id` field on every area in the scorecard payload (`token_efficiency`, `instruction_clarity`, etc.) for stable downstream lookup.
- 13 new TOK scanner tests + 3 CLI tests + posture grade-stability test for `token_efficiency`.
- Knowledge refresh: `knowledge/opus-4.7-patterns.md`, plus 2026-04 deltas (v2.1.83–v2.1.111) added to `feature-evolution.md`, `claude-code-capabilities.md`, and `hook-events-reference.md` from `research/03-claude-code-changes-config-surfaces.md`.

### Changed

- **BREAKING (additive surface):** Quality areas count 7 → 8. Posture JSON consumers that hard-coded 7 areas must update.
- **BREAKING (model migration):** `scanner-agent` and `verifier-agent` migrated `haiku` → `sonnet`. Latency and cost trade-offs accepted; deterministic scanner CLIs preferred over agent invocations.
- Scanner count: 8 → 9 (TOK added).
- Command count: 16 → 17 (`/config-audit tokens` added).
- Version bump: `3.1.0` → `4.0.0`.

## [3.1.0] - 2026-04-14

### Summary

New read-only command `/config-audit whats-active` — shows exactly what Claude Code loads for a given repo, with token estimates.
### Added

- **`/config-audit whats-active [path]`** — inventory of active plugins, skills, MCP servers, hooks, and CLAUDE.md cascade for a repo, with source attribution (user/project/plugin) and rough token estimates. Read-only, <2s.
- `scanners/lib/active-config-reader.mjs` — pure async helper: `readActiveConfig()`, `detectGitRoot()`, `walkClaudeMdCascade()`, `readClaudeJsonProjectSlice()` (longest-prefix matching), `enumeratePlugins()`, `enumerateSkills()`, `readActiveHooks()`, `readActiveMcpServers()`, `estimateTokens()`.
- `scanners/whats-active.mjs` — thin CLI shim supporting `--json`, `--output-file`, `--verbose`, `--suggest-disables`.
- Optional `--suggest-disables` flag surfaces deterministic disable candidates (disabled MCP servers, zero-item plugins, unreferenced plugins, orphan skills) and invites an LLM judgment pass in the command.
- 36 new tests in `tests/lib/active-config-reader.test.mjs`, plus a `rich-repo` tmpdir fixture helper.

### Changed

- Version bump: `3.0.1` → `3.1.0` (minor, additive feature, no breaking changes).
- Command count: 15 → 16.

## [3.0.1] - 2026-04-04

### Summary

Cross-platform fix — scanners, hooks, and lib now work correctly on Windows.

### Fixed

- `file-discovery.mjs`: depth calculation, agent/command/plugin path matching now use `path.sep`
- `scan-orchestrator.mjs`: fixture-path filtering now uses `path.sep`
- `post-edit-verify.mjs`: rules-dir regex handles both `/` and `\` separators
- `auto-backup-config.mjs`: rules-dir detection now uses `path.sep`
- `import-resolver.mjs`: circular import display uses `basename()`, `/tmp` fallback replaced with `os.tmpdir()`
- `string-utils.mjs`: `normalizePath` trailing separator regex handles both `/` and `\`

### Added

- 4 cross-platform path tests (total 486 tests)

## [3.0.0] - 2026-04-04

### Summary

Health redesign — configuration health is now quality-only. Feature utilization removed from grades entirely.
### Changed

- **Health = quality only.** 7 deterministic scanners (CML, SET, HKV, RUL, MCP, IMP, CNF) determine your grade. Feature Coverage is no longer a graded area.
- **Feature recommendations are opt-in.** Unused features shown as "opportunities" via `/config-audit feature-gap`, grouped by impact (high/medium/explore), backed by Anthropic docs. No more "Feature Coverage: F" for correct minimal setups.
- **Posture output redesigned.** Shows `Health: {grade} ({score}/100)` with 7 quality areas. Removed utilization %, maturity level, segment label.
- **Feature-gap is interactive.** Users select recommendations to implement directly — no manual file editing required. Backup created automatically.
- **avgScore bug fixed.** Grade letter and displayed score now computed from the same population (quality areas only).

### Added

- `generateHealthScorecard()` in scoring.mjs — quality-only scorecard
- `opportunitySummary()` in feature-gap-scanner.mjs — groups findings by impact tier
- `opportunityCount` field in posture JSON output
- "Official Configuration Guidance" section in knowledge base (Anthropic docs, proven impacts)
- 21 new tests (total 482 across 27 test files)

### Removed

- `S2-PROMPT.md` and `V2-ANNOUNCEMENT.md` — v2 development artifacts
- Utilization %, maturity level, segment label from posture terminal output and reports
- Feature Coverage row from area breakdown tables
- "Top Actions" sourced from GAP findings (replaced by opportunities pointer)

### Backward Compatibility

- JSON output preserves all legacy fields (utilization, maturity, segment) for programmatic consumers
- Drift baselines unaffected — GAP findings still present in envelopes
- All existing exports maintained (calculateUtilization, determineMaturityLevel, etc.)

## [2.2.0] - 2026-04-04

### Summary

UX quality fix — fixture filtering, session path migration, output polish.
### Added - Automatic test-fixture filtering in scan-orchestrator: findings from `tests/`, `examples/`, `__tests__/` excluded from grades, stored in `env.fixture_findings` - `--include-fixtures` CLI flag for scan-orchestrator and posture to override filtering - `scan-orchestrator.test.mjs` — 20 new tests for fixture filtering and `isFixturePath` - Legacy session path detection in cleanup command ### Changed - Session storage moved from `~/.config-audit/` to `~/.claude/config-audit/` (pathguard compatible) - Self-audit grade: F → A (98) after fixture filtering - Combined scanner + posture into single Bash call in default audit command - Removed "F grade is misleading" disclaimer — grades are now accurate - All CLI banners and envelope metadata updated to v2.2.0 - 461 tests (up from 441), 27 test files (up from 26) ### Removed - Manual fixture counting instruction in `config-audit.md` (orchestrator handles it) - Redundant `isFixtureOrExample` filter in `self-audit.mjs` (promoted to orchestrator) ## [2.1.0] - 2026-04-03 ### Summary UX redesign — auto-scope detection, zero questions, simplified command surface. ### Changed - `/config-audit` now runs full audit automatically (auto-detects scope from git context) - Removed mode selection prompts — scope override via `/config-audit full|repo|home|current` - Simplified from 17 to 15 commands (removed quick, report, watch; added help) - All CLI banners and envelope metadata updated to v2.1.0 ### Added - `/config-audit help` command with categorized command reference - Auto-scope detection from git context (repo vs home vs full-machine) ### Removed - `/config-audit:quick` (merged into default `/config-audit`) - `/config-audit:report` (merged into analyze output) - `/config-audit:watch` (use `/config-audit drift` instead) ## [2.0.0] - 2026-04-03 (v2.0 Complete) ### Summary Complete rewrite from LLM-only prototype to deterministic scanner-backed configuration intelligence. 
7 development sessions (S1-S7), ~15,000 lines of code, 408+ tests.

### Highlights

- 8 deterministic scanners (CML, SET, HKV, RUL, MCP, IMP, CNF, GAP) + PLH standalone
- Feature gap analysis with 25 dimensions across 4 tiers
- Auto-fix engine with 9 fix types + backup/rollback
- Drift detection with baseline comparison
- Suppression engine (`.config-audit-ignore`)
- Self-audit CLI
- 17 commands, 6 agents, 4 hooks
- 408+ tests (zero external dependencies)

### Added (S7)

- Example projects: `examples/minimal-setup/` and `examples/optimal-setup/`
- Demo script: `examples/run-demo.sh`
- `.config-audit-ignore` for self-audit suppressions
- `V2-ANNOUNCEMENT.md`
- `DEPRECATED.md` for capability-auditor skill

### Fixed (S7)

- `hooks.json`: SessionStart and Stop timeout 5ms → 5000ms
- `self-audit.mjs`: Suppression now enabled (was hardcoded to `suppress: false`)

### Changed (S7)

- README.md: Complete rewrite for public release
- CLAUDE.md: Added Suppressions section
- `.gitignore`: Added `node_modules/` and `S*-PROMPT.md`

## [1.6.0] - 2026-04-03 (v2.0 S6: Unified Reports + Self-Audit + Suppressions)

### Added

- **Report generator** `scanners/lib/report-generator.mjs` — unified markdown reports: `generatePostureReport()`, `generateDriftReport()`, `generatePluginHealthReport()`, `generateFullReport()`
- **Suppression engine** `scanners/lib/suppression.mjs` — `.config-audit-ignore` file support with exact IDs and glob patterns (`CA-SET-*`), audit trail via `suppressed_findings` in envelope
- **Self-audit CLI** `scanners/self-audit.mjs` — runs all scanners + plugin health on this plugin: `node self-audit.mjs [--json] [--fix]`, exit codes 0/1/2
- **PostToolUse hook** `post-edit-verify.mjs` — verifies config files after Edit/Write, blocks if new critical/high findings introduced
- **New command**: `/config-audit:report` — generate unified report (posture + optional drift/plugin-health)
- **Test fixture** `.config-audit-ignore` in fixable-project
- 54 new tests (total 408 across 25 test files)
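As a rough illustration of how exact IDs and glob patterns such as `CA-SET-*` could be matched against finding IDs, here is a minimal sketch. It is not the code in `scanners/lib/suppression.mjs`: the function names are hypothetical, and the one-pattern-per-line format with `#` comments is an assumption about the ignore file, not something this changelog states.

```javascript
// Hypothetical sketch of a suppression pass; names and file syntax are assumptions.

// Parse ignore-file text: one pattern per line; blank lines and `#` comments are skipped
// (assumed syntax, not confirmed by the changelog).
function parseIgnoreFile(text) {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line !== '' && !line.startsWith('#'));
}

// Turn a pattern such as "CA-SET-*" into an anchored RegExp.
// `*` matches any run of characters; everything else is matched literally.
function patternToRegExp(pattern) {
  const escaped = pattern
    .split('*')
    .map((part) => part.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'))
    .join('.*');
  return new RegExp(`^${escaped}$`);
}

// Split findings into kept and suppressed, so suppressed ones remain as an audit trail.
function applySuppressions(findings, patterns) {
  const matchers = patterns.map(patternToRegExp);
  const kept = [];
  const suppressed = [];
  for (const finding of findings) {
    const isSuppressed = matchers.some((re) => re.test(finding.id));
    if (isSuppressed) {
      suppressed.push(finding);
    } else {
      kept.push(finding);
    }
  }
  return { findings: kept, suppressed_findings: suppressed };
}

const patterns = parseIgnoreFile('# ignore all settings findings\nCA-SET-*\nCA-CML-001\n');
const result = applySuppressions(
  [{ id: 'CA-SET-003' }, { id: 'CA-CML-001' }, { id: 'CA-CML-002' }],
  patterns,
);
console.log(result.findings.map((f) => f.id)); // [ 'CA-CML-002' ]
```

Anchoring each pattern with `^` and `$` means an exact ID only suppresses itself, while a trailing `*` covers a whole scanner prefix.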
### Changed

- `scan-orchestrator.mjs`: suppression integration — applies `.config-audit-ignore` after all scanners run, `--no-suppress` flag to disable
- `hooks.json`: added PostToolUse event with post-edit-verify

## [1.5.0] - 2026-04-03 (v2.0 S5: Drift + Watch + Plugin Health)

### Added

- **Diff engine** `scanners/lib/diff-engine.mjs` — `diffEnvelopes()` comparing baseline vs current, `formatDiffReport()` for terminal output
- **Baseline manager** `scanners/lib/baseline.mjs` — save/load/list/delete named baselines in `~/.claude/config-audit/baselines/`
- **Drift CLI** `scanners/drift-cli.mjs` — standalone: `node drift-cli.mjs [--save] [--baseline name] [--json] [--list]`
- **Plugin health scanner** `scanners/plugin-health-scanner.mjs` (PLH) — validates plugin structure, frontmatter, cross-plugin conflicts (runs independently, not in scan-orchestrator)
- **3 new commands**:
  - `/config-audit:drift` — compare current config against saved baseline
  - `/config-audit:watch` — on-demand drift check with baseline monitoring
  - `/config-audit:plugin-health` — audit plugin structure and cross-plugin coherence
- **Test fixtures** `test-plugin/` (valid) and `broken-plugin/` (invalid) for plugin health tests
- 48 new tests (total 354 across 21 test files)

## [1.4.0] - 2026-04-03 (v2.0 S4: Fix + Rollback Action Pillar)

### Added

- **Fix engine** `scanners/fix-engine.mjs` — deterministic auto-fix for 9 fix types:
  - `json-key-add` (missing `$schema`), `json-key-remove` (deprecated keys), `json-key-type-fix` (type mismatches, invalid effortLevel), `json-restructure` (hooks array→object, matcher object→string), `frontmatter-rename` (globs→paths), `file-rename` (non-.md→.md)
- **Rollback engine** `scanners/rollback-engine.mjs` — `listBackups()`, `restoreBackup()`, `deleteBackup()` with checksum verification
- **Fix CLI** `scanners/fix-cli.mjs` — standalone: `node fix-cli.mjs [--apply] [--json] [--global]`, dry-run by default
- **Backup lib** `scanners/lib/backup.mjs` — shared backup module with
checksums and manifests
- **2 new commands**:
  - `/config-audit:fix` — scan, plan, backup, apply, verify in one flow
  - `/config-audit:rollback` — list or restore from backups
- **PreToolUse hook** `auto-backup-config.mjs` — auto-backup config files before Edit/Write
- **Test fixture** `fixable-project/` — fixture with all 9 fixable issue types
- 38 new tests (total 306 across 17 test files)

### Changed

- `file-discovery.mjs`: `walkRulesDir` now discovers all files (not just .md) for non-.md validation
- `backup-before-change.mjs`: refactored to use shared `lib/backup.mjs` (no logic duplication)
- `hooks.json`: added PreToolUse event with auto-backup

## [1.3.0] - 2026-04-03 (v2.0 S3: Posture + Feature Gap Commands)

### Added

- **Scoring module** `scanners/lib/scoring.mjs` — utilization, maturity (5 levels), segments, area scoring, scorecard generation
- **Posture CLI** `scanners/posture.mjs` — standalone Node.js tool: `node posture.mjs [--json] [--global]`
- **2 new commands**:
  - `/config-audit:posture` — quick scorecard with A-F grades, utilization %, maturity level
  - `/config-audit:feature-gap` — deep gap analysis with prioritized next-best-actions
- **feature-gap-agent** — Opus agent for deep analysis, report generation (max 200 lines)
- **Knowledge file** `gap-closure-templates.md` — 11 templates with effort/gain estimates
- **HTML report template** `templates/feature-gap-report.html` — visual report with progress bars, grade badges
- 64 new tests (total 268 across 14 test files)

### Changed

- Tier weighting: T1 gaps count 3x, T2 count 2x, T3/T4 count 1x in utilization score
- Maturity is threshold-based: highest level where ALL requirements are met

## [1.2.0] - 2026-04-03 (v2.0 S2: Advanced Scanners + Knowledge Base)

### Added

- **4 advanced scanners** (zero external deps):
  - `mcp-config-validator.mjs` (MCP) — server types, trust levels, env vars, unknown fields
  - `import-resolver.mjs` (IMP) — broken @imports, circular refs, deep chains, tilde paths
  - `conflict-detector.mjs` (CNF) — settings conflicts, permission contradictions, hook duplicates
  - `feature-gap-scanner.mjs` (GAP) — 25 feature gaps across 4 tiers (Foundation/Depth/Advanced/Enterprise)
- **Knowledge base** — 5 reference documents: capabilities, best practices, anti-patterns, hook events, feature evolution
- **New test fixtures** — `.mcp.json` files, @import chains, `conflict-project/` fixture
- 75 new tests (total 204 across 12 test files)

### Changed

- Scan orchestrator runs 8 scanners (was 4)
- Analyzer agent cross-references scanner findings with knowledge base

## [1.1.0] - 2026-04-03 (v2.0 S1: Scanner Foundation)

### Added

- **Deterministic scanner infrastructure** — 4 Node.js scanners (zero external deps):
  - `claude-md-linter.mjs` (CML) — CLAUDE.md structure, length, sections, @imports, duplicates
  - `settings-validator.mjs` (SET) — settings.json schema, unknown/deprecated keys, type checks
  - `hook-validator.mjs` (HKV) — hooks.json format, script existence, event validity, timeouts
  - `rules-validator.mjs` (RUL) — .claude/rules/ glob matching, orphan detection, deprecated fields
- **Scanner lib** — 5 shared modules: severity, output, file-discovery, yaml-parser, string-utils
- **Scan orchestrator** — `scan-orchestrator.mjs` runs all scanners, outputs JSON envelope
- **Test infrastructure** — 129 tests across 8 test files using node:test (zero deps)
- **Test fixtures** — 4 fixture projects (healthy, broken, empty, minimal)
- Finding ID format: `CA-{SCANNER}-{NNN}` (e.g.
`CA-CML-001`)

### Fixed

- Agent model mismatches: scanner→haiku, analyzer→sonnet, planner→opus, implementer→sonnet, verifier→haiku

### Changed

- CLAUDE.md rewritten in English for public release readiness

## [1.0.0] - 2026-02-11

### Added

- Cross-platform support (macOS, Linux, Windows)

### Fixed

- `stop-session-reminder.mjs`: Use `path.basename`/`path.dirname` instead of hardcoded `/` split
- `backup-before-change.mjs`: Handle both `/` and `\` path separators in safe filename generation

### Removed

- "Windows: hooks are 100% bash" from known gaps (was incorrect — all hooks are Node.js)

## [0.7.0] - 2026-02-07

### Note

Version reset from 1.2.0 to reflect actual maturity. Previous version was inflated — this plugin has never been externally tested.

### What exists today

- 6 specialized agents (scanner, analyzer, interviewer, planner, implementer, verifier)
- Full machine-wide Claude Code configuration discovery
- Scope selection (current project, repo, home, full machine)
- Inheritance hierarchy mapping and conflict detection
- Mandatory backups before any changes
- Rollback support
- Syntax validation for all configuration files
- Quick audit-only mode
- Full optimization workflow with HITL checkpoints

### Known gaps

- Testing: no automated tests
- Onboarding: never verified that a new user can install and use from scratch
- External verification: nobody else has ever used this