docs(config-audit): v5 implementation log — Session 4 rc.1 result

2026-05-01 09:19:04 +02:00 · 2026-05-01 09:19:04 +02:00 · 17af3d55f6
commit 17af3d55f6
parent 1ce26fea41
1 changed file with 40 additions and 2 deletions
--- a/plugins/config-audit/docs/v5-implementation-log.md
+++ b/plugins/config-audit/docs/v5-implementation-log.md
@@ -136,9 +136,47 @@ Written at the end of each session. State for the next session lives in

 ---

-## Session 4 — rc.1 (TBD)
+## Session 4 — rc.1 (2026-05-01)

-*Steps 24-27.*
+**Goal:** ship `v5.0.0-rc.1`: knowledge cleanup and tokenizer calibration (Steps 24-27).
+
+### Steps
+
+- **Step 24 — M8 knowledge cleanup.** Replaced "Keep CLAUDE.md under 200 lines" with cache-stability guidance (first 30 lines stable, volatile content below the cache threshold). Added a footnote explaining that the 200-line rule was a Sonnet-era adherence heuristic. Verified: `grep -q "Keep under 200 lines"` finds no match. Commit: `e1e23ed` `docs(config-audit): knowledge rensing — Opus 4.7 cache-stability guidance (v5 M8)`.
+
+- **Step 25 — M7 cache-telemetry recipe.**
+  - New `knowledge/cache-telemetry-recipe.md` — copy-paste `jq` recipe that sums `cache_read_input_tokens` and `cache_creation_input_tokens` per turn from `~/.claude/projects/<slug>/*.jsonl`. Hit-rate interpretation table, per-turn breakdown for spotting regression turns, design-rationale note explaining why this is a recipe and not a scanner.
+  - `--with-telemetry-recipe` flag on `token-hotspots-cli.mjs`. When present, the CLI emits a `telemetry_recipe_path` field in its JSON output. Without the flag, output is unchanged: the recipe file is committed as a default deliverable, and surfacing its path is opt-in at invocation.
+  - `commands/tokens.md` updated: flag documented in Step 1 args, surfaced in next-steps as the cache-verification path after a structural fix.
+  - Tests (×3): negative test (flag absent → field absent), positive test (flag present → string ending in `cache-telemetry-recipe.md`), existing 2 tests still pass. 625 → 628 tests.
+  - Commit: `df6e012` `docs(config-audit): cache-telemetry recipe + --with-telemetry-recipe flag (v5 M7)`.
+
+- **Step 26 — N5 `--accurate-tokens` API calibration.**
+  - New `scanners/lib/tokenizer-api.mjs`: `callCountTokensApi(text, apiKey, options)` wraps Anthropic's `count_tokens` endpoint. Required headers (`x-api-key`, `anthropic-version: 2023-06-01`, `content-type`). 5-second AbortController timeout. Exponential backoff on HTTP 429 (max 3 retries: 1s, 2s, 4s — base configurable for tests). Non-429 HTTP errors throw `count_tokens API failed (key sk-ant-X...): HTTP <status>` with the body deliberately omitted to avoid echo-leak. Network/abort errors masked similarly. `maskKey()` exported as a utility.
+  - `--accurate-tokens` flag on `token-hotspots-cli.mjs`. When `ANTHROPIC_API_KEY` is present, calls the API for the top 3 hotspots and populates `output.calibration = { actual_tokens, source: 'count_tokens_api', sampled_hotspots: 3 }`. When absent, `calibration = { skipped: 'no-api-key' }` plus stderr warning. On API error, `calibration = { skipped: 'api-error', error: <masked-message> }`.
+  - **Mocking pattern correction:** v5-plan specified `mock.method(tokenizerApi, 'callCountTokensApi', ...)` but ESM read-only export bindings reject property redefinition (`TypeError: Cannot redefine property: callCountTokensApi`). Switched to mocking `globalThis.fetch` instead — equivalent coverage at the actual external-dependency boundary. Documented in CHANGELOG Notes and the test-file comment.
+  - Tests (×8): 2× CLI subprocess (no-key skip + flag absence), 6× tokenizer-api unit (key-masking on network error, body-leak protection on 401, AbortController signal threaded, 429 retry with mocked fetch, headers asserted, happy-path fetch mock).
+  - Test count: 628 → 635 (+7 net). The "absent-flag" test listed above was already added and counted in Step 25, so only 7 of the 8 tests are new in Step 26.
+  - Commit: `b741430` `feat(config-audit): --accurate-tokens API calibration (v5 N5) [skip-docs]`.
+
+- **Step 27 — rc.1 wrap.** Added `## [5.0.0-rc.1]` entry to `CHANGELOG.md` with Summary / Added / Changed / Tests / Notes. Documented the SC-6b release-gate carve-out (manual verification before tagging) and the `mock.method` → `fetch` mocking pivot. Commit: `1ce26fe` `docs(config-audit): CHANGELOG 5.0.0-rc.1 entry`.
+
+### Result
+
+- 4 steps shipped, all green. Pushed to Forgejo `main` (authorized).
+- Test count: 625 → 635 (+10).
+- New files: `knowledge/cache-telemetry-recipe.md`, `scanners/lib/tokenizer-api.mjs`, `tests/scanners/accurate-tokens.test.mjs`.
+- Modified: `knowledge/configuration-best-practices.md`, `scanners/token-hotspots-cli.mjs`, `commands/tokens.md`, `tests/scanners/token-hotspots-cli.test.mjs`, `CHANGELOG.md`.
+- Untouched (scope fence): `README.md`, `CLAUDE.md`, `.claude-plugin/plugin.json` — all wait for Session 5.
+
+### Observations carried into Session 5
+
+- **SC-6b release gate is open.** Before tagging `v5.0.0`, KTG must run `--accurate-tokens` against a known fixture with a real `ANTHROPIC_API_KEY`, manually compare `calibration.actual_tokens` against the byte-estimated value for that fixture, and confirm the error is within ±5%. If the error falls outside ±5%, the heuristic in `estimateTokens` must be re-tuned before tagging.
+- **`mock.method` for ESM modules is a known footgun** — record this in REMEMBER for future scanners that try to stub library exports. Use `globalThis.fetch` mocking, dependency-injection seams, or `vi.mock`-style loaders if needed; do NOT rely on `mock.method` against ESM module namespaces.
+- **`--check-readme` will still fail in beta state.** Self-audit's badge mismatch report (scanners 12 vs 9, tests now 31 vs 543) is by-design until Step 28's straggler sweep aligns README/CLAUDE.md with filesystem truth. Posture-test still expects 10 areas (unchanged in this session).
+- **`fetch` global confirmed working** on Node 25.8.2 (KTG's machine). No fallback to `node:https` needed.
+
+**No blockers carried into Session 5.**

 ---
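
Editor's sketch (not part of the commit): the Step 26 behaviour described in the log (masked keys, a 5-second `AbortController` timeout, exponential backoff on HTTP 429) could look roughly like the following. This is a minimal illustration under stated assumptions, not the contents of `scanners/lib/tokenizer-api.mjs`. The function names and the error-message shape come from the log. The request body shape, the `model` option with its placeholder default, the 8-character mask prefix, and the `timeoutMs`, `maxRetries` and `baseDelayMs` option names are all assumptions made for illustration.

```javascript
// Sketch only: names mirror the log, internals are assumptions.
const ENDPOINT = "https://api.anthropic.com/v1/messages/count_tokens";

// Show only a short prefix of the key so error messages never echo the secret.
// The 8-character prefix length is an assumption.
function maskKey(apiKey) {
  if (typeof apiKey !== "string" || apiKey.length === 0) return "(none)";
  return `${apiKey.slice(0, 8)}...`;
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function callCountTokensApi(text, apiKey, options = {}) {
  const {
    model = "MODEL_ID_PLACEHOLDER", // hypothetical default; pass a real model id
    timeoutMs = 5000,               // 5-second timeout, as in the log
    maxRetries = 3,                 // retries on HTTP 429 only
    baseDelayMs = 1000,             // 1s, 2s, 4s; lower this in tests
  } = options;

  for (let attempt = 0; ; attempt++) {
    const controller = new AbortController();
    const timer = setTimeout(() => controller.abort(), timeoutMs);
    let response;
    try {
      // Calling through globalThis.fetch keeps the external boundary stubbable.
      response = await globalThis.fetch(ENDPOINT, {
        method: "POST",
        headers: {
          "x-api-key": apiKey,
          "anthropic-version": "2023-06-01",
          "content-type": "application/json",
        },
        body: JSON.stringify({ model, messages: [{ role: "user", content: text }] }),
        signal: controller.signal,
      });
    } catch {
      // Drop the underlying error so nothing sensitive is echoed back.
      throw new Error(
        `count_tokens API failed (key ${maskKey(apiKey)}): network or timeout error`
      );
    } finally {
      clearTimeout(timer);
    }

    if (response.status === 429 && attempt < maxRetries) {
      await sleep(baseDelayMs * 2 ** attempt); // exponential backoff
      continue;
    }
    if (!response.ok) {
      // The response body is deliberately never read on failure.
      throw new Error(
        `count_tokens API failed (key ${maskKey(apiKey)}): HTTP ${response.status}`
      );
    }
    const data = await response.json();
    return data.input_tokens;
  }
}
```

Because the sketch reaches the network only through `globalThis.fetch`, a test can replace that global with a stub returning canned `{ status, ok, json }` objects and restore it afterwards. That is the mocking seam the log settles on after `mock.method` failed against the ESM export binding.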