diff --git a/plugins/config-audit/commands/tokens.md b/plugins/config-audit/commands/tokens.md
index 92e1342..54df519 100644
--- a/plugins/config-audit/commands/tokens.md
+++ b/plugins/config-audit/commands/tokens.md
@@ -29,6 +29,7 @@ Split `$ARGUMENTS` into a path and flags. Path is the first non-flag argument. D
 
 - `--global` — also include the user-level `~/.claude/` cascade
 - `--json` — emit raw JSON instead of rendered tables (power-user mode)
+- `--with-telemetry-recipe` — include `telemetry_recipe_path` in the JSON output, pointing to `knowledge/cache-telemetry-recipe.md`. Use this when you want to verify a structural fix actually improved cache hit rate (manual jq recipe, opt-in)
 
 ### Step 2: Run the CLI silently
 
@@ -105,6 +106,7 @@ rm -f "$TMPFILE"
 - **`/config-audit posture`** — overall health scorecard (Token Efficiency is the 8th area)
 - **`/config-audit fix`** — auto-fix deterministic issues (where applicable)
 - See `knowledge/opus-4.7-patterns.md` for the full pattern catalogue (CA-TOK-001 … 003)
+- **Verify cache hit rate after a fix:** rerun with `--with-telemetry-recipe` to surface the path to `knowledge/cache-telemetry-recipe.md` — a copy-paste `jq` recipe that reads cache hit rate from your session transcripts. Opt-in. The TOK scanner is structural; this recipe is the runtime escape hatch.
 ```
 
 ## Scope and limits
diff --git a/plugins/config-audit/knowledge/cache-telemetry-recipe.md b/plugins/config-audit/knowledge/cache-telemetry-recipe.md
new file mode 100644
index 0000000..a9d767f
--- /dev/null
+++ b/plugins/config-audit/knowledge/cache-telemetry-recipe.md
@@ -0,0 +1,114 @@
+# Cache Telemetry Recipe
+
+> Manual recipe for verifying prompt-cache hit rate from Claude Code session
+> transcripts. Opt-in. The TOK scanner is structural — it estimates token cost
+> from disk content but never reads runtime telemetry. This recipe closes that
+> gap when you need to confirm a structural fix actually improved cache reuse.
+>
+> Last verified 2026-05-01 against Claude Code transcript schema.
+
+## Synopsis
+
+Each turn in a Claude Code session is logged as a JSONL entry under
+`~/.claude/projects//`. Anthropic's API response includes
+`cache_read_input_tokens` and `cache_creation_input_tokens` per turn, and Claude
+Code persists these in the transcript. Summing them gives a per-session cache
+hit rate without needing the API key or any external service.
+
+A high cache-read share (≥ 70%) means structural fixes are working. A low share
+(< 30%) means something at the top of the prompt is changing per turn —
+typically a CLAUDE.md timestamp, a rolling counter, or a deep `@import`
+boundary. Cross-reference with `/config-audit tokens` to find the culprit.
+
+## Recipe
+
+### 1. Locate the transcript
+
+```bash
+# Newest transcript for the current project
+PROJECT_SLUG=$(pwd | sed 's|/|-|g')
+TRANSCRIPT=$(ls -t ~/.claude/projects/${PROJECT_SLUG}/*.jsonl 2>/dev/null | head -1)
+echo "Transcript: $TRANSCRIPT"
+```
+
+If no transcript exists, run a few turns in Claude Code first.
+
+### 2. Sum cache tokens per turn
+
+```bash
+# Requires jq. Sums cache_read and cache_creation across all turns.
+jq -s '
+  [.[] | select(.type == "assistant" and .message.usage)]
+  | {
+      turns: length,
+      cache_read: ([.[] | .message.usage.cache_read_input_tokens // 0] | add),
+      cache_creation: ([.[] | .message.usage.cache_creation_input_tokens // 0] | add),
+      input_no_cache: ([.[] | .message.usage.input_tokens // 0] | add)
+    }
+  | . + {
+      total_input: (.cache_read + .cache_creation + .input_no_cache),
+      hit_rate: (if (.cache_read + .cache_creation + .input_no_cache) > 0
+                 then (.cache_read / (.cache_read + .cache_creation + .input_no_cache))
+                 else 0 end)
+    }
+' "$TRANSCRIPT"
+```
+
+Example output:
+
+```json
+{
+  "turns": 18,
+  "cache_read": 458320,
+  "cache_creation": 12440,
+  "input_no_cache": 5120,
+  "total_input": 475880,
+  "hit_rate": 0.9631
+}
+```
+
+### 3. Interpret
+
+| Hit rate | Reading |
+|----------|---------|
+| ≥ 0.85 | Cache structure healthy. Structural fixes are paying off. |
+| 0.50–0.85 | Cache works but something near the prefix is shifting. Inspect first 30 lines of CLAUDE.md and any `@import`-ed file. |
+| 0.20–0.50 | Cache is being broken most turns. Likely a volatile CLAUDE.md top-of-file (timestamp, session id, rolling activity log) or a `defaultMode` flip. Run `/config-audit tokens` to locate. |
+| < 0.20 | Cache is essentially disabled. Either the prefix is rewritten every turn, or the session is so short caching never warmed up. |
+
+### 4. Per-turn breakdown (for spotting the regression turn)
+
+```bash
+jq -c '
+  select(.type == "assistant" and .message.usage)
+  | {
+      ts: .timestamp,
+      cache_read: (.message.usage.cache_read_input_tokens // 0),
+      cache_creation: (.message.usage.cache_creation_input_tokens // 0)
+    }
+' "$TRANSCRIPT" | head -20
+```
+
+Look for turns where `cache_read` drops sharply and `cache_creation` spikes —
+that's a cache invalidation event. Whatever changed in CLAUDE.md, settings.json,
+or the active `@import` chain at that moment is the cause.
+
+## Why this is a recipe, not a scanner
+
+Parsing transcripts as a core scanner feature was rejected during v5 planning:
+
+1. Transcripts are user-private session data. Bundling parsing logic implies
+   the plugin reads transcripts by default, which crosses a privacy boundary.
+2. Transcript schema is undocumented and may change without notice. A scanner
+   would silently drift.
+3. The recipe form (jq one-liner) is auditable in 30 seconds. A bundled parser
+   is not.
+
+Surface area stays read-only and structural. This file is the escape hatch
+when structural signal alone isn't enough.
+
+## See also
+
+- `knowledge/opus-4.7-patterns.md` — structural patterns the TOK scanner detects (CA-TOK-001..005)
+- `knowledge/configuration-best-practices.md` — CLAUDE.md cache-stability guidance
+- `/config-audit tokens --with-telemetry-recipe` — surfaces a pointer to this file in JSON output
diff --git a/plugins/config-audit/scanners/token-hotspots-cli.mjs b/plugins/config-audit/scanners/token-hotspots-cli.mjs
index 4b75106..31028ce 100755
--- a/plugins/config-audit/scanners/token-hotspots-cli.mjs
+++ b/plugins/config-audit/scanners/token-hotspots-cli.mjs
@@ -6,27 +6,34 @@
  *
  * Usage:
  *   node token-hotspots-cli.mjs [path] [--json] [--output-file ] [--global]
+ *                               [--with-telemetry-recipe]
  *
  * Exit codes: 0=ok, 3=unrecoverable error.
  * Zero external dependencies.
  */
 
-import { resolve } from 'node:path';
+import { resolve, dirname, join } from 'node:path';
+import { fileURLToPath } from 'node:url';
 import { writeFile, stat } from 'node:fs/promises';
 import { discoverConfigFiles } from './lib/file-discovery.mjs';
 import { resetCounter } from './lib/output.mjs';
 import { scan } from './token-hotspots.mjs';
 
+const __dirname = dirname(fileURLToPath(import.meta.url));
+const TELEMETRY_RECIPE_PATH = resolve(__dirname, '..', 'knowledge', 'cache-telemetry-recipe.md');
+
 async function main() {
   const args = process.argv.slice(2);
   let targetPath = '.';
   let outputFile = null;
   let jsonMode = false;
   let includeGlobal = false;
+  let withTelemetryRecipe = false;
 
   for (let i = 0; i < args.length; i++) {
     if (args[i] === '--json') jsonMode = true;
     else if (args[i] === '--global') includeGlobal = true;
+    else if (args[i] === '--with-telemetry-recipe') withTelemetryRecipe = true;
     else if (args[i] === '--output-file' && args[i + 1]) outputFile = args[++i];
     else if (!args[i].startsWith('-')) targetPath = args[i];
   }
@@ -58,6 +65,10 @@ async function main() {
     counts: result.counts,
   };
 
+  if (withTelemetryRecipe) {
+    payload.telemetry_recipe_path = TELEMETRY_RECIPE_PATH;
+  }
+
   const json = JSON.stringify(payload, null, 2);
 
   if (outputFile) {
diff --git a/plugins/config-audit/tests/scanners/token-hotspots-cli.test.mjs b/plugins/config-audit/tests/scanners/token-hotspots-cli.test.mjs
index ef97a94..f6bfde5 100644
--- a/plugins/config-audit/tests/scanners/token-hotspots-cli.test.mjs
+++ b/plugins/config-audit/tests/scanners/token-hotspots-cli.test.mjs
@@ -42,6 +42,30 @@ describe('token-hotspots-cli', () => {
       await unlink(out).catch(() => {});
     }
   });
+
+  it('omits telemetry_recipe_path when --with-telemetry-recipe is absent', async () => {
+    const { stdout } = await exec('node', [CLI, FIXTURE, '--json'], {
+      timeout: 30000,
+      cwd: REPO,
+    });
+    const json = JSON.parse(stdout);
+    assert.equal(json.telemetry_recipe_path, undefined,
+      'telemetry_recipe_path must NOT appear without the flag');
+  });
+
+  it('includes telemetry_recipe_path when --with-telemetry-recipe is passed', async () => {
+    const { stdout } = await exec('node', [CLI, FIXTURE, '--json', '--with-telemetry-recipe'], {
+      timeout: 30000,
+      cwd: REPO,
+    });
+    const json = JSON.parse(stdout);
+    assert.equal(typeof json.telemetry_recipe_path, 'string');
+    assert.ok(json.telemetry_recipe_path.length > 0, 'expected non-empty path');
+    assert.ok(
+      json.telemetry_recipe_path.endsWith('cache-telemetry-recipe.md'),
+      `expected path to end with cache-telemetry-recipe.md, got ${json.telemetry_recipe_path}`
+    );
+  });
 });
 
 describe('scan-orchestrator integration — TOK hotspots survive envelope', () => {
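
Review note: for readers without `jq`, the summing step of the recipe added in `knowledge/cache-telemetry-recipe.md` could be approximated in plain Node. The sketch below is not part of this diff. `summarizeCacheUsage` is a hypothetical helper name, and the field names (`type`, `message.usage`, `cache_read_input_tokens`, `cache_creation_input_tokens`, `input_tokens`) are taken from the recipe, which itself notes that the transcript schema is undocumented and may change. One deliberate difference from the `jq -s` version: lines that fail to parse are skipped here rather than aborting the run.

```javascript
// Hypothetical helper, not part of the diff: mirrors the recipe's jq summary.
// Takes the raw text of a JSONL transcript and returns the same fields
// as the recipe's example output.
function summarizeCacheUsage(jsonlText) {
  const totals = { turns: 0, cache_read: 0, cache_creation: 0, input_no_cache: 0 };

  for (const line of jsonlText.split('\n')) {
    if (line.trim() === '') continue;

    let entry;
    try {
      entry = JSON.parse(line);
    } catch {
      continue; // assumption: tolerate malformed lines instead of failing
    }

    // Only assistant turns that carry a usage object are counted,
    // matching the recipe's select(...) filter.
    const usage = entry?.message?.usage;
    if (entry?.type !== 'assistant' || !usage) continue;

    totals.turns += 1;
    totals.cache_read += usage.cache_read_input_tokens ?? 0;
    totals.cache_creation += usage.cache_creation_input_tokens ?? 0;
    totals.input_no_cache += usage.input_tokens ?? 0;
  }

  const total_input = totals.cache_read + totals.cache_creation + totals.input_no_cache;
  const hit_rate = total_input > 0 ? totals.cache_read / total_input : 0;
  return { ...totals, total_input, hit_rate };
}
```

A caller would read the transcript file located in step 1 of the recipe as UTF-8 text and pass its contents to the function. Keeping this outside the plugin preserves the stated design choice that the scanner itself never reads transcripts.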