114 lines
4.3 KiB
Markdown
114 lines
4.3 KiB
Markdown
# Cache Telemetry Recipe
|
||
|
||
> Manual recipe for verifying prompt-cache hit rate from Claude Code session
|
||
> transcripts. Opt-in. The TOK scanner is structural — it estimates token cost
|
||
> from disk content but never reads runtime telemetry. This recipe closes that
|
||
> gap when you need to confirm a structural fix actually improved cache reuse.
|
||
>
|
||
> Last verified 2026-05-01 against Claude Code transcript schema.
|
||
|
||
## Synopsis
|
||
|
||
Each turn in a Claude Code session is logged as a JSONL entry under
|
||
`~/.claude/projects/<slug>/`. Anthropic's API response includes
|
||
`cache_read_input_tokens` and `cache_creation_input_tokens` per turn, and Claude
|
||
Code persists these in the transcript. Summing them gives a per-session cache
|
||
hit rate without needing the API key or any external service.
|
||
|
||
A high cache-read share (≥ 70%) means structural fixes are working. A low share
|
||
(< 30%) means something at the top of the prompt is changing per turn —
|
||
typically a CLAUDE.md timestamp, a rolling counter, or a deep `@import`
|
||
boundary. Cross-reference with `/config-audit tokens` to find the culprit.
|
||
|
||
## Recipe
|
||
|
||
### 1. Locate the transcript
|
||
|
||
```bash
|
||
# Newest transcript for the current project
|
||
PROJECT_SLUG=$(pwd | sed 's|/|-|g')
|
||
TRANSCRIPT=$(ls -t ~/.claude/projects/${PROJECT_SLUG}/*.jsonl 2>/dev/null | head -1)
|
||
echo "Transcript: $TRANSCRIPT"
|
||
```
|
||
|
||
If no transcript exists, run a few turns in Claude Code first.
|
||
|
||
### 2. Sum cache tokens per turn
|
||
|
||
```bash
|
||
# Requires jq. Sums cache_read and cache_creation across all turns.
|
||
jq -s '
|
||
[.[] | select(.type == "assistant" and .message.usage)]
|
||
| {
|
||
turns: length,
|
||
cache_read: ([.[] | .message.usage.cache_read_input_tokens // 0] | add),
|
||
cache_creation: ([.[] | .message.usage.cache_creation_input_tokens // 0] | add),
|
||
input_no_cache: ([.[] | .message.usage.input_tokens // 0] | add)
|
||
}
|
||
| . + {
|
||
total_input: (.cache_read + .cache_creation + .input_no_cache),
|
||
hit_rate: (if (.cache_read + .cache_creation + .input_no_cache) > 0
|
||
then (.cache_read / (.cache_read + .cache_creation + .input_no_cache))
|
||
else 0 end)
|
||
}
|
||
' "$TRANSCRIPT"
|
||
```
|
||
|
||
Example output:
|
||
|
||
```json
|
||
{
|
||
"turns": 18,
|
||
"cache_read": 458320,
|
||
"cache_creation": 12440,
|
||
"input_no_cache": 5120,
|
||
"total_input": 475880,
|
||
"hit_rate": 0.9631
|
||
}
|
||
```
|
||
|
||
### 3. Interpret
|
||
|
||
| Hit rate | Reading |
|
||
|----------|---------|
|
||
| ≥ 0.85 | Cache structure healthy. Structural fixes are paying off. |
|
||
| 0.50–0.85 | Cache works but something near the prefix is shifting. Inspect first 30 lines of CLAUDE.md and any `@import`-ed file. |
|
||
| 0.20–0.50 | Cache is being broken most turns. Likely a volatile CLAUDE.md top-of-file (timestamp, session id, rolling activity log) or a `defaultMode` flip. Run `/config-audit tokens` to locate. |
|
||
| < 0.20 | Cache is essentially disabled. Either the prefix is rewritten every turn, or the session is so short caching never warmed up. |
|
||
|
||
### 4. Per-turn breakdown (for spotting the regression turn)
|
||
|
||
```bash
|
||
jq -c '
|
||
select(.type == "assistant" and .message.usage)
|
||
| {
|
||
ts: .timestamp,
|
||
cache_read: (.message.usage.cache_read_input_tokens // 0),
|
||
cache_creation: (.message.usage.cache_creation_input_tokens // 0)
|
||
}
|
||
' "$TRANSCRIPT" | head -20
|
||
```
|
||
|
||
Look for turns where `cache_read` drops sharply and `cache_creation` spikes —
|
||
that's a cache invalidation event. Whatever changed in CLAUDE.md, settings.json,
|
||
or the active `@import` chain at that moment is the cause.
|
||
|
||
## Why this is a recipe, not a scanner
|
||
|
||
Parsing transcripts as a core scanner feature was rejected during v5 planning:
|
||
|
||
1. Transcripts are user-private session data. Bundling parsing logic implies
|
||
the plugin reads transcripts by default, which crosses a privacy boundary.
|
||
2. Transcript schema is undocumented and may change without notice. A scanner
|
||
would silently drift.
|
||
3. The recipe form (jq one-liner) is auditable in 30 seconds. A bundled parser
|
||
is not.
|
||
|
||
Surface area stays read-only and structural. This file is the escape hatch
|
||
when structural signal alone isn't enough.
|
||
|
||
## See also
|
||
|
||
- `knowledge/opus-4.7-patterns.md` — structural patterns the TOK scanner detects (CA-TOK-001..005)
|
||
- `knowledge/configuration-best-practices.md` — CLAUDE.md cache-stability guidance
|
||
- `/config-audit tokens --with-telemetry-recipe` — surfaces a pointer to this file in JSON output
|