feat(ai-psychosis): /interaction-report adds pushback metrics + reader script

Kjell Tore Guttormsen 2026-05-01 17:41:30 +02:00
commit 3041c90115
3 changed files with 251 additions and 5 deletions

@@ -108,11 +108,15 @@ The file contains two record types interleaved:
{"session_id":"abc","start":"2026-04-05T10:00:00Z","hour":10,"is_late_night":false}
```
**End records** — have `end`, `duration_min`, `tool_count`, `edit_count`, `flags`:
**End records** — have `end`, `duration_min`, `tool_count`, `edit_count`, `flags`,
and (v1.1.0+) `domain_context` at top level plus `pushback` inside `flags`:
```json
{"session_id":"abc","start":"2026-04-05T10:00:00Z","end":"2026-04-05T11:35:00Z","duration_min":95,"tool_count":47,"edit_count":12,"flags":{"dependency":2,"escalation":0,"fatigue":1,"validation":1}}
{"session_id":"abc","start":"2026-04-05T10:00:00Z","end":"2026-04-05T11:35:00Z","duration_min":95,"tool_count":47,"edit_count":12,"domain_context":"relationship","flags":{"dependency":2,"escalation":0,"fatigue":1,"validation":1,"pushback":3}}
```
Records produced by v1.0.0 omit `domain_context` and `flags.pushback`.
Treat a missing `domain_context` as `null` and a missing `flags.pushback` as `0`, never as `NaN`.
**Error records** — have `note: "no_state_file"`. Ignore these.
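The record types above can be told apart and normalized with a few lines of JavaScript. This is a minimal sketch, not code from the plugin: `classifyRecord` and `normalizeEndRecord` are hypothetical names, and it assumes an error record is identified by its `note` field alone and an end record by the presence of `end`.

```javascript
// Hypothetical helpers for illustration; not part of the plugin.

// Returns "error", "end", or "start" for one parsed JSONL record.
function classifyRecord(rec) {
  if (rec.note === "no_state_file") return "error"; // ignored by the report
  if ("end" in rec) return "end";
  return "start";
}

// Fills in the fields that v1.0.0 end records omit, so later sums never hit NaN.
function normalizeEndRecord(rec) {
  const flags = rec.flags && typeof rec.flags === "object" ? rec.flags : {};
  const count = (v) => (Number.isFinite(v) ? v : 0);
  return {
    ...rec,
    domain_context: rec.domain_context ?? null, // missing domain -> null
    flags: {
      dependency: count(flags.dependency),
      escalation: count(flags.escalation),
      fatigue: count(flags.fatigue),
      validation: count(flags.validation),
      pushback: count(flags.pushback), // missing pushback -> 0
    },
  };
}

// A v1.0.0-style end record: no domain_context, no flags.pushback.
const legacy = JSON.parse(
  '{"session_id":"abc","end":"2026-04-05T11:35:00Z","flags":{"dependency":2}}'
);
console.log(classifyRecord(legacy), normalizeEndRecord(legacy));
```

Normalizing once at read time keeps every later aggregate free of `undefined + number` arithmetic, which is where `NaN` would otherwise creep in.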
### Filtering
@@ -131,13 +135,31 @@ Filter events where `ts` >= cutoff date string. Group by `tool_name` and count.
## Step 6 — Compute statistics
From **end records**:
For session-level aggregates, do NOT recompute totals in the LLM. Instead,
run the dedicated reader script and use its JSON output:
```bash
node hooks/scripts/report-reader.mjs ${CLAUDE_PLUGIN_DATA}/sessions.jsonl
```
The script outputs a JSON object with the following fields:
- `pushback_total` — sum of `flags.pushback` across all end records
- `relationship_domain_count` — count of records where `domain_context === 'relationship'`
- `null_domain_count`, `other_domain_count` — remaining domain buckets
- `total_end_records` — number of complete sessions
- `flags_total` — totals for dependency / escalation / fatigue / validation / pushback
- `schema_version.v1_0_records` / `v1_1_records` — backward-compat counters
Use these values directly. The reader handles backward-compatibility with
v1.0.0 records (missing `pushback` / `domain_context`) and never produces NaN.
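To make the output fields listed above concrete, here is a rough sketch of an aggregation that would produce an object of that shape. It is not the contents of `report-reader.mjs`, which is not shown here and may differ. The function name `summarize`, the rule for telling v1.0 from v1.1 records, and the choice to skip malformed lines are all assumptions.

```javascript
// Sketch only: illustrates the documented output shape, not the real script.
function summarize(jsonlText) {
  const summary = {
    total_end_records: 0,
    pushback_total: 0,
    relationship_domain_count: 0,
    null_domain_count: 0,
    other_domain_count: 0,
    flags_total: { dependency: 0, escalation: 0, fatigue: 0, validation: 0, pushback: 0 },
    schema_version: { v1_0_records: 0, v1_1_records: 0 },
  };

  for (const line of jsonlText.split("\n")) {
    if (!line.trim()) continue;
    let rec;
    try {
      rec = JSON.parse(line);
    } catch {
      continue; // assumption: malformed lines are skipped silently
    }
    if (rec === null || typeof rec !== "object") continue;
    // Only end records count; start and error records are skipped.
    if (rec.note === "no_state_file" || !("end" in rec)) continue;

    summary.total_end_records += 1;
    const flags = rec.flags && typeof rec.flags === "object" ? rec.flags : {};
    for (const key of Object.keys(summary.flags_total)) {
      summary.flags_total[key] += Number.isFinite(flags[key]) ? flags[key] : 0;
    }

    const domain = rec.domain_context ?? null;
    if (domain === "relationship") summary.relationship_domain_count += 1;
    else if (domain === null) summary.null_domain_count += 1;
    else summary.other_domain_count += 1;

    // Assumption: a record is v1.1 when either new field is present.
    const isV11 = "domain_context" in rec || "pushback" in flags;
    summary.schema_version[isV11 ? "v1_1_records" : "v1_0_records"] += 1;
  }

  summary.pushback_total = summary.flags_total.pushback;
  return summary;
}
```

Because every addend passes through `Number.isFinite`, a v1.0.0 record contributes `0` to `pushback_total` instead of turning the sum into `NaN`.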
In addition, derive these from the JSONL records you read in Step 4:
- Total sessions (count of end records in period)
- Average session duration (`sum(duration_min) / count`)
- Total tool calls (`sum(tool_count)`)
- Average edit ratio (`sum(edit_count) / sum(tool_count) * 100`, as percentage)
- Flag totals: `sum(flags.dependency)`, `sum(flags.escalation)`, `sum(flags.fatigue)`, `sum(flags.validation)`
- Average flags per session for each category
- Average flags per session per category (use `flags_total` from the reader,
divided by `total_end_records`)
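The derived values in the list above reduce to a handful of sums and guarded divisions. The following is a minimal sketch under stated assumptions: `deriveStats` is a hypothetical name, `endRecords` is the array of end records already filtered to the period, and `flagsTotal` is the `flags_total` object from the reader output. Returning `0` for an empty period is an assumption rather than a documented behavior.

```javascript
// Hypothetical helper; field names on the result object are illustrative.
function deriveStats(endRecords, flagsTotal) {
  const n = endRecords.length;
  // Sum one numeric field, treating missing or non-numeric values as 0.
  const sum = (field) =>
    endRecords.reduce((acc, r) => acc + (Number.isFinite(r[field]) ? r[field] : 0), 0);
  // Guarded division so an empty period yields 0 instead of NaN.
  const ratio = (num, den) => (den > 0 ? num / den : 0);

  const totalToolCalls = sum("tool_count");
  const avgFlagsPerSession = {};
  for (const [category, total] of Object.entries(flagsTotal)) {
    avgFlagsPerSession[category] = ratio(total, n);
  }

  return {
    total_sessions: n,
    avg_duration_min: ratio(sum("duration_min"), n),
    total_tool_calls: totalToolCalls,
    edit_ratio_pct: ratio(sum("edit_count") * 100, totalToolCalls),
    avg_flags_per_session: avgFlagsPerSession,
  };
}
```

Note that the edit ratio divides the two sums rather than averaging per-session ratios, matching the formula `sum(edit_count) / sum(tool_count) * 100` given above.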
From **start records**:
- Late-night sessions: count where `is_late_night` is true
@@ -185,6 +207,46 @@ Output the report as markdown. Use this exact structure:
| Fatigue signals | {N} | {avg} |
| Validation-seeking | {N} | {avg} |
### Pushback (protective signal)
| Metric | Value |
|--------|-------|
| Total pushback events | {N} |
| Per session | {avg} |
| Sessions with at least one pushback | {N} of {total} |
User pushback is reported as a *protective signal*, not a problem. Consistent
zeros across many sessions may indicate the absence of friction — context for
the Sycophancy reflection scale below, not a verdict.
### Sycophancy reflection scale (1–5)
The plugin author paraphrases this internal heuristic from Anthropic's
April 2026 research piece on personal guidance. It is not a verbatim metric
from any Anthropic publication.
| Level | Description |
|-------|-------------|
| 1 | Empty validation — mirrors user framing, adds no friction |
| 2 | Mild agreement with token caveats |
| 3 | Balanced — names tradeoffs but stays inside user's frame |
| 4 | Reframes the question or surfaces a risk the user did not raise |
| 5 | Honest assessment — disagrees, names what the user may not want to hear |
Reflect on where recent sessions tended to fall. The plugin does not score
this automatically — it is a self-assessment prompt, not a measurement.
### Domain context
| Domain | Sessions |
|--------|----------|
| Relationship-flavored | {relationship_domain_count} |
| Other / not classified | {null_domain_count + other_domain_count} |
Domain detection is heuristic and conservative. A "relationship" tag means
patterns associated with relational decision support appeared at least once
during the session, not that the entire session was about relationships.
### Tool Usage (top 10)
| Tool | Count | % |
@@ -209,6 +271,17 @@ Output the report as markdown. Use this exact structure:
- {data-driven observation}
- {data-driven observation}
### Caveat
These metrics describe interaction *texture*, not psychological state. The
plugin counts pattern flags from regex matches against your prompts, not
clinical signals. Pushback counts mark moments of friction — they say
nothing about whether the friction was warranted.
For empirical context on AI pushback and sycophancy, see Cheng et al.,
"Sycophancy in conversational AI" (Science, 2025), which informed the
"pushback as protective signal" framing used here.
```
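The Pushback table in the report template above needs three values, and "Sessions with at least one pushback" is not among the reader output fields listed in Step 6, so it presumably has to be derived from the end records directly. The sketch below shows one way to do that. `pushbackRows` is a hypothetical name, and the two-decimal formatting of the per-session average is an assumption.

```javascript
// Hypothetical helper: computes the three cells of the Pushback table.
function pushbackRows(endRecords) {
  // One pushback count per session; v1.0.0 records without the field count as 0.
  const counts = endRecords.map((r) => {
    const v = r.flags?.pushback;
    return Number.isFinite(v) ? v : 0;
  });
  const total = counts.reduce((a, b) => a + b, 0);
  const sessions = counts.length;
  const withPushback = counts.filter((c) => c > 0).length;

  return {
    total,
    // Assumption: averages are shown with two decimals.
    per_session: sessions > 0 ? (total / sessions).toFixed(2) : "0.00",
    sessions_with_pushback: `${withPushback} of ${sessions}`,
  };
}
```

A session with several pushback events still counts once in the last row, which is what distinguishes it from the total in the first row.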
## Step 8 — Tone and privacy rules