diff --git a/plugins/ai-psychosis/CLAUDE.md b/plugins/ai-psychosis/CLAUDE.md index 3457930..73fef54 100644 --- a/plugins/ai-psychosis/CLAUDE.md +++ b/plugins/ai-psychosis/CLAUDE.md @@ -24,6 +24,7 @@ Four layers, each building on the previous: | `hooks/hooks.json` | Hook event registration (4 events) | | `skills/ai-psychosis/SKILL.md` | Layer 1 behavioral instructions | | `commands/interaction-report.md` | Layer 3 slash command: `/interaction-report [weekly\|monthly\|all]` | +| `hooks/scripts/report-reader.mjs` | Layer 3 helper: reads sessions.jsonl with v1.0.0 backward compat | Legacy bash scripts were removed in v1.0 (available in git history). @@ -72,11 +73,14 @@ node --test tests/*.test.mjs | File | Cases | Coverage | |------|-------|----------| -| `tests/session-start.test.mjs` | 4 | State init, JSONL, missing sid | -| `tests/prompt-analyzer.test.mjs` | 56 | 25 patterns × 2 + 6 thresholds | +| `tests/session-start.test.mjs` | 5 | State init, JSONL, missing sid | +| `tests/prompt-analyzer.test.mjs` | 93 | 41 (25 negative + 12 pushback + 4 domain) patterns × 2 + thresholds + valence | | `tests/tool-tracker.test.mjs` | 8 | Counting, burst, reminders | -| `tests/session-end.test.mjs` | 4 | Finalize, duration, flags | -| `tests/privacy.test.mjs` | 1 | Canary string never on disk | +| `tests/session-end.test.mjs` | 6 | Finalize, duration, flags, v1.0.0 backward compat | +| `tests/privacy.test.mjs` | 2 | Canary string + matched-phrase leak guard | +| `tests/skill-md.test.mjs` | 1 | SKILL.md cites Constitution + 5-publication framework | +| `tests/perf.test.mjs` | 8 | 4 hooks × 2 modes — enforces <50ms logic / <200ms total | +| `tests/interaction-report.test.mjs` | 3 | report-reader.mjs reads JSONL with v1.0.0 backward compat | ## Conventions diff --git a/plugins/ai-psychosis/README.md b/plugins/ai-psychosis/README.md index aa04b81..4d9b392 100644 --- a/plugins/ai-psychosis/README.md +++ b/plugins/ai-psychosis/README.md @@ -118,6 +118,76 @@ commented on, and omitted entirely when conditions are not met. **Enable:** Set `layer4: true` in `.claude/ai-psychosis.local.md` and restart Claude Code. Layer 4 is opt-in (off by default). +## What's new in v1.1.0 + +v1.1.0 sharpens the pattern detection and grounds Layer 1 in +[Anthropic's CC0 Constitution](https://www.anthropic.com/constitution). + +### 12 pushback patterns + +Detects "you're wrong, my way is right" signals — escalation against +feedback rather than the user receiving it. Examples: + +- `\b(you'?re|you are) wrong\b` +- `\bdo it my way\b` +- `\b(stop|quit) (arguing|pushing back)\b` + +The goal is to flag reinforcement-by-pushback: the user repeatedly +overrides Claude's pushback to entrench their original position. + +### 4 domain-context patterns + +Flags relational/identity framing that, combined with elevated +pushback or validation-seeking, signals narrative crystallization +risk: + +- `\b(my|our) relationship\b` +- `\b(my|our) (purpose|mission|destiny)\b` + +Domain context alone is not a flag — it is a *modifier* on other +flags. + +### Valence-aware composition (silent counting) + +Pushback within the same prompt as a healthy correction ("you were +wrong, here's why — but we should still try X") is counted with +neutral valence. The composition is computed in-memory; nothing +written to disk distinguishes positive from negative pushback. This +prevents misinterpretation of healthy disagreement as escalation. + +### /interaction-report extensions + +`/interaction-report` now includes pushback frequency and domain +framing distribution. A companion script `report-reader.mjs` +reads JSONL records and gracefully handles legacy v1.0.0 records +(missing `pushback` / `domain_context` fields) without producing +NaN values in aggregates. + +### SKILL.md grounded in CC0 Constitution + +Layer 1's behavioral instructions now cite Anthropic's +[CC0-licensed Constitution](https://www.anthropic.com/constitution) +as primary source, plus a 5-publication research framework +(Anthropic, MIT CSAIL, Nature, arXiv, clinical case reports). + +### Honesty notes + +- **English-only v1.1.0** — Norwegian and other multilingual + patterns are deferred to v1.2 (see `ROADMAP.md`). For Norwegian + prompts, Layer 2 currently silently misses the new pattern + classes; Layer 1 is unaffected. +- **First-mover honesty** — domain-precision is "good enough" for + v1.1.0 ship, not exhaustive. Precision-tuning planned for v1.2. + +### Pattern count + +| Category | v1.0.0 | v1.1.0 | +|----------|--------|--------| +| Negative-valence | 25 | 25 | +| Pushback | — | 12 | +| Domain context | — | 4 | +| **Total** | **25** | **41** | + ## Architecture ```