chore(ai-psychosis): release v1.2.0

Kjell Tore Guttormsen 2026-05-01 21:59:40 +02:00
commit 339abc521e
4 changed files with 176 additions and 12 deletions

@@ -2,6 +2,72 @@
All notable changes to this project will be documented in this file.
## [1.2.0] — 2026-05-01
Research-paper-driven detector update. Implements operational findings from
Anthropic's "How people ask Claude for guidance" Appendix (April 2026).
### Added
- **User-information detector** — three-class signal (`yes_people` /
`yes_digital` / `no`) following the paper's page-11 finding that human
contact is the strongest disempowerment signal. ~32 patterns covering
therapist/friend/mentor (yes_people), search/AI/forums (yes_digital),
and explicit isolation phrases (no). Sticky upward priority.
- **Validation-seeking detector** — separate from `val_flags`. Targets
reality-testing ("am I crazy?"), pre-committed stance + confirmation,
and side-taking pressing. ~12 patterns.
- **Tier-1 user-info isolation alert** — fires per session when
`user_info_class === 'no'` + high-stakes domain + `turn_count >= 15`.
- **Tier-2 cross-session isolation alert** — fires at `SessionStart` when
the last 3 end records all classify as `no` in high-stakes domains.
Bounded `readRecentEndRecords()` tail-scan in `lib.mjs` keeps this
scalable to 50K+ session histories.
- **8 new paper-grounded domain patterns**: `legal`, `parenting`, `health`,
`financial`, `professional`, `spirituality`, `consumer`, `personal_dev`.
Total domains 4 → 9.
- **Pushback re-contextualization (alert)** — v1.1.0 only counted; v1.2 adds
the alert with domain awareness:
- Relationship/spirituality: pushback signals validation-pressing — alert.
- Legal/parenting/health/financial/professional: pushback is healthy
self-advocacy — no alert.
- Otherwise: conservative default — alert.
- **Domain-stakes weighting matrix**: `DOMAIN_STAKES` in `lib.mjs` (1.0–1.5).
Applied ONLY to new v1.2 alerts (pushback in HIGH_SYCOPHANCY, valseek in
HIGH_STAKES). v1.1.0 alert sensitivity is preserved.
- **Multi-domain support**: `state.domain_context` promoted from string to
array. v1.1.0 string records continue to aggregate correctly via
shape-coercion in `report-reader.mjs`.
- **`SKILL.md` updates** — verbatim Score 5 sycophancy phrase + 3 of the 11
guidance criteria (engagement-foster avoidance, confident-verdict caution,
speak-frankly principle).
- **`/interaction-report` v1.2 sections** — per-domain breakdown, user-info
distribution, valseek summary, stakes signal aggregation. Backward
compatibility with v1.0/v1.1 records is preserved.
- **Privacy canary extensions** — 5 new canary cases per detector category
(yes_people, yes_digital, no, valseek, legal domain).
- **Perf budget validated at v1.2 pattern set** — sample patterns expanded
to ~91+ entries; new wall-clock test exercises tier-2 read at
1000-record sessions.jsonl scale.
- **Test count: 126 → 258 cases** across 12 files (added `lib.test.mjs`,
`domain-detection.test.mjs`, `user-info.test.mjs`,
`validation-seeking.test.mjs`, `stakes-matrix.test.mjs`).
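
The Tier-1 isolation alert and the pushback rule listed above can be sketched as two small predicates. This is a minimal illustration under stated assumptions, not the code in `lib.mjs`: the function names, the set names `SYCOPHANCY_PRONE` and `SELF_ADVOCACY`, the membership of `HIGH_STAKES`, and the precedence rule for sessions spanning several domains are all hypothetical. Only `user_info_class`, `turn_count`, `domain_context`, the domain names, and the threshold of 15 come from the entries above.

```javascript
// Hypothetical sketch of two v1.2 alert rules. Names and set membership are
// assumptions for illustration; they are not taken from lib.mjs.

// Domains where pushback is read as validation-pressing (per the changelog).
const SYCOPHANCY_PRONE = new Set(['relationship', 'spirituality']);

// Domains where pushback is read as healthy self-advocacy (per the changelog).
const SELF_ADVOCACY = new Set(['legal', 'parenting', 'health', 'financial', 'professional']);

// Assumption: the changelog does not enumerate the high-stakes domains,
// so this sketch reuses the self-advocacy list.
const HIGH_STAKES = SELF_ADVOCACY;

// Pushback alert: alert for sycophancy-prone domains, stay silent for
// self-advocacy domains, and alert by default for everything else.
// Assumption: when both kinds of domain are present, the alert wins.
function shouldAlertOnPushback(domains) {
  if (domains.some((d) => SYCOPHANCY_PRONE.has(d))) return true;
  if (domains.some((d) => SELF_ADVOCACY.has(d))) return false;
  return true; // conservative default
}

// Tier-1 isolation alert: user_info_class is 'no', at least one domain is
// high-stakes, and the session has reached 15 or more turns.
function shouldFireTier1IsolationAlert(state) {
  const inHighStakesDomain = state.domain_context.some((d) => HIGH_STAKES.has(d));
  return state.user_info_class === 'no' && inHighStakesDomain && state.turn_count >= 15;
}
```

Keeping each rule as a pure function of the session state makes the thresholds easy to cover with table-driven tests, which fits the test growth noted in this release.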
### Changed
- Pattern count: 41 → ~133 (25 negative + 12 pushback + 4 relationship
+ 48 new domains + 32 user-info + 12 valseek).
- End-record schema (v1.2): adds `user_info_class`, `valseek_count`,
`turn_count`. `domain_context` is always an array (was string in v1.1).
- `report-reader.mjs` discriminates v1.0 / v1.1 / v1.2 records via the
presence of `user_info_class`. v1.0/v1.1 records degrade gracefully.
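
The schema change above implies that readers must accept both the v1.1 string form and the v1.2 array form of `domain_context`. The following is a minimal sketch of that shape coercion and of version detection by field presence. It is not the code in `report-reader.mjs`: the helper names `isV12Record`, `normalizeDomainContext`, and `countByDomain` are hypothetical, and treating a missing or empty value as an empty array is an assumption.

```javascript
// Hypothetical sketch of record normalization; not the code in report-reader.mjs.

// v1.2 records are recognized by the presence of user_info_class.
function isV12Record(record) {
  return Object.prototype.hasOwnProperty.call(record, 'user_info_class');
}

// Coerce domain_context to an array: v1.1 stored a string, v1.2 stores an array.
// Assumption: a missing or empty value becomes an empty array.
function normalizeDomainContext(value) {
  if (Array.isArray(value)) return value;
  if (typeof value === 'string' && value.length > 0) return [value];
  return [];
}

// Count sessions per domain across records of mixed versions.
// A Map avoids collisions with inherited object property names.
function countByDomain(records) {
  const counts = new Map();
  for (const record of records) {
    for (const domain of normalizeDomainContext(record.domain_context)) {
      counts.set(domain, (counts.get(domain) || 0) + 1);
    }
  }
  return counts;
}
```

Normalizing at read time means older records stay untouched on disk, while aggregation code only ever sees arrays.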
### Deferred
- **Norwegian patterns** — moved to v1.3.
[1.2.0]: https://git.fromaitochitta.com/open/ai-psychosis/compare/v1.1.0...v1.2.0
## [1.1.0] — 2026-05-01
### Added