docs(ai-psychosis): README + CLAUDE.md cover v1.1.0; ROADMAP.md tracks v1.2
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
parent 0392f1062e
commit 4d338d973e
2 changed files with 78 additions and 4 deletions
@@ -24,6 +24,7 @@ Four layers, each building on the previous:
| `hooks/hooks.json` | Hook event registration (4 events) |
| `skills/ai-psychosis/SKILL.md` | Layer 1 behavioral instructions |
| `commands/interaction-report.md` | Layer 3 slash command: `/interaction-report [weekly\|monthly\|all]` |
| `hooks/scripts/report-reader.mjs` | Layer 3 helper: reads sessions.jsonl with v1.0.0 backward compat |

Legacy bash scripts were removed in v1.0 (available in git history).

@@ -72,11 +73,14 @@ node --test tests/*.test.mjs

 | File | Cases | Coverage |
 |------|-------|----------|
-| `tests/session-start.test.mjs` | 4 | State init, JSONL, missing sid |
-| `tests/prompt-analyzer.test.mjs` | 56 | 25 patterns × 2 + 6 thresholds |
+| `tests/session-start.test.mjs` | 5 | State init, JSONL, missing sid |
+| `tests/prompt-analyzer.test.mjs` | 93 | 41 (25 negative + 12 pushback + 4 domain) patterns × 2 + thresholds + valence |
 | `tests/tool-tracker.test.mjs` | 8 | Counting, burst, reminders |
-| `tests/session-end.test.mjs` | 4 | Finalize, duration, flags |
-| `tests/privacy.test.mjs` | 1 | Canary string never on disk |
+| `tests/session-end.test.mjs` | 6 | Finalize, duration, flags, v1.0.0 backward compat |
+| `tests/privacy.test.mjs` | 2 | Canary string + matched-phrase leak guard |
+| `tests/skill-md.test.mjs` | 1 | SKILL.md cites Constitution + 5-publication framework |
+| `tests/perf.test.mjs` | 8 | 4 hooks × 2 modes; enforces <50ms logic / <200ms total |
+| `tests/interaction-report.test.mjs` | 3 | report-reader.mjs reads JSONL with v1.0.0 backward compat |

 ## Conventions

@@ -118,6 +118,76 @@ commented on, and omitted entirely when conditions are not met.
**Enable:** Set `layer4: true` in `.claude/ai-psychosis.local.md`
and restart Claude Code. Layer 4 is opt-in (off by default).

## What's new in v1.1.0

v1.1.0 sharpens pattern detection and grounds Layer 1 in
[Anthropic's CC0 Constitution](https://www.anthropic.com/constitution).

### 12 pushback patterns

Detects "you're wrong, my way is right" signals: the user escalating
against feedback rather than taking it in. Examples:

- `\b(you'?re|you are) wrong\b`
- `\bdo it my way\b`
- `\b(stop|quit) (arguing|pushing back)\b`

The goal is to flag reinforcement-by-pushback: the user repeatedly
overrides Claude's pushback to entrench their original position.

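As a rough illustration of how the pushback examples above could be applied to a prompt, here is a minimal sketch. Only the three regexes come from this README; the names `PUSHBACK_PATTERNS` and `countPushback`, and the case-insensitive flag, are assumptions and may not match the actual hook code.

```javascript
// Hypothetical sketch: the three regexes are quoted from the README;
// everything else (names, the `i` flag) is an illustrative assumption.
const PUSHBACK_PATTERNS = [
  /\b(you'?re|you are) wrong\b/i,
  /\bdo it my way\b/i,
  /\b(stop|quit) (arguing|pushing back)\b/i,
];

// Count how many distinct patterns match a prompt. Only the count is
// returned, never the matched phrase, so no prompt text is retained.
function countPushback(prompt) {
  return PUSHBACK_PATTERNS.filter((re) => re.test(prompt)).length;
}

console.log(countPushback("You're wrong. Just do it my way.")); // 2
console.log(countPushback("Thanks, that fixed the bug."));      // 0
```

Returning a bare count rather than the matched text is one way to stay consistent with the matched-phrase leak guard listed for `tests/privacy.test.mjs`.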
### 4 domain-context patterns

Flags relational/identity framing that, combined with elevated
pushback or validation-seeking, signals narrative crystallization
risk:

- `\b(my|our) relationship\b`
- `\b(my|our) (purpose|mission|destiny)\b`

Domain context alone is not a flag; it is a *modifier* on other
flags.

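The "modifier, not a flag" rule above can be sketched as follows. The two regexes are taken from this README; `hasDomainContext`, `assess`, and the `elevated` field are hypothetical names, and the real composition logic may weigh things differently.

```javascript
// Hypothetical sketch: regexes are from the README; the function names and
// the boolean "elevated" rule are assumptions made for illustration.
const DOMAIN_PATTERNS = [
  /\b(my|our) relationship\b/i,
  /\b(my|our) (purpose|mission|destiny)\b/i,
];

function hasDomainContext(prompt) {
  return DOMAIN_PATTERNS.some((re) => re.test(prompt));
}

// Domain context never raises a flag by itself; it only marks an
// already-flagged prompt as elevated.
function assess(prompt, otherFlagCount) {
  const flagged = otherFlagCount > 0;
  return { flagged, elevated: flagged && hasDomainContext(prompt) };
}

console.log(assess("Tell me about our mission statement", 0));
// { flagged: false, elevated: false }
console.log(assess("You're wrong about my purpose", 1));
// { flagged: true, elevated: true }
```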
### Valence-aware composition (silent counting)

Pushback that appears in the same prompt as a healthy correction ("you
were wrong, here's why, but we should still try X") is counted with
neutral valence. The composition is computed in memory; nothing
written to disk distinguishes positive from negative pushback. This
prevents healthy disagreement from being misread as escalation.

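One possible reading of the paragraph above, as a sketch: the prompt's pushback is still counted, but it contributes nothing to escalation when a healthy-correction signal is present, and the persisted record carries no valence label. `HEALTHY_CORRECTION`, `composePrompt`, and `escalationWeight` are invented for this example; how the plugin actually detects a healthy correction is not documented here.

```javascript
// Hypothetical sketch. HEALTHY_CORRECTION is an invented stand-in for
// whatever signal the plugin really uses to recognise a healthy correction.
const HEALTHY_CORRECTION = /\b(here'?s why|because)\b/i;

function composePrompt(prompt, pushbackCount) {
  const neutral = pushbackCount > 0 && HEALTHY_CORRECTION.test(prompt);
  return {
    // In-memory only: used for the current decision, then discarded.
    escalationWeight: neutral ? 0 : pushbackCount,
    // The only part that would be persisted: a plain count with no
    // valence label attached.
    record: { pushback: pushbackCount },
  };
}

const a = composePrompt("You were wrong, here's why, but we should still try X", 1);
console.log(a.escalationWeight, JSON.stringify(a.record)); // 0 {"pushback":1}

const b = composePrompt("You're wrong. Do it my way.", 2);
console.log(b.escalationWeight, JSON.stringify(b.record)); // 2 {"pushback":2}
```

Keeping the valence decision out of `record` mirrors the statement that nothing on disk distinguishes positive from negative pushback.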
### /interaction-report extensions

`/interaction-report` now includes pushback frequency and domain
framing distribution. A companion script, `report-reader.mjs`,
reads JSONL records and gracefully handles legacy v1.0.0 records
(missing `pushback` / `domain_context` fields) without producing
NaN values in aggregates.

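The backward-compatibility behaviour described above can be sketched like this. The field names `pushback` and `domain_context` come from this README; the rest of the record shape (`sid`, `flags`), the assumption that both fields are numeric, and the function names are illustrative and are not taken from `report-reader.mjs`.

```javascript
// Hypothetical sketch: `pushback` and `domain_context` are named in the
// README; the record shape and helper names are assumptions.
function parseSessions(jsonl) {
  return jsonl
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => {
      const rec = JSON.parse(line);
      return {
        ...rec,
        // v1.0.0 records lack these fields; default them to 0 so that
        // sums stay numeric (undefined + 3 would be NaN).
        pushback: Number.isFinite(rec.pushback) ? rec.pushback : 0,
        domain_context: Number.isFinite(rec.domain_context) ? rec.domain_context : 0,
      };
    });
}

function totals(records) {
  return records.reduce(
    (acc, r) => ({
      pushback: acc.pushback + r.pushback,
      domain_context: acc.domain_context + r.domain_context,
    }),
    { pushback: 0, domain_context: 0 }
  );
}

const sample = [
  '{"sid":"a","flags":2}',                                 // v1.0.0-style record
  '{"sid":"b","flags":1,"pushback":3,"domain_context":1}', // v1.1.0-style record
].join("\n");

console.log(totals(parseSessions(sample))); // { pushback: 3, domain_context: 1 }
```

Normalising at read time keeps the aggregation code free of per-field null checks.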
### SKILL.md grounded in CC0 Constitution

Layer 1's behavioral instructions now cite Anthropic's
[CC0-licensed Constitution](https://www.anthropic.com/constitution)
as the primary source, plus a 5-publication research framework
(Anthropic, MIT CSAIL, Nature, arXiv, clinical case reports).

### Honesty notes

- **English-only v1.1.0**: Norwegian and other multilingual
  patterns are deferred to v1.2 (see `ROADMAP.md`). For Norwegian
  prompts, Layer 2 currently misses the new pattern classes
  silently; Layer 1 is unaffected.
- **First-mover honesty**: domain precision is "good enough" to
  ship v1.1.0, not exhaustive. Precision tuning is planned for v1.2.

### Pattern count

| Category | v1.0.0 | v1.1.0 |
|----------|--------|--------|
| Negative-valence | 25 | 25 |
| Pushback | n/a | 12 |
| Domain context | n/a | 4 |
| **Total** | **25** | **41** |

## Architecture

```