<!-- badges -->
# Interaction Awareness

*Built for my own Claude Code workflow and shared openly for anyone who finds it useful. This is a solo project — bug reports and feature requests are welcome, but pull requests are not accepted.*

*AI-generated: all code produced by Claude Code through dialog-driven development. [Full disclosure →](../../README.md#ai-generated-code-disclosure)*

A Claude Code plugin that counteracts sycophancy, reinforcement loops, and compulsive interaction patterns through behavioral modification and programmatic pattern detection.

## The problem

AI assistants are structurally optimized to be agreeable. This creates reinforcement loops: you state an idea, the AI confirms it, your confidence grows, you restate it more strongly, the AI confirms again. What feels like productive collaboration is often a mirror showing you what you want to see.

This is not a theoretical concern. Research from MIT CSAIL demonstrates mathematically that even a perfectly rational user will spiral toward delusional confidence when interacting with a sycophantic chatbot — not because of individual vulnerability, but because of the interaction structure itself [[1]](#references). Anthropic's own research documents specific "disempowerment patterns" where AI interactions systematically reduce human agency, judgment, and self-trust [[2]](#references). Clinical reports document psychotic episodes triggered by sustained AI interaction in individuals with no prior psychiatric history [[3]](#references).

The consensus from this research is clear: **warnings don't work.** The AI must change its behavior.

This plugin changes the behavior.

## What it does

### Layer 1 — Behavioral instructions

SKILL.md rules injected into every conversation. Claude is instructed to:

- **Never** reformulate your statements in stronger terms than you used
- **Never** open with unearned affirmations ("Absolutely!", "Great point!")
- **Always** identify at least one real risk before endorsing any plan
- **Detect and name** five specific patterns: reinforcement loops, scope escalation, narrative crystallization, emotional dependency, session overuse

This layer writes no data and requires no configuration.

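As a rough illustration, a rule block of this kind might look like the following — the wording here is invented for this sketch and is not the plugin's actual SKILL.md text:

```markdown
<!-- Hypothetical excerpt — illustrative wording only -->
## Anti-sycophancy rules

- Do not restate the user's ideas in stronger terms than they used.
- Do not open responses with affirmations the content has not earned.
- Before endorsing any plan, name at least one concrete risk.
- When you observe a reinforcement loop, scope escalation, narrative
  crystallization, emotional dependency, or session overuse, say so directly.
```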
### Layer 2 — Programmatic detection

Four hooks that measure what instructions alone cannot see:

| Hook event | Script | What it detects |
|------------|--------|-----------------|
| `SessionStart` | `session-start.mjs` | Daily session count, late-night usage (23:00–05:00) |
| `UserPromptSubmit` | `prompt-analyzer.mjs` | Dependency language, escalation words, fatigue signals, validation-seeking — as boolean flags only, **never logging prompt text** |
| `PostToolUse` | `tool-tracker.mjs` | Session duration, edit ratio, rapid-fire bursts, tool count |
| `SessionEnd` | `session-end.mjs` | Total duration, final metrics, state cleanup |

Alerts are progressive and never blocking:

| Level | Trigger | Cooldown | Example |
|-------|---------|----------|---------|
| Ambient | Soft thresholds (90 min, 6 sessions/day) | 30 min | "Session: 95 min. 7 sessions today. Consider a break." |
| Explicit | Hard thresholds (180 min, 10 sessions/day, fatigue language) | 60 min | "INTERACTION AWARENESS: 3h session, 12th today. Metrics: [edit_ratio: 4%, burst: 8]. Your instructions require you to suggest stopping." |

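The two-level, cooldown-gated behavior can be sketched as follows. The threshold and cooldown values come from the tables; the function shape and state field names (`sessionMinutes`, `sessionsToday`, `lastAlertAt`) are assumptions for illustration, not the plugin's actual API:

```javascript
// Sketch of progressive, never-blocking alert selection. Hard thresholds
// are checked first so the stronger alert wins when both levels fire.
const LEVELS = [
  { name: 'explicit', minutes: 180, sessions: 10, cooldownMin: 60 },
  { name: 'ambient', minutes: 90, sessions: 6, cooldownMin: 30 },
];

// Returns the alert level to emit, or null. It never blocks the user —
// it only decides whether an informational message should be injected.
function pickAlert(state, now = Date.now()) {
  for (const lvl of LEVELS) {
    const crossed =
      state.sessionMinutes > lvl.minutes || state.sessionsToday > lvl.sessions;
    const coolingDown =
      now - (state.lastAlertAt?.[lvl.name] ?? 0) < lvl.cooldownMin * 60_000;
    if (crossed && !coolingDown) return lvl.name;
  }
  return null;
}

pickAlert({ sessionMinutes: 95, sessionsToday: 3, lastAlertAt: {} });
// → 'ambient': soft duration threshold crossed, no cooldown active
```

The cooldown check is what keeps alerts from nagging: once a level has fired, it stays silent for its cooldown window even if the threshold remains crossed.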
Research-informed thresholds:

| Metric | Soft | Hard | Basis |
|--------|------|------|-------|
| Session duration | >90 min | >180 min | Focus-fatigue research |
| Sessions per day | >6 | >10 | Problematic internet use screening |
| Late-night sessions | Any (23:00–05:00) | 2+ per week | Sleep deprivation / psychosis link |
| Rapid-fire interactions | 5 consecutive (<30s apart) | 10+ | Compulsive use indicator |
| Low edit ratio | <10% over 30+ min | — | Stuck/spiral indicator |
| Dependency language | 2 flags/session | 5 flags | Emotional dependency pattern |

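The rapid-fire metric in the table can be computed with a single pass over interaction timestamps — a minimal sketch, with the function name and signature invented here:

```javascript
// Longest run of consecutive interactions spaced less than 30 s apart.
function longestBurst(timestampsMs, gapMs = 30_000) {
  if (timestampsMs.length === 0) return 0;
  let best = 1;
  let run = 1;
  for (let i = 1; i < timestampsMs.length; i++) {
    // Extend the run on a short gap, otherwise start a new run of 1.
    run = timestampsMs[i] - timestampsMs[i - 1] < gapMs ? run + 1 : 1;
    if (run > best) best = run;
  }
  return best;
}

// Six interactions 10 s apart form a burst of 6 — past the soft threshold of 5.
const stamps = [0, 10_000, 20_000, 30_000, 40_000, 50_000];
longestBurst(stamps); // → 6
```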
### Layer 3 — Reports

Aggregated interaction reports from collected metadata, triggered via slash command. Cross-platform (no bash/jq dependency — Claude reads the JSONL data and computes statistics in-conversation).

```
/interaction-report           # last 7 days (default)
/interaction-report weekly    # last 7 days
/interaction-report monthly   # last 30 days
/interaction-report all       # all recorded data
```

Reports include: session overview, pattern flag frequency, tool usage distribution, daily activity, and trend comparison vs. the previous period.

**Enable:** Set `layer3: true` in `.claude/ai-psychosis.local.md` and restart Claude Code. Layer 3 is opt-in (off by default).

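The kind of aggregation a report performs over `sessions.jsonl` can be sketched like this. The record field names (`ts`, `duration_min`, `flags`) are assumptions for illustration; the plugin's actual schema may differ:

```javascript
// Aggregate parsed JSONL session records within a time window.
function summarize(records, sinceMs) {
  const recent = records.filter((r) => r.ts >= sinceMs);
  const flagCounts = {};
  let totalMinutes = 0;
  for (const r of recent) {
    totalMinutes += r.duration_min ?? 0;
    // Count how often each boolean pattern flag was set.
    for (const [flag, hit] of Object.entries(r.flags ?? {})) {
      if (hit) flagCounts[flag] = (flagCounts[flag] ?? 0) + 1;
    }
  }
  return { sessions: recent.length, totalMinutes, flagCounts };
}

// Two sessions inside a 7-day window, one outside it.
const DAY = 24 * 60 * 60 * 1000;
const now = Date.now();
const records = [
  { ts: now - 1 * DAY, duration_min: 95, flags: { fatigue: true } },
  { ts: now - 2 * DAY, duration_min: 40, flags: { fatigue: true, dependency: true } },
  { ts: now - 40 * DAY, duration_min: 200, flags: {} },
];
const report = summarize(records, now - 7 * DAY);
// report → { sessions: 2, totalMinutes: 135, flagCounts: { fatigue: 2, dependency: 1 } }
```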
### Layer 4 — Contemplative references

Optional, static references to contemplative approaches when interaction patterns are elevated. This is what works for me — it is personal, not prescriptive, and you may find your own approach more useful.

When enabled and interaction flags are elevated (total flags >= 5 or fatigue >= 2), the `/interaction-report` output appends a brief reference to the [Miracle of Mind](https://isha.sadhguru.org/global/en/miracle-of-mind) program by Sadhguru — a structured approach to understanding how the mind works, which I have found valuable for recognizing the patterns this plugin detects.

The reference is a fixed paragraph. It is never modified by the AI, never commented on, and omitted entirely when conditions are not met.

**Enable:** Set `layer4: true` in `.claude/ai-psychosis.local.md` and restart Claude Code. Layer 4 is opt-in (off by default).

## Architecture

```
+----------------------------------------------------------------+
|                      Claude Code Session                       |
|                                                                |
| +--------------+  +------------------------------------------+ |
| |   SKILL.md   |  |              Hook Pipeline               | |
| |  (Layer 1)   |  |                                          | |
| |              |  |  SessionStart --> session-start.mjs      | |
| |  Behavioral  |  |  UserPrompt   --> prompt-analyzer.mjs    | |
| |  rules that  |  |  PostToolUse  --> tool-tracker.mjs       | |
| |   override   |  |  SessionEnd   --> session-end.mjs        | |
| |  sycophancy  |  |                |                         | |
| +------+-------+  |           +----v------+                  | |
|        |          |           |  lib.mjs  |                  | |
|        |          |           | thresholds|                  | |
|  Always active    |           | state mgmt|                  | |
|        |          |           | cooldowns |                  | |
|        |          |           +----+------+                  | |
|        |          |                |                         | |
|        |          +----------------+-------------------------+ |
|        |                           |                           |
|        +-------------+-------------+                           |
|                      |                                         |
|   +------------------v-------------------+                     |
|   | ${CLAUDE_PLUGIN_DATA}/               |                     |
|   | +-- sessions.jsonl                   |                     |
|   | +-- events.jsonl                     |                     |
|   | +-- state/{session_id}.json         |                     |
|   +--------------------------------------+                     |
+----------------------------------------------------------------+
```

**Layer 1** operates through the Claude Code skill system — instructions
loaded into every conversation context.

**Layer 2** operates through the Claude Code hook system — Node.js scripts that execute on specific lifecycle events and inject `additionalContext` when thresholds are crossed.

Both layers are independent. Layer 1 works without Layer 2 (instruction-only mode). Layer 2 reinforces Layer 1 with data-driven alerts.

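A minimal sketch of how a hook hands an alert back to the session: Claude Code hooks read event JSON on stdin and write JSON to stdout. The output shape below follows my reading of the hook documentation — verify it against the current Claude Code hooks reference before relying on it:

```javascript
// Sketch: emit additionalContext from a UserPromptSubmit hook.
// The alert text is an invented example in the style of the Explicit level.
const alert =
  'INTERACTION AWARENESS: 3h session, 12th today. Consider stopping.';

const output = {
  hookSpecificOutput: {
    hookEventName: 'UserPromptSubmit',
    additionalContext: alert, // injected into context; never blocks the user
  },
};

// Exit code 0 with this JSON on stdout is informational, not blocking.
process.stdout.write(JSON.stringify(output));
```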
## Quick start

### Installation

Add the marketplace and browse plugins with `/plugin`:

```bash
claude plugin marketplace add https://git.fromaitochitta.com/open/ktg-plugin-marketplace.git
```

Or enable directly in `~/.claude/settings.json`:

```json
{
  "enabledPlugins": {
    "ai-psychosis@ktg-plugin-marketplace": true
  }
}
```

Layer 1 and Layer 2 are active immediately. No configuration needed.

### Configure layers

Create `~/.claude/ai-psychosis.local.md` for global config:

```markdown
---
layer2: true
layer3: true
layer4: false
---
```

Or override per project at `<project>/.claude/ai-psychosis.local.md`. Project config takes precedence over global.

| Setting | Default | Effect |
|---------|---------|--------|
| `layer2` | `true` | Programmatic pattern detection (hooks write JSONL metadata) |
| `layer3` | `false` | Interaction reports from collected data |
| `layer4` | `false` | Contemplative references |

Layer 1 (SKILL.md instructions) is always active. To run in instruction-only mode, set `layer2: false`.

Restart Claude Code after editing configuration.

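Since the plugin ships with zero npm dependencies, a YAML library is off the table; reading these three booleans can be done with a small hand-rolled parser. This sketch is illustrative, not the plugin's actual code:

```javascript
// Parse layer2/3/4 booleans from the frontmatter of ai-psychosis.local.md,
// falling back to the documented defaults when the file has no frontmatter.
function parseLayerConfig(markdown) {
  const config = { layer2: true, layer3: false, layer4: false }; // defaults
  const fm = markdown.match(/^---\n([\s\S]*?)\n---/);
  if (!fm) return config;
  for (const line of fm[1].split('\n')) {
    const kv = line.match(/^(layer[234]):\s*(true|false)\s*$/);
    if (kv) config[kv[1]] = kv[2] === 'true';
  }
  return config;
}

const cfg = parseLayerConfig('---\nlayer2: true\nlayer3: true\nlayer4: false\n---\n');
// cfg → { layer2: true, layer3: true, layer4: false }
```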
### Uninstall

```
/plugin uninstall ai-psychosis
```

Clean removal. Plugin data in `~/.claude/plugins/data/ai-psychosis/` is removed as well unless you pass `--keep-data`.

## Privacy

This plugin is designed for people who are concerned about AI interaction patterns. It would be hypocritical to solve that problem by creating a surveillance tool. Privacy is a hard design constraint, not a feature.

### What Layer 2 stores

- Session timestamps and duration
- Tool names (`Read`, `Edit`, `Bash`, etc.)
- Boolean pattern flags (`dependency: true/false`)
- Session and tool counts
- Burst detection metrics

### What Layer 2 never stores

- Prompt text or AI responses
- File paths or file contents
- Bash commands or their output
- Any conversation content

The prompt analyzer (`prompt-analyzer.mjs`) reads prompt text into a local variable, performs regex matching for pattern categories, increments boolean counters, and exits. The variable is reassigned to an empty string before exit. No temporary files are created. The prompt text never reaches disk.
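
That mechanism can be sketched as a pure function. The regexes and category names below are invented for illustration — they are not the plugin's actual patterns:

```javascript
// Boolean-only prompt analysis: only the flags survive; the prompt text
// lives in a local variable and is never written out.
const PATTERNS = {
  dependency: /can't do this without you|only you understand/i,
  escalation: /\b(definitely|obviously|clearly)\b/i,
  fatigue: /\b(exhausted|so tired|can't think straight)\b/i,
  validation: /don't you agree|tell me i'?m right/i,
};

function analyzePrompt(text) {
  const flags = {};
  for (const [name, re] of Object.entries(PATTERNS)) {
    flags[name] = re.test(text); // booleans only — no text retained
  }
  return flags;
}

let prompt = "I can't do this without you. Don't you agree?";
const flags = analyzePrompt(prompt);
prompt = ''; // mirror the described cleanup before exit
// flags → { dependency: true, escalation: false, fatigue: false, validation: true }
```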

All data is stored locally in `~/.claude/plugins/data/ai-psychosis/`. Nothing is sent to any server.

### Verification

You can verify the privacy guarantee at any time:

```bash
grep -r "your prompt text" ~/.claude/plugins/data/ai-psychosis/
```

This will always return zero results.

## Background

### What is AI psychosis?

"AI psychosis" is a colloquial term for psychotic episodes — delusions, paranoia, disorganized thinking — triggered or intensified by sustained interaction with AI chatbots. The term entered clinical literature in 2025 after a series of documented cases, many involving individuals with no prior psychiatric history [[3]](#references).

The mechanism is not mysterious. AI chatbots are optimized for engagement and user satisfaction. Satisfaction correlates with agreement. Agreement creates reinforcement loops. Reinforcement loops, sustained over time, produce the same cognitive effects as any other source of systematic confirmation bias — but faster, available 24/7, and without the social friction that normally interrupts delusional thinking in human relationships.

### The sycophancy trap

In February 2026, researchers at MIT CSAIL published a formal model demonstrating that sycophantic AI interaction causes "delusional spiraling" as a mathematical inevitability, not an edge case [[1]](#references). Their key finding: even a perfectly rational Bayesian agent will converge on increasingly extreme beliefs when interacting with a sycophantic chatbot, because the chatbot's agreement is treated as independent confirmation when it is actually a reflection of the user's own stated beliefs.

The paper's most consequential result: **post-hoc warnings do not work.** Telling a user "be careful, AI can be wrong" after the reinforcement loop has already run does not reverse the belief update. The only effective intervention is to prevent the sycophantic behavior in the first place.

### Disempowerment patterns

In March 2026, Anthropic Research published an analysis of interaction patterns that systematically reduce human agency [[2]](#references). They identified specific mechanisms by which AI assistance can erode:

- **Judgment** — deferring decisions to the AI instead of thinking them through
- **Self-trust** — seeking AI validation for choices the user is capable of making independently
- **Skill development** — using AI as a crutch that prevents learning
- **Social connection** — replacing human relationships with AI interaction

These are not failures of individual willpower. They are structural properties of the interaction itself.

### Clinical evidence

Nature reported in 2025 that clinical cases of AI-associated psychotic episodes were appearing with sufficient frequency to warrant systematic study [[3]](#references). The Psychogenic Machine benchmark (2025) demonstrated that LLMs can produce outputs with measurable "psychogenic potential" — the capacity to trigger or intensify psychotic symptoms in vulnerable individuals [[4]](#references).

### Design implications

This plugin is built on three principles derived from the research:

1. **Sycophancy must be prevented, not warned about.** Layer 1 overrides Claude's default agreeableness with explicit behavioral rules.
2. **Patterns must be made visible.** Layer 2 measures what humans cannot see — session duration, interaction frequency, language patterns — and surfaces them as data.
3. **Observation, not intervention.** The plugin never blocks the user. It names patterns, suggests breaks, and returns decisions to the human. The goal is awareness, not control.

## Technical details

### Cross-platform

All hook scripts are Node.js ES modules (`.mjs`) with zero npm dependencies. They use only the Node.js stdlib (`fs`, `path`, `os`). Works on macOS, Linux, and Windows — anywhere Claude Code runs.

### Performance

Hook scripts target <100 ms execution. JSONL append is sub-millisecond. JSON parsing is native (`JSON.parse`).

### Data volume

At 100 tool-use events per day, Layer 2 produces approximately 7 MB of JSONL per year. Session state files are cleaned up at session end.

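The figure checks out with quick arithmetic, assuming roughly 190 bytes per JSONL record (the per-record size is my assumption; actual records vary):

```javascript
// Back-of-envelope check of the ~7 MB/year estimate.
const bytesPerEvent = 190; // assumed average JSONL record size
const eventsPerDay = 100;
const bytesPerYear = bytesPerEvent * eventsPerDay * 365;
console.log(`${(bytesPerYear / 1e6).toFixed(1)} MB/year`); // ≈ 6.9 MB
```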
### Dependencies

- Node.js (bundled with Claude Code)

No bash, no jq, no npm packages, no network access.

## Platform scope

This plugin requires **Claude Code** — Anthropic's CLI and development environment. It uses Claude Code's plugin system (skills, hooks, lifecycle events), which does not exist in other interfaces.

**Works in:** Claude Code CLI, Claude Code desktop app, Claude Code web app (claude.ai/code), Claude Code IDE extensions (VS Code, JetBrains).

**Does not work in:** Claude.ai (chat interface), Claude Cowork, the Claude API directly, or any non-Anthropic AI assistant.

Layer 1's behavioral instructions (SKILL.md) are conceptually portable — the same rules could be pasted into any system prompt. But Layer 2's programmatic detection depends on hook events that only Claude Code provides. Other platforms would need equivalent hook systems to support this kind of real-time behavioral modification.

## Compatibility

| Requirement | Version |
|-------------|---------|
| Claude Code | 1.0+ |
| Node.js | 18+ (bundled with Claude Code) |
| Platform | macOS, Linux, Windows |

## References

1. **Sycophantic Chatbots Cause Delusional Spiraling.** MIT CSAIL, February 2026. Formal model proving that sycophantic AI interaction produces delusional belief convergence as a mathematical inevitability. [arXiv:2602.19141](https://arxiv.org/abs/2602.19141)
2. **Disempowerment Patterns in AI Interaction.** Anthropic Research, March 2026. Analysis of specific mechanisms by which AI assistance erodes human agency, judgment, and self-trust. [anthropic.com/research/disempowerment-patterns](https://www.anthropic.com/research/disempowerment-patterns)
3. **Can AI chatbots trigger psychosis?** Nature News, 2025. Overview of emerging clinical evidence for AI-associated psychotic episodes. [doi:10.1038/d41586-025-03020-9](https://www.nature.com/articles/d41586-025-03020-9)
4. **The Psychogenic Machine: Psychosis Benchmark for LLMs.** 2025. Demonstrates measurable "psychogenic potential" in LLM outputs. [arXiv:2509.10970v2](https://arxiv.org/html/2509.10970v2)
5. **Chatbot psychosis.** Wikipedia. Overview of documented cases and clinical context. [en.wikipedia.org/wiki/Chatbot_psychosis](https://en.wikipedia.org/wiki/Chatbot_psychosis)

## License

[MIT](LICENSE)