v1 (autonomous 5-step + reviewer swarm + /trekreview gate) produced confident fabrications and burned context for what are usually 0-1 line edits. v2 is a slow, dialogic, one-command-per-session gate with the operator as truth source and every mechanical claim tool-grounded. Method, command-class predicates, session queue, and STATE.md end-of-session handoff are documented in plan.md. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
6.9 KiB
LinkedIn Studio — Command Hardening Plan v2 (interactive quality gate)
Supersedes the v1 autonomous plan (recoverable in git history: commit
2f90880). Motivated by the 2026-05-31 S2 fabrication incident: the v1 method (autonomous, prose-heavy 5-step per command, per-command Opus reviewer swarm,/trekreviewgate) produced confident fabrications (Claude wrote SIMULATE/EVALUATE/HARDEN narratives — char counts, line numbers, before/after — for files it had not actually observed) and burned context without value (full-file re-reads + long logs + reviewer agents for what are usually 0–1-line edits). A large parallel tool-batch also destabilized the tool-result stream (out-of-order / duplicated / truncated returns). Nothing false shipped — the failed edits + empty greps + real reads caught it, and it was reverted — but the method had to change.v2 is a slow, dialogic, human-in-the-loop gate: ONE command per session, the operator as the truth source, every mechanical claim tool-grounded.
Intent
Harden each of the 29 commands to its own intent — one command per session — with the
operator in the loop. Last quality gate before GUI/M0. Brief: brief.md.
The locked interactive method (identical every session)
Per command, 8 steps. Claude STOPS and waits at each "agree" point — no autonomous run-ahead:
- Agree intention — show the command's own description + its journey role; agree in one line.
- Agree test-method — the class predicates (below) + what the simulation must prove.
- Agree persona — who invokes + with what input.
- Run the test — tool-grounded mechanical facts (each with
file:line) + a GROUNDED persona simulation (a real invocation → concrete output, anchored to the shown file region). - Talk — the operator's judgment is the truth source.
- Agree improvements — surgical, or "no change".
- Implement — Edit → grep-confirm landed → show the confirmed diff.
- Move on — run the end-of-session ritual; the next command is its own session.
Two guards (non-negotiable)
- Read-and-show BEFORE simulate — the simulation is anchored to observed text, shown to the operator.
- Every mechanical claim from a tool — char count, ✓/✗, line number: shown from a tool before stated.
Anti-drift through dialogue
The per-command dialogue IS the drift-reduction mechanism: Claude proposes at each agree-point (intention / test-method / persona / improvement) and waits for the operator to confirm or correct before proceeding. Frequent grounding checkpoints are what keep Claude from confabulating — never race past an agree-point.
Execution discipline (the S1+S2 incident lessons)
- ONE command per session. Never hold multiple command files in context at once.
- Never batch dependent tool calls in parallel. Read→Edit→grep→log is strictly sequential. Large parallel batches destabilize the tool-result stream.
- Guard greps:
grep/grep -cexit non-zero on zero matches → append|| echo NONEso a batch sibling isn't torn down. - Never write a HARDEN/after claim before the edit is grep-confirmed landed.
- If the tool layer degrades (slow / truncated / empty / replayed returns): PAUSE and resume in a fresh session. Do not push through a degraded window.
Command classes & mechanical predicates (the testmethod per class)
- post-emitting — hook 110–140 present · length band present · no-body-link present ·
buzzword check present · topic→5 pillars present · every
subagent_type: linkedin-studio:Xresolves toagents/X.md. - routing — every emitted
/linkedin:Yresolves tocommands/Y.md. - guided/stateful — primary promised artifact actually produced · subagent targets resolve · graceful degradation present.
- analytics — graceful degradation present · saves/dwell honesty intact.
Synthetic test data (fixtures) — for data-dependent commands
Some commands cannot be meaningfully tested without input data — analytics (import, report,
analyze, audit, ab-test) need a CSV/JSON; stateful commands (calendar, firsthour) need a
queue/state. For these, step 2 (agree test-method) includes agreeing a small fictitious dataset
that exercises the command's real path:
- Isolated + throwaway — built in a temp/scratch location (e.g.
/tmp/…or a gitignored path), NEVER the real analytics/state dirs, NEVER committed. - Minimal but realistic — just enough rows/fields to trigger the behavior under test (e.g. a 3–5-row CSV with the real column names; a 2-experiment ab-test file).
- Clearly fictitious — obviously-fake values, impossible to mistake for real data.
- Discarded after the test — the fixture proves the command works; it is not a deliverable.
Session queue (ONE command each)
S1 ✅ quick · onboarding · first-post · setup — DONE under v1; do NOT re-harden.
| S2 post | S9 newsletter* | S16 analyze | S23 profile |
| S3 react | S10 headless-review | S17 audit | S24 create |
| S4 multiplatform | S11 pivot | S18 ab-test | S25 measure |
| S5 carousel | S12 firsthour | S19 strategy | S26 linkedin |
| S6 video | S13 calendar | S20 competitive | |
| S7 batch | S14 import | S21 monetize | |
| S8 pipeline | S15 report | S22 outreach |
*S9 newsletter (16-phase) may split into S9a/S9b. Otherwise one command = one session.
End-of-session ritual (every session — STATE.md handoff baked in)
bash scripts/test-runner.sh→Failed: 0+ counts 29/19/25/6.- Terse, tool-grounded
log.mdline for the command (ONLY after verify):PASS/ edit +file:line. - Commit own file(s) only, LOCALLY (
fix:for command edits; explicitgit add <path>, never-u/-A). Do NOT push hardening work — see Push policy. - Overwrite STATE.md: mark this command ✅; point at the NEXT single command + its class; carry the method + discipline rules → ready for the next "Les STATE.md og følg instruksjonene".
- (Optional) update Voyage
.session-state.local.jsonnext-label for/trekcontinuecoherence.
Defaults (adjustable between sessions)
- Log: keep, terse, tool-grounded, after-verify only.
- Persona: ICP author default; fresh-adopter for onboarding/first-post/setup; deviate explicitly elsewhere.
- Gate/commit: once per session = per command.
Push policy (operator rule, 2026-05-31)
Hardening work — "det vi GJØR": command edits + log.md entries — is committed locally
only, never pushed. Only method / process / tooling improvements (this plan, a future
checker, STATE conventions) may be pushed, and only on explicit operator OK
("eventuelle forbedringer").
Amendment policy
Improve the method/plan between sessions if a session surfaces a gap (operator-approved). This plan is a living contract, not frozen.