ktg-plugin-marketplace/plugins/linkedin-studio/docs/hardening/brief.md
Kjell Tore Guttormsen 2f90880f7a docs(linkedin-studio): hardening-phase foundation (brief + adversarially-reviewed plan)
New phase after the baseline-audit remediation (S1-S17, 2633d32, complete):
a command-hardening pass that simulates each of the 29 commands and tightens
it to its stated intention (intention-fidelity + prompt-quality only — no
structural redesign, no new features, no GUI/M0). Runs over ~8 journey-grouped
Voyage sessions, gated by lint + /trekreview ALLOW before push.

Foundation laid this session (execution starts next session):
- docs/hardening/brief.md (valid; 3 locked forks: hybrid simulation,
  intention-fidelity+prompt-quality, per-journey cadence; research skipped —
  the 2026 bar is frozen in algorithm-signals-reference.md)
- docs/hardening/plan.md (8 sessions, Grade B+ 87)

Adversarial review: brief-reviewer PROCEED_WITH_RISKS (3 findings folded:
self-graded quality axis, SC-H deferral contradiction, no stopping rule);
plan-critic REVISE (2 blockers + 10 major — all addressed: anchored coverage
predicate kills substring false-positives, full-session cold-reviewer oracle
on concrete before/after output, per-type mechanical predicate, Failed:0 gate
not literal-74, S1 HALT circuit-breaker); scope-guardian ALIGNED.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 05:31:52 +02:00

267 lines
15 KiB
Markdown
Raw Permalink Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

---
type: trekbrief
brief_version: "2.1"
created: 2026-05-31
task: "Command hardening pass — simulate every linkedin-studio command and harden it to its stated intention (intention-fidelity + prompt-quality), one command at a time, across ~8 journey-grouped sessions, before the GUI/M0 track"
slug: hardening
project_dir: docs/hardening/
research_topics: 0
research_status: skipped
auto_research: false
interview_turns: 1
source: interview
phase_signals:
- phase: plan
effort: high
model: opus
- phase: execute
effort: high
model: opus
- phase: review
effort: high
model: opus
---
# Task: linkedin-studio command hardening
> Generated 2026-05-31 (operator-driven, one clarification turn — three forks
> locked via AskUserQuestion). This brief is the contract between requirements and
> planning. `/trekplan` reads it to produce the multi-session plan. Every decision
> in the plan must trace back to content here.
>
> **Predecessor of record:** the baseline-audit remediation (`docs/remediation/`)
> is **complete** (S1S17, last commit `2633d32`, clean ALLOW). That phase fixed
> *structure, correctness, and honesty*. This phase is the next, distinct layer:
> does each command actually **deliver what it promises** when run?
## Intent
The remediation made every claim honest, wired every orphan agent, rebuilt the
lint, and reconciled the algorithm bar. What it did **not** do is exercise each
command's *workflow* end-to-end and judge the **quality of what it produces**
against the command's own stated intention. A command can be structurally correct
(right frontmatter, right agent wired, lint-green) and still under-deliver: a step
that under-determines the next move, a question that yields a weak answer, a prompt
that produces generic output, a missing graceful-degradation path.
This is a **hardening phase**: a deliberate, per-command pass that simulates a
realistic invocation, judges the result against intention + the 2026 algorithm
bar + the content-quality rules, and tightens the command definition where it
falls short. It is the last quality gate **before** the GUI/M0 track (the two
briefs `docs/linkedin-studio-ui-brief.md` + `docs/linkedin-studio-persona-brief.md`)
— hardening the engine before building a dashboard on top of it.
**Division of labour (explicit):** the operator tests the commands **live** over
the coming weeks (real inputs, real friction). This phase is the **complementary**
intention-fidelity pass from the definition side. The two converge through a
field-notes inbox (below): the operator's real-world findings outrank the
simulated ones and steer per-session priority.
## Goal
For every one of the 29 command surfaces, produce: (1) a recorded **intention**,
(2) a **simulated** run with a concrete persona, (3) an **evaluation** against four
axes, (4) a **hardened** command definition where gaps were found, (5) a green
**verification**. Leave the command set structurally identical (29/19/25/6) and the
plugin in a state where each command reliably delivers its description's promise.
## Non-Goals
- **No structural redesign.** No command merged, split, added, or removed; no new
command; no journey re-tiering. 14a already proved zero redundancy; the surface
count (29) and journey layer (v4.1.0) are fixed. *(Locked fork: "intention-fidelity
+ prompt-quality", not "structural".)*
- **No new features / capability accretion.** Hardening tightens existing
intention; it does not add surfaces the audit/remediation deliberately scoped out
(no auto-publish, no dwell measurement, no analytics-API integration, no new agents).
- **No GUI/M0 work.** The dashboard, the per-user data-dir migration (M0), and the
provider register are the *next* track and require their own mandate + brief. This
phase only hardens the command engine they will sit on.
- **No re-litigation of the algorithm bar.** `references/algorithm-signals-reference.md`
is the single source of truth (settled in remediation). Hardening *applies* it; it
does not re-research or re-open it. No `/trekresearch` is run.
- **No version/count churn per session.** Prompt/spec refinement is not a surface
change; sessions do not bump the version or touch counts (S11S16 refinement
precedent). A single optional minor bump (v4.2.0 "command hardening pass") may be
taken at phase end.
## Constraints
- **Opus on everything** (sub-agents + orchestration). Max-discipline default.
- **No hidden costs.** `/trekplan`, `/trekcontinue`, `/trekreview` and any Workflow
fan-out are cost-warned in plain text before running; the operator's standing yes
(2026-05-30) covers the routine `/trekcontinue`·`/trekreview` gate — run, don't block.
- **Gate before push (no WARN-override):** per session, `bash scripts/test-runner.sh`
exit 0 + `node --test` green where touched + `/trekreview` **ALLOW** → commit (own
files only, explicit staging, never `git add -u/-A`) → push to Forgejo. In-session
fix of the session's *own* misses = completion; genuine pre-existing design findings
→ next session.
- **Three-doc rule** applies only if a hardening edit changes a feature/surface/count/
version (most will not — they refine prompt text). Pure prompt-quality edits are
`fix:`/`refactor:` and do not trigger the feat-gate.
- **Platform:** bash 3.2, Node-only hooks, no npm deps in hooks/scanners.
- **Untracked NOT-mine (never commit):** `docs/linkedin-studio-persona-brief.md`,
`docs/linkedin-studio-ui-brief.md`, `docs/voyage-build/progress.json`. `*.local.md`
+ `.session-state.local.json` + `review.html` + STATE.md are gitignored.
## Preferences
- **Simulation = hybrid (locked fork).** Per command: role-play a realistic
invocation (concrete persona + scenario, mock answers), produce/sketch the actual
output the command would generate, **and** cold-read the spec against intention.
Harden from both signals — catches "spec is broken" *and* "output is mediocre".
- **Personas:** drawn from the plugin's real ICP and state — a solo Norwegian/English
AI-advisor creator (the author's own profile: ~1048 followers, "Validation" phase,
5 expertise pillars), plus at least one **fresh adopter** (no voice samples, no
analytics yet) to exercise progressive-onboarding + graceful-degradation paths. The
`persona-brief` may be consulted (read-only) for richer personas; it is optional, not
a hard dependency.
- **Session granularity (locked fork):** one journey per session where small
(Start/Engage/Measure/Grow); split **Create** (8 commands incl. the 16-phase
`newsletter`) across 23 sessions. ~8 sessions total.
- **Field-notes inbox:** before hardening a journey, check
`docs/hardening/field-notes.local.md` for the operator's live-test findings on those
commands and let them steer priority; the operator's real friction outranks the
simulated friction.
- **Per-command log:** record intention → simulation (inputs + output sketch +
friction) → evaluation (4 axes) → harden diff → verify, in `docs/hardening/log.md`,
so the operator's parallel live-testing can cross-reference every change and its why.
## Non-Functional Requirements
- **No regression.** After each session: lint exits 0 with **unchanged counts**
(Commands 29 · Agents 19 · Reference docs 25 · Skills 6); hooks `node --test` 98/98
and analytics 116/116 stay green if those surfaces are touched; cross-references
(router ↔ commands ↔ agents ↔ skills) stay consistent.
- **Determinism.** Every success criterion is falsifiable by a command or a recorded
observation; the hardening log is the audit trail.
- **Independence.** The per-session `/trekreview` runs cold (independent reviewers),
same as the remediation gate.
## Success Criteria
*Falsifiable per command and per phase.*
**Per command (the hardening unit):**
- **SC-A (intention recorded):** `docs/hardening/log.md` has, for the command, a
one-paragraph intention distilled from its frontmatter `description` + journey role
+ the content-quality rules + the algorithm signals it must honor.
- **SC-B (simulated):** a concrete persona + scenario, the mock inputs, and the
produced/sketched output are recorded, with a friction log of every ambiguity,
dead-end, or under-determined step encountered walking the workflow as written.
- **SC-C (evaluated on 4 axes):** the simulated output + spec carry an explicit
pass/gap verdict against (a) **intention fidelity** (does it deliver the
description's promise?), (b) **algorithm bar** (`algorithm-signals-reference.md`),
(c) **content-quality rules** (hook 110-140, length bands, no body links, no
buzzwords, topic-relevance, topic-rotation), (d) **agent-wiring + graceful
degradation** (right `subagent_type`; sensible fallback when an agent/tool/mcp/CLI
is unavailable).
- **SC-D (hardened or deferred):** every gap is either fixed (diff recorded in the
log) or explicitly deferred with a rationale — **no gap left unrecorded**.
- **SC-E (verified):** after hardening, `test-runner.sh` exits 0 with unchanged
counts, `node --test` is green where touched, and a re-read/re-simulation confirms
the previously-failing axis now passes.
**Per phase:**
- **SC-F (coverage):** all 29 command surfaces (27 atomic + the 2 front-doors
`create`/`measure`) have a complete `log.md` entry through SC-A…SC-E.
- **SC-G (no structural drift):** `ls commands/*.md | wc -l` == 29 throughout; no
merge/split/add/remove; the lint's count/version guards stay green; `grep` for any
stale count returns 0.
- **SC-H (clean gate):** every session was pushed on `/trekreview` **ALLOW** (no
WARN-override); the per-session `review.md` shows 0 open findings.
- **SC-I (operator findings honored):** where `field-notes.local.md` carries a live
finding for a session's commands, it is addressed or explicitly triaged in that
session's log.
## Hardening method (the per-command contract `/trekplan` must encode)
```
1. INTENT distil the command's promise (description + journey role + quality
rules + algorithm signals it must honor) → one paragraph
2. SIMULATE pick a realistic persona + scenario; walk the workflow exactly as
written; answer its questions; produce/sketch the real output; log
every friction / ambiguity / dead-end / under-determined step
3. EVALUATE judge output + spec on 4 axes: intention · algorithm bar · quality
rules · agent-wiring + graceful degradation → pass/gap per axis
4. HARDEN edit the command .md (surgical; agent/reference only if the intention
requires it): fix dead-ends/contradictions/wrong-wiring + strengthen
prompts/questions/steps so output hits the bar. NO structural redesign.
5. VERIFY lint exit 0 + counts unchanged + node --test green where touched +
re-simulation passes the failing axis; record before/after in log.md
```
## Proposed session decomposition (to be confirmed/refined by `/trekplan`)
| Session | Scope | Notes |
|---|---|---|
| **S1** | **Method calibration** on one command (`quick` — simplest, highest-volume) → show output + harden diff → lock the method. Then **Start**: `onboarding`, `first-post`, `setup` | de-risks the method before scaling |
| **S2** | **Create I** (atomic short-form): `post`, `react` (+ `quick` if not finished in S1) | |
| **S3** | **Create II** (visual/video): `carousel`, `video`, `multiplatform` | mcp-image / aspect-ratio / SRT graceful degradation in focus |
| **S4** | **Create III**: `batch`, `newsletter` | newsletter = orchestration only (16-phase; gate-agents already reviewed in remediation); may spill into its own session |
| **S5** | **Engage**: `firsthour`, `calendar`, `pipeline` | state-mutation + publish + first-hour plan paths |
| **S6** | **Measure**: `import`, `report`, `analyze`, `audit`, `ab-test` | analytics CLI graceful degradation + directional A/B framing |
| **S7** | **Grow**: `strategy`, `competitive`, `monetize`, `outreach`, `profile` | ~1K soft-gating + tracked-pipeline + SEO paths |
| **S8** | **Front-doors + router**: `create`, `measure`, `linkedin` | delegation/routing surfaces — verify they route correctly to the now-hardened commands |
## Research Plan
**None.** `/trekresearch` is deliberately skipped. The external 2026 bar
(algorithm signals, analytics/publish boundaries, coverage-gap specs) was fully
researched and triangulated in the remediation phase and is frozen in
`references/algorithm-signals-reference.md` (the single source of truth) +
`docs/remediation/research/01-03`. Hardening is an **internal intention-fidelity**
exercise: it *applies* that established bar, it does not extend it. If a simulation
surfaces a genuinely new external question, it is logged as an Open Question for the
operator — not researched mid-phase.
## Open Questions / Assumptions
- **[ASSUMPTION]** `/trekplan` orders work as the 8 sessions above, one (or part of a)
journey per Voyage session, with `/trekreview` as the per-session release gate.
- **[ASSUMPTION]** `quick` is the S1 calibration command (operator did not object;
overridable next session).
- **[ASSUMPTION]** No `/trekresearch` (operator proposed-and-agreed to skip — the
bar is settled).
- **[ASSUMPTION]** Version stays v4.1.0 through the phase; an optional v4.2.0
"command hardening pass" minor bump is a phase-end decision, not pre-committed.
- **[OPEN]** Whether `newsletter` (16-phase) needs its own dedicated session — decide
during S4 once its orchestration simulation is scoped.
- **[OPEN]** Whether the operator wants the simulated outputs themselves preserved
(full text) in the log, or only the friction/verdict summary (token cost vs.
reviewability trade-off) — default: summary + output sketch, full output only when
it's the evidence for a gap.
## Prior Attempts
The baseline-audit remediation (`docs/remediation/`, S1S17, complete 2026-05-31,
commit `2633d32`) is the immediate predecessor. It closed every still-real audit
finding (correctness, honesty, orphan-agent wiring, dead lint, generalization) and
its S17 triage confirmed 0 still-real findings remain. No command has yet been
exercised end-to-end for **output quality against intention** — that gap is exactly
what this phase fills. The plugin is at v4.1.0, 29 commands / 19 agents, stable.
## Metadata
- **Created:** 2026-05-31
- **Interview turns:** 1 (three forks locked: simulation depth = hybrid; change
appetite = intention-fidelity + prompt-quality; cadence = per-journey, large groups split)
- **Auto-research opted in:** no (bar already settled in remediation)
- **Source:** operator-driven foundation session; execution begins next session
---
## How to continue
```bash
# Plan (this session — lays the foundation):
/trekplan --project docs/hardening/
# Then, from next session onward, one journey-group per session:
/trekcontinue --project docs/hardening/ # S1: method calibration (quick) + Start
# … gate each: test-runner.sh + node --test + /trekreview ALLOW → commit (own files) → push
```
Driven per the operating model: the operator types **«Les STATE.md og følg
instruksjonene.»**; Claude invokes the `/trek*` skills itself. STATE.md points at the
next session; this brief is the contract `/trekplan` consumes now.