New phase after the baseline-audit remediation (S1-S17, 2633d32, complete):
a command-hardening pass that simulates each of the 29 commands and tightens
it to its stated intention (intention-fidelity + prompt-quality only — no
structural redesign, no new features, no GUI/M0). Runs over ~8 journey-grouped
Voyage sessions, gated by lint + /trekreview ALLOW before push.
Foundation laid this session (execution starts next session):
- docs/hardening/brief.md (valid; 3 locked forks: hybrid simulation,
intention-fidelity+prompt-quality, per-journey cadence; research skipped —
the 2026 bar is frozen in algorithm-signals-reference.md)
- docs/hardening/plan.md (8 sessions, Grade B+ 87)
Adversarial review: brief-reviewer PROCEED_WITH_RISKS (3 findings folded:
self-graded quality axis, SC-H deferral contradiction, no stopping rule);
plan-critic REVISE (2 blockers + 10 major — all addressed: anchored coverage
predicate kills substring false-positives, full-session cold-reviewer oracle
on concrete before/after output, per-type mechanical predicate, Failed:0 gate
not literal-74, S1 HALT circuit-breaker); scope-guardian ALIGNED.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
267 lines
15 KiB
Markdown
267 lines
15 KiB
Markdown
---
|
||
type: trekbrief
|
||
brief_version: "2.1"
|
||
created: 2026-05-31
|
||
task: "Command hardening pass — simulate every linkedin-studio command and harden it to its stated intention (intention-fidelity + prompt-quality), one command at a time, across ~8 journey-grouped sessions, before the GUI/M0 track"
|
||
slug: hardening
|
||
project_dir: docs/hardening/
|
||
research_topics: 0
|
||
research_status: skipped
|
||
auto_research: false
|
||
interview_turns: 1
|
||
source: interview
|
||
phase_signals:
|
||
- phase: plan
|
||
effort: high
|
||
model: opus
|
||
- phase: execute
|
||
effort: high
|
||
model: opus
|
||
- phase: review
|
||
effort: high
|
||
model: opus
|
||
---
|
||
|
||
# Task: linkedin-studio command hardening
|
||
|
||
> Generated 2026-05-31 (operator-driven, one clarification turn — three forks
|
||
> locked via AskUserQuestion). This brief is the contract between requirements and
|
||
> planning. `/trekplan` reads it to produce the multi-session plan. Every decision
|
||
> in the plan must trace back to content here.
|
||
>
|
||
> **Predecessor of record:** the baseline-audit remediation (`docs/remediation/`)
|
||
> is **complete** (S1–S17, last commit `2633d32`, clean ALLOW). That phase fixed
|
||
> *structure, correctness, and honesty*. This phase is the next, distinct layer:
|
||
> does each command actually **deliver what it promises** when run?
|
||
|
||
## Intent
|
||
|
||
The remediation made every claim honest, wired every orphan agent, rebuilt the
|
||
lint, and reconciled the algorithm bar. What it did **not** do is exercise each
|
||
command's *workflow* end-to-end and judge the **quality of what it produces**
|
||
against the command's own stated intention. A command can be structurally correct
|
||
(right frontmatter, right agent wired, lint-green) and still under-deliver: a step
|
||
that under-determines the next move, a question that yields a weak answer, a prompt
|
||
that produces generic output, a missing graceful-degradation path.
|
||
|
||
This is a **hardening phase**: a deliberate, per-command pass that simulates a
|
||
realistic invocation, judges the result against intention + the 2026 algorithm
|
||
bar + the content-quality rules, and tightens the command definition where it
|
||
falls short. It is the last quality gate **before** the GUI/M0 track (the two
|
||
briefs `docs/linkedin-studio-ui-brief.md` + `docs/linkedin-studio-persona-brief.md`)
|
||
— hardening the engine before building a dashboard on top of it.
|
||
|
||
**Division of labour (explicit):** the operator tests the commands **live** over
|
||
the coming weeks (real inputs, real friction). This phase is the **complementary**
|
||
intention-fidelity pass from the definition side. The two converge through a
|
||
field-notes inbox (below): the operator's real-world findings outrank the
|
||
simulated ones and steer per-session priority.
|
||
|
||
## Goal
|
||
|
||
For every one of the 29 command surfaces, produce: (1) a recorded **intention**,
|
||
(2) a **simulated** run with a concrete persona, (3) an **evaluation** against four
|
||
axes, (4) a **hardened** command definition where gaps were found, (5) a green
|
||
**verification**. Leave the command set structurally identical (29/19/25/6) and the
|
||
plugin in a state where each command reliably delivers its description's promise.
|
||
|
||
## Non-Goals
|
||
|
||
- **No structural redesign.** No command merged, split, added, or removed; no new
|
||
command; no journey re-tiering. 14a already proved zero redundancy; the surface
|
||
count (29) and journey layer (v4.1.0) are fixed. *(Locked fork: "intention-fidelity
|
||
+ prompt-quality", not "structural".)*
|
||
- **No new features / capability accretion.** Hardening tightens existing
|
||
intention; it does not add surfaces the audit/remediation deliberately scoped out
|
||
(no auto-publish, no dwell measurement, no analytics-API integration, no new agents).
|
||
- **No GUI/M0 work.** The dashboard, the per-user data-dir migration (M0), and the
|
||
provider register are the *next* track and require their own mandate + brief. This
|
||
phase only hardens the command engine they will sit on.
|
||
- **No re-litigation of the algorithm bar.** `references/algorithm-signals-reference.md`
|
||
is the single source of truth (settled in remediation). Hardening *applies* it; it
|
||
does not re-research or re-open it. No `/trekresearch` is run.
|
||
- **No version/count churn per session.** Prompt/spec refinement is not a surface
|
||
change; sessions do not bump the version or touch counts (S11–S16 refinement
|
||
precedent). A single optional minor bump (v4.2.0 "command hardening pass") may be
|
||
taken at phase end.
|
||
|
||
## Constraints
|
||
|
||
- **Opus on everything** (sub-agents + orchestration). Max-discipline default.
|
||
- **No hidden costs.** `/trekplan`, `/trekcontinue`, `/trekreview` and any Workflow
|
||
fan-out are cost-warned in plain text before running; the operator's standing yes
|
||
(2026-05-30) covers the routine `/trekcontinue`·`/trekreview` gate — run, don't block.
|
||
- **Gate before push (no WARN-override):** per session, `bash scripts/test-runner.sh`
|
||
exit 0 + `node --test` green where touched + `/trekreview` **ALLOW** → commit (own
|
||
files only, explicit staging, never `git add -u/-A`) → push to Forgejo. In-session
|
||
fix of the session's *own* misses = completion; genuine pre-existing design findings
|
||
→ next session.
|
||
- **Three-doc rule** applies only if a hardening edit changes a feature/surface/count/
|
||
version (most will not — they refine prompt text). Pure prompt-quality edits are
|
||
`fix:`/`refactor:` and do not trigger the feat-gate.
|
||
- **Platform:** bash 3.2, Node-only hooks, no npm deps in hooks/scanners.
|
||
- **Untracked NOT-mine (never commit):** `docs/linkedin-studio-persona-brief.md`,
|
||
`docs/linkedin-studio-ui-brief.md`, `docs/voyage-build/progress.json`. `*.local.md`
|
||
+ `.session-state.local.json` + `review.html` + STATE.md are gitignored.
|
||
|
||
## Preferences
|
||
|
||
- **Simulation = hybrid (locked fork).** Per command: role-play a realistic
|
||
invocation (concrete persona + scenario, mock answers), produce/sketch the actual
|
||
output the command would generate, **and** cold-read the spec against intention.
|
||
Harden from both signals — catches "spec is broken" *and* "output is mediocre".
|
||
- **Personas:** drawn from the plugin's real ICP and state — a solo Norwegian/English
|
||
AI-advisor creator (the author's own profile: ~1048 followers, "Validation" phase,
|
||
5 expertise pillars), plus at least one **fresh adopter** (no voice samples, no
|
||
analytics yet) to exercise progressive-onboarding + graceful-degradation paths. The
|
||
`persona-brief` may be consulted (read-only) for richer personas; it is optional, not
|
||
a hard dependency.
|
||
- **Session granularity (locked fork):** one journey per session where small
|
||
(Start/Engage/Measure/Grow); split **Create** (8 commands incl. the 16-phase
|
||
`newsletter`) across 2–3 sessions. ~8 sessions total.
|
||
- **Field-notes inbox:** before hardening a journey, check
|
||
`docs/hardening/field-notes.local.md` for the operator's live-test findings on those
|
||
commands and let them steer priority; the operator's real friction outranks the
|
||
simulated friction.
|
||
- **Per-command log:** record intention → simulation (inputs + output sketch +
|
||
friction) → evaluation (4 axes) → harden diff → verify, in `docs/hardening/log.md`,
|
||
so the operator's parallel live-testing can cross-reference every change and its why.
|
||
|
||
## Non-Functional Requirements
|
||
|
||
- **No regression.** After each session: lint exits 0 with **unchanged counts**
|
||
(Commands 29 · Agents 19 · Reference docs 25 · Skills 6); hooks `node --test` 98/98
|
||
and analytics 116/116 stay green if those surfaces are touched; cross-references
|
||
(router ↔ commands ↔ agents ↔ skills) stay consistent.
|
||
- **Determinism.** Every success criterion is falsifiable by a command or a recorded
|
||
observation; the hardening log is the audit trail.
|
||
- **Independence.** The per-session `/trekreview` runs cold (independent reviewers),
|
||
same as the remediation gate.
|
||
|
||
## Success Criteria
|
||
|
||
*Falsifiable per command and per phase.*
|
||
|
||
**Per command (the hardening unit):**
|
||
- **SC-A (intention recorded):** `docs/hardening/log.md` has, for the command, a
|
||
one-paragraph intention distilled from its frontmatter `description` + journey role
|
||
+ the content-quality rules + the algorithm signals it must honor.
|
||
- **SC-B (simulated):** a concrete persona + scenario, the mock inputs, and the
|
||
produced/sketched output are recorded, with a friction log of every ambiguity,
|
||
dead-end, or under-determined step encountered walking the workflow as written.
|
||
- **SC-C (evaluated on 4 axes):** the simulated output + spec carry an explicit
|
||
pass/gap verdict against (a) **intention fidelity** (does it deliver the
|
||
description's promise?), (b) **algorithm bar** (`algorithm-signals-reference.md`),
|
||
(c) **content-quality rules** (hook 110-140, length bands, no body links, no
|
||
buzzwords, topic-relevance, topic-rotation), (d) **agent-wiring + graceful
|
||
degradation** (right `subagent_type`; sensible fallback when an agent/tool/mcp/CLI
|
||
is unavailable).
|
||
- **SC-D (hardened or deferred):** every gap is either fixed (diff recorded in the
|
||
log) or explicitly deferred with a rationale — **no gap left unrecorded**.
|
||
- **SC-E (verified):** after hardening, `test-runner.sh` exits 0 with unchanged
|
||
counts, `node --test` is green where touched, and a re-read/re-simulation confirms
|
||
the previously-failing axis now passes.
|
||
|
||
**Per phase:**
|
||
- **SC-F (coverage):** all 29 command surfaces (27 atomic + the 2 front-doors
|
||
`create`/`measure`) have a complete `log.md` entry through SC-A…SC-E.
|
||
- **SC-G (no structural drift):** `ls commands/*.md | wc -l` == 29 throughout; no
|
||
merge/split/add/remove; the lint's count/version guards stay green; `grep` for any
|
||
stale count returns 0.
|
||
- **SC-H (clean gate):** every session was pushed on `/trekreview` **ALLOW** (no
|
||
WARN-override); the per-session `review.md` shows 0 open findings.
|
||
- **SC-I (operator findings honored):** where `field-notes.local.md` carries a live
|
||
finding for a session's commands, it is addressed or explicitly triaged in that
|
||
session's log.
|
||
|
||
## Hardening method (the per-command contract `/trekplan` must encode)
|
||
|
||
```
|
||
1. INTENT distil the command's promise (description + journey role + quality
|
||
rules + algorithm signals it must honor) → one paragraph
|
||
2. SIMULATE pick a realistic persona + scenario; walk the workflow exactly as
|
||
written; answer its questions; produce/sketch the real output; log
|
||
every friction / ambiguity / dead-end / under-determined step
|
||
3. EVALUATE judge output + spec on 4 axes: intention · algorithm bar · quality
|
||
rules · agent-wiring + graceful degradation → pass/gap per axis
|
||
4. HARDEN edit the command .md (surgical; agent/reference only if the intention
|
||
requires it): fix dead-ends/contradictions/wrong-wiring + strengthen
|
||
prompts/questions/steps so output hits the bar. NO structural redesign.
|
||
5. VERIFY lint exit 0 + counts unchanged + node --test green where touched +
|
||
re-simulation passes the failing axis; record before/after in log.md
|
||
```
|
||
|
||
## Proposed session decomposition (to be confirmed/refined by `/trekplan`)
|
||
|
||
| Session | Scope | Notes |
|
||
|---|---|---|
|
||
| **S1** | **Method calibration** on one command (`quick` — simplest, highest-volume) → show output + harden diff → lock the method. Then **Start**: `onboarding`, `first-post`, `setup` | de-risks the method before scaling |
|
||
| **S2** | **Create I** (atomic short-form): `post`, `react` (+ `quick` if not finished in S1) | |
|
||
| **S3** | **Create II** (visual/video): `carousel`, `video`, `multiplatform` | mcp-image / aspect-ratio / SRT graceful degradation in focus |
|
||
| **S4** | **Create III**: `batch`, `newsletter` | newsletter = orchestration only (16-phase; gate-agents already reviewed in remediation); may spill into its own session |
|
||
| **S5** | **Engage**: `firsthour`, `calendar`, `pipeline` | state-mutation + publish + first-hour plan paths |
|
||
| **S6** | **Measure**: `import`, `report`, `analyze`, `audit`, `ab-test` | analytics CLI graceful degradation + directional A/B framing |
|
||
| **S7** | **Grow**: `strategy`, `competitive`, `monetize`, `outreach`, `profile` | ~1K soft-gating + tracked-pipeline + SEO paths |
|
||
| **S8** | **Front-doors + router**: `create`, `measure`, `linkedin` | delegation/routing surfaces — verify they route correctly to the now-hardened commands |
|
||
|
||
## Research Plan
|
||
|
||
**None.** `/trekresearch` is deliberately skipped. The external 2026 bar
|
||
(algorithm signals, analytics/publish boundaries, coverage-gap specs) was fully
|
||
researched and triangulated in the remediation phase and is frozen in
|
||
`references/algorithm-signals-reference.md` (the single source of truth) +
|
||
`docs/remediation/research/01-03`. Hardening is an **internal intention-fidelity**
|
||
exercise: it *applies* that established bar, it does not extend it. If a simulation
|
||
surfaces a genuinely new external question, it is logged as an Open Question for the
|
||
operator — not researched mid-phase.
|
||
|
||
## Open Questions / Assumptions
|
||
|
||
- **[ASSUMPTION]** `/trekplan` orders work as the 8 sessions above, one (or part of a)
|
||
journey per Voyage session, with `/trekreview` as the per-session release gate.
|
||
- **[ASSUMPTION]** `quick` is the S1 calibration command (operator did not object;
|
||
overridable next session).
|
||
- **[ASSUMPTION]** No `/trekresearch` (operator proposed-and-agreed to skip — the
|
||
bar is settled).
|
||
- **[ASSUMPTION]** Version stays v4.1.0 through the phase; an optional v4.2.0
|
||
"command hardening pass" minor bump is a phase-end decision, not pre-committed.
|
||
- **[OPEN]** Whether `newsletter` (16-phase) needs its own dedicated session — decide
|
||
during S4 once its orchestration simulation is scoped.
|
||
- **[OPEN]** Whether the operator wants the simulated outputs themselves preserved
|
||
(full text) in the log, or only the friction/verdict summary (token cost vs.
|
||
reviewability trade-off) — default: summary + output sketch, full output only when
|
||
it's the evidence for a gap.
|
||
|
||
## Prior Attempts
|
||
|
||
The baseline-audit remediation (`docs/remediation/`, S1–S17, complete 2026-05-31,
|
||
commit `2633d32`) is the immediate predecessor. It closed every still-real audit
|
||
finding (correctness, honesty, orphan-agent wiring, dead lint, generalization) and
|
||
its S17 triage confirmed 0 still-real findings remain. No command has yet been
|
||
exercised end-to-end for **output quality against intention** — that gap is exactly
|
||
what this phase fills. The plugin is at v4.1.0, 29 commands / 19 agents, stable.
|
||
|
||
## Metadata
|
||
|
||
- **Created:** 2026-05-31
|
||
- **Interview turns:** 1 (three forks locked: simulation depth = hybrid; change
|
||
appetite = intention-fidelity + prompt-quality; cadence = per-journey, large groups split)
|
||
- **Auto-research opted in:** no (bar already settled in remediation)
|
||
- **Source:** operator-driven foundation session; execution begins next session
|
||
|
||
---
|
||
|
||
## How to continue
|
||
|
||
```bash
|
||
# Plan (this session — lays the foundation):
|
||
/trekplan --project docs/hardening/
|
||
|
||
# Then, from next session onward, one journey-group per session:
|
||
/trekcontinue --project docs/hardening/ # S1: method calibration (quick) + Start
|
||
# … gate each: test-runner.sh + node --test + /trekreview ALLOW → commit (own files) → push
|
||
```
|
||
|
||
Driven per the operating model: the operator types **«Les STATE.md og følg
|
||
instruksjonene.»**; Claude invokes the `/trek*` skills itself. STATE.md points at the
|
||
next session; this brief is the contract `/trekplan` consumes now.
|