New phase after the baseline-audit remediation (S1-S17, 2633d32, complete):
a command-hardening pass that simulates each of the 29 commands and tightens
it to its stated intention (intention-fidelity + prompt-quality only — no
structural redesign, no new features, no GUI/M0). Runs over ~8 journey-grouped
Voyage sessions, gated by lint + /trekreview ALLOW before push.
Foundation laid this session (execution starts next session):
- docs/hardening/brief.md (valid; 3 locked forks: hybrid simulation,
intention-fidelity+prompt-quality, per-journey cadence; research skipped —
the 2026 bar is frozen in algorithm-signals-reference.md)
- docs/hardening/plan.md (8 sessions, Grade B+ 87)
Adversarial review: brief-reviewer PROCEED_WITH_RISKS (3 findings folded:
self-graded quality axis, SC-H deferral contradiction, no stopping rule);
plan-critic REVISE (2 blockers + 10 major — all addressed: anchored coverage
predicate kills substring false-positives, full-session cold-reviewer oracle
on concrete before/after output, per-type mechanical predicate, Failed:0 gate
not literal-74, S1 HALT circuit-breaker); scope-guardian ALIGNED.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
15 KiB
| type | brief_version | created | task | slug | project_dir | research_topics | research_status | auto_research | interview_turns | source | phase_signals | |||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| trekbrief | 2.1 | 2026-05-31 | Command hardening pass — simulate every linkedin-studio command and harden it to its stated intention (intention-fidelity + prompt-quality), one command at a time, across ~8 journey-grouped sessions, before the GUI/M0 track | hardening | docs/hardening/ | 0 | skipped | false | 1 | interview |
|
Task: linkedin-studio command hardening
Generated 2026-05-31 (operator-driven, one clarification turn — three forks locked via AskUserQuestion). This brief is the contract between requirements and planning.
/trekplanreads it to produce the multi-session plan. Every decision in the plan must trace back to content here.Predecessor of record: the baseline-audit remediation (
docs/remediation/) is complete (S1–S17, last commit2633d32, clean ALLOW). That phase fixed structure, correctness, and honesty. This phase is the next, distinct layer: does each command actually deliver what it promises when run?
Intent
The remediation made every claim honest, wired every orphan agent, rebuilt the lint, and reconciled the algorithm bar. What it did not do is exercise each command's workflow end-to-end and judge the quality of what it produces against the command's own stated intention. A command can be structurally correct (right frontmatter, right agent wired, lint-green) and still under-deliver: a step that under-determines the next move, a question that yields a weak answer, a prompt that produces generic output, a missing graceful-degradation path.
This is a hardening phase: a deliberate, per-command pass that simulates a
realistic invocation, judges the result against intention + the 2026 algorithm
bar + the content-quality rules, and tightens the command definition where it
falls short. It is the last quality gate before the GUI/M0 track (the two
briefs docs/linkedin-studio-ui-brief.md + docs/linkedin-studio-persona-brief.md)
— hardening the engine before building a dashboard on top of it.
Division of labour (explicit): the operator tests the commands live over the coming weeks (real inputs, real friction). This phase is the complementary intention-fidelity pass from the definition side. The two converge through a field-notes inbox (below): the operator's real-world findings outrank the simulated ones and steer per-session priority.
Goal
For every one of the 29 command surfaces, produce: (1) a recorded intention, (2) a simulated run with a concrete persona, (3) an evaluation against four axes, (4) a hardened command definition where gaps were found, (5) a green verification. Leave the command set structurally identical (29/19/25/6) and the plugin in a state where each command reliably delivers its description's promise.
Non-Goals
- No structural redesign. No command merged, split, added, or removed; no new
command; no journey re-tiering. 14a already proved zero redundancy; the surface
count (29) and journey layer (v4.1.0) are fixed. *(Locked fork: "intention-fidelity
- prompt-quality", not "structural".)*
- No new features / capability accretion. Hardening tightens existing intention; it does not add surfaces the audit/remediation deliberately scoped out (no auto-publish, no dwell measurement, no analytics-API integration, no new agents).
- No GUI/M0 work. The dashboard, the per-user data-dir migration (M0), and the provider register are the next track and require their own mandate + brief. This phase only hardens the command engine they will sit on.
- No re-litigation of the algorithm bar.
references/algorithm-signals-reference.mdis the single source of truth (settled in remediation). Hardening applies it; it does not re-research or re-open it. No/trekresearchis run. - No version/count churn per session. Prompt/spec refinement is not a surface change; sessions do not bump the version or touch counts (S11–S16 refinement precedent). A single optional minor bump (v4.2.0 "command hardening pass") may be taken at phase end.
Constraints
- Opus on everything (sub-agents + orchestration). Max-discipline default.
- No hidden costs.
/trekplan,/trekcontinue,/trekreviewand any Workflow fan-out are cost-warned in plain text before running; the operator's standing yes (2026-05-30) covers the routine/trekcontinue·/trekreviewgate — run, don't block. - Gate before push (no WARN-override): per session,
bash scripts/test-runner.shexit 0 +node --testgreen where touched +/trekreviewALLOW → commit (own files only, explicit staging, nevergit add -u/-A) → push to Forgejo. In-session fix of the session's own misses = completion; genuine pre-existing design findings → next session. - Three-doc rule applies only if a hardening edit changes a feature/surface/count/
version (most will not — they refine prompt text). Pure prompt-quality edits are
fix:/refactor:and do not trigger the feat-gate. - Platform: bash 3.2, Node-only hooks, no npm deps in hooks/scanners.
- Untracked NOT-mine (never commit):
docs/linkedin-studio-persona-brief.md,docs/linkedin-studio-ui-brief.md,docs/voyage-build/progress.json.*.local.md.session-state.local.json+review.html+ STATE.md are gitignored.
Preferences
- Simulation = hybrid (locked fork). Per command: role-play a realistic invocation (concrete persona + scenario, mock answers), produce/sketch the actual output the command would generate, and cold-read the spec against intention. Harden from both signals — catches "spec is broken" and "output is mediocre".
- Personas: drawn from the plugin's real ICP and state — a solo Norwegian/English
AI-advisor creator (the author's own profile: ~1048 followers, "Validation" phase,
5 expertise pillars), plus at least one fresh adopter (no voice samples, no
analytics yet) to exercise progressive-onboarding + graceful-degradation paths. The
persona-briefmay be consulted (read-only) for richer personas; it is optional, not a hard dependency. - Session granularity (locked fork): one journey per session where small
(Start/Engage/Measure/Grow); split Create (8 commands incl. the 16-phase
newsletter) across 2–3 sessions. ~8 sessions total. - Field-notes inbox: before hardening a journey, check
docs/hardening/field-notes.local.mdfor the operator's live-test findings on those commands and let them steer priority; the operator's real friction outranks the simulated friction. - Per-command log: record intention → simulation (inputs + output sketch +
friction) → evaluation (4 axes) → harden diff → verify, in
docs/hardening/log.md, so the operator's parallel live-testing can cross-reference every change and its why.
Non-Functional Requirements
- No regression. After each session: lint exits 0 with unchanged counts
(Commands 29 · Agents 19 · Reference docs 25 · Skills 6); hooks
node --test98/98 and analytics 116/116 stay green if those surfaces are touched; cross-references (router ↔ commands ↔ agents ↔ skills) stay consistent. - Determinism. Every success criterion is falsifiable by a command or a recorded observation; the hardening log is the audit trail.
- Independence. The per-session
/trekreviewruns cold (independent reviewers), same as the remediation gate.
Success Criteria
Falsifiable per command and per phase.
Per command (the hardening unit):
- SC-A (intention recorded):
docs/hardening/log.mdhas, for the command, a one-paragraph intention distilled from its frontmatterdescription+ journey role- the content-quality rules + the algorithm signals it must honor.
- SC-B (simulated): a concrete persona + scenario, the mock inputs, and the produced/sketched output are recorded, with a friction log of every ambiguity, dead-end, or under-determined step encountered walking the workflow as written.
- SC-C (evaluated on 4 axes): the simulated output + spec carry an explicit
pass/gap verdict against (a) intention fidelity (does it deliver the
description's promise?), (b) algorithm bar (
algorithm-signals-reference.md), (c) content-quality rules (hook 110-140, length bands, no body links, no buzzwords, topic-relevance, topic-rotation), (d) agent-wiring + graceful degradation (rightsubagent_type; sensible fallback when an agent/tool/mcp/CLI is unavailable). - SC-D (hardened or deferred): every gap is either fixed (diff recorded in the log) or explicitly deferred with a rationale — no gap left unrecorded.
- SC-E (verified): after hardening,
test-runner.shexits 0 with unchanged counts,node --testis green where touched, and a re-read/re-simulation confirms the previously-failing axis now passes.
Per phase:
- SC-F (coverage): all 29 command surfaces (27 atomic + the 2 front-doors
create/measure) have a completelog.mdentry through SC-A…SC-E. - SC-G (no structural drift):
ls commands/*.md | wc -l== 29 throughout; no merge/split/add/remove; the lint's count/version guards stay green;grepfor any stale count returns 0. - SC-H (clean gate): every session was pushed on
/trekreviewALLOW (no WARN-override); the per-sessionreview.mdshows 0 open findings. - SC-I (operator findings honored): where
field-notes.local.mdcarries a live finding for a session's commands, it is addressed or explicitly triaged in that session's log.
Hardening method (the per-command contract /trekplan must encode)
1. INTENT distil the command's promise (description + journey role + quality
rules + algorithm signals it must honor) → one paragraph
2. SIMULATE pick a realistic persona + scenario; walk the workflow exactly as
written; answer its questions; produce/sketch the real output; log
every friction / ambiguity / dead-end / under-determined step
3. EVALUATE judge output + spec on 4 axes: intention · algorithm bar · quality
rules · agent-wiring + graceful degradation → pass/gap per axis
4. HARDEN edit the command .md (surgical; agent/reference only if the intention
requires it): fix dead-ends/contradictions/wrong-wiring + strengthen
prompts/questions/steps so output hits the bar. NO structural redesign.
5. VERIFY lint exit 0 + counts unchanged + node --test green where touched +
re-simulation passes the failing axis; record before/after in log.md
Proposed session decomposition (to be confirmed/refined by /trekplan)
| Session | Scope | Notes |
|---|---|---|
| S1 | Method calibration on one command (quick — simplest, highest-volume) → show output + harden diff → lock the method. Then Start: onboarding, first-post, setup |
de-risks the method before scaling |
| S2 | Create I (atomic short-form): post, react (+ quick if not finished in S1) |
|
| S3 | Create II (visual/video): carousel, video, multiplatform |
mcp-image / aspect-ratio / SRT graceful degradation in focus |
| S4 | Create III: batch, newsletter |
newsletter = orchestration only (16-phase; gate-agents already reviewed in remediation); may spill into its own session |
| S5 | Engage: firsthour, calendar, pipeline |
state-mutation + publish + first-hour plan paths |
| S6 | Measure: import, report, analyze, audit, ab-test |
analytics CLI graceful degradation + directional A/B framing |
| S7 | Grow: strategy, competitive, monetize, outreach, profile |
~1K soft-gating + tracked-pipeline + SEO paths |
| S8 | Front-doors + router: create, measure, linkedin |
delegation/routing surfaces — verify they route correctly to the now-hardened commands |
Research Plan
None. /trekresearch is deliberately skipped. The external 2026 bar
(algorithm signals, analytics/publish boundaries, coverage-gap specs) was fully
researched and triangulated in the remediation phase and is frozen in
references/algorithm-signals-reference.md (the single source of truth) +
docs/remediation/research/01-03. Hardening is an internal intention-fidelity
exercise: it applies that established bar, it does not extend it. If a simulation
surfaces a genuinely new external question, it is logged as an Open Question for the
operator — not researched mid-phase.
Open Questions / Assumptions
- [ASSUMPTION]
/trekplanorders work as the 8 sessions above, one (or part of a) journey per Voyage session, with/trekreviewas the per-session release gate. - [ASSUMPTION]
quickis the S1 calibration command (operator did not object; overridable next session). - [ASSUMPTION] No
/trekresearch(operator proposed-and-agreed to skip — the bar is settled). - [ASSUMPTION] Version stays v4.1.0 through the phase; an optional v4.2.0 "command hardening pass" minor bump is a phase-end decision, not pre-committed.
- [OPEN] Whether
newsletter(16-phase) needs its own dedicated session — decide during S4 once its orchestration simulation is scoped. - [OPEN] Whether the operator wants the simulated outputs themselves preserved (full text) in the log, or only the friction/verdict summary (token cost vs. reviewability trade-off) — default: summary + output sketch, full output only when it's the evidence for a gap.
Prior Attempts
The baseline-audit remediation (docs/remediation/, S1–S17, complete 2026-05-31,
commit 2633d32) is the immediate predecessor. It closed every still-real audit
finding (correctness, honesty, orphan-agent wiring, dead lint, generalization) and
its S17 triage confirmed 0 still-real findings remain. No command has yet been
exercised end-to-end for output quality against intention — that gap is exactly
what this phase fills. The plugin is at v4.1.0, 29 commands / 19 agents, stable.
Metadata
- Created: 2026-05-31
- Interview turns: 1 (three forks locked: simulation depth = hybrid; change appetite = intention-fidelity + prompt-quality; cadence = per-journey, large groups split)
- Auto-research opted in: no (bar already settled in remediation)
- Source: operator-driven foundation session; execution begins next session
How to continue
# Plan (this session — lays the foundation):
/trekplan --project docs/hardening/
# Then, from next session onward, one journey-group per session:
/trekcontinue --project docs/hardening/ # S1: method calibration (quick) + Start
# … gate each: test-runner.sh + node --test + /trekreview ALLOW → commit (own files) → push
Driven per the operating model: the operator types «Les STATE.md og følg
instruksjonene.»; Claude invokes the /trek* skills itself. STATE.md points at the
next session; this brief is the contract /trekplan consumes now.