Kjell Tore Guttormsen 2f90880f7a docs(linkedin-studio): hardening-phase foundation (brief + adversarially-reviewed plan)

New phase after the baseline-audit remediation (S1-S17, 2633d32, complete):
a command-hardening pass that simulates each of the 29 commands and tightens
it to its stated intention (intention-fidelity + prompt-quality only — no
structural redesign, no new features, no GUI/M0). Runs over ~8 journey-grouped
Voyage sessions, gated by lint + /trekreview ALLOW before push.

Foundation laid this session (execution starts next session):
- docs/hardening/brief.md (valid; 3 locked forks: hybrid simulation,
  intention-fidelity+prompt-quality, per-journey cadence; research skipped —
  the 2026 bar is frozen in algorithm-signals-reference.md)
- docs/hardening/plan.md (8 sessions, Grade B+ 87)

Adversarial review: brief-reviewer PROCEED_WITH_RISKS (3 findings folded:
self-graded quality axis, SC-H deferral contradiction, no stopping rule);
plan-critic REVISE (2 blockers + 10 major — all addressed: anchored coverage
predicate kills substring false-positives, full-session cold-reviewer oracle
on concrete before/after output, per-type mechanical predicate, Failed:0 gate
not literal-74, S1 HALT circuit-breaker); scope-guardian ALIGNED.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

2026-05-31 05:31:52 +02:00

43 KiB

Raw Blame History

LinkedIn Studio — Command Hardening Plan

Plan quality: B+ (87/100) — APPROVE_WITH_NOTES (post-revision; plan-critic REVISE blockers + majors addressed — see Revisions)

Generated by trekplan v2.0 on 2026-05-31 — plan_version: 1.7

Source brief: docs/hardening/brief.md. Predecessor: docs/remediation/ (S1–S17, complete, commit 2633d32). Research: none (research_status: skipped — the 2026 algorithm bar was triangulated in remediation research 01–03 and is frozen in references/algorithm-signals-reference.md).

Context

The remediation made every claim honest, wired every orphan agent, rebuilt the lint, and reconciled the algorithm bar — structure, correctness, honesty. It did not exercise each command's workflow end-to-end and judge the quality of what it produces against the command's own stated intention. A command can be structurally correct (right frontmatter, right agent wired, lint-green) and still under-deliver: a step that under-determines the next move, a question that yields a weak answer, a prompt that produces generic output, a missing graceful-degradation path.

This is a hardening phase: a per-command pass that simulates a realistic invocation, judges the result against intention + the 2026 algorithm bar + the content-quality rules, and tightens the command definition where it falls short. It is the last quality gate before the GUI/M0 track. The operator tests the commands live in parallel; this phase is the complementary intention-fidelity pass from the definition side, converging through a field-notes inbox.

Scope is deliberately bounded (locked forks): intention-fidelity + prompt-quality only — no structural redesign, no new features, no GUI/M0, no re-litigation of the algorithm bar. The command set stays 29/19/25/6.

Architecture Diagram

graph TD
    subgraph "Hardening pass — 5-step method per command, 8 journey-grouped sessions"
        M["docs/hardening/brief.md<br/>(method contract)"] --> LOG["docs/hardening/log.md<br/>(per-command audit trail,<br/>unique anchor per entry)"]
        FN["field-notes.local.md<br/>(operator live findings, optional)"] -.steers priority.-> LOG
        REF["references/algorithm-signals-reference.md<br/>+ content-quality rules<br/>(the frozen bar)"] --> EVAL

        subgraph "Per command"
            INT["1 INTENT"] --> SIM["2 SIMULATE<br/>(persona + CONCRETE output)"]
            SIM --> EVAL["3 EVALUATE<br/>4 axes (mechanical predicate per type)"]
            EVAL --> HARD["4 HARDEN<br/>(edit commands/&lt;x&gt;.md, surgical)"]
            HARD --> VER["5 VERIFY<br/>lint + before/after + counts"]
        end

        VER --> CMD["commands/*.md (29)"]
        CMD -. invokes .-> AG["agents/*.md (19)"]
        VER --> GATE["per-session gate:<br/>test-runner.sh (Failed:0) + node --test<br/>+ /trekreview ALLOW — reviewer adjudicates<br/>EVERY hardened command's before/after"]
        GATE --> PUSH["commit own files → push (Forgejo)"]
    end

Codebase Analysis

Tech stack: Markdown command/agent definitions (Claude Code plugin); Node.js (.mjs) hooks + helpers (zero deps); TypeScript analytics CLI (scripts/analytics/, tsx); bash structural lint (scripts/test-runner.sh, set -e, dynamic PASS/FAIL counters, EXPECT_* count-guards); node:test.
Key patterns: commands invoke agents via subagent_type: linkedin-studio:<name> (namespaced) — sometimes as a literal frontmatter-style line, sometimes via a --type → subagent_type table (e.g. headless-review.md); content commands auto-copy to clipboard via clipboard-helper.mjs; PreToolUse content-quality gate (hooks/prompts/content-quality-gate.md); progressive onboarding (score hidden < 3 posts, voice guardian < 5 samples).
Relevant files (verified, 216 source files):
- commands/*.md — 29 surfaces (inventory confirmed). Agent wiring per command:
  - with subagent_type: line: ab-test→content-optimizer · analyze/report→analytics-interpreter · batch/pipeline→content-planner,trend-spotter · calendar→post-feedback-monitor · carousel/quick/react→differentiation-checker · firsthour→engagement-coach,post-feedback-monitor · newsletter→content-repurposer,editorial-reviewer,fact-checker (+ persona/voice/headless deeper) · outreach→network-builder · post→content-optimizer,differentiation-checker · setup→voice-trainer · strategy→strategy-advisor · video→differentiation-checker,video-scripter
  - invokes via a --type → subagent_type table (not a bare line): headless-review → content-reviewer/language-reviewer/fact-reviewer/persona-reviewer (CONFIRMED present, headless-review.md:77-79,141-142 — wiring is correct, just a different syntactic form).
  - no agent (prose/CLI/routing — verify intention + a type-specific mechanical predicate, not wiring): audit, competitive, create, first-post, import, linkedin, measure, monetize, multiplatform, onboarding, pivot, profile.
- references/algorithm-signals-reference.md — the single source of truth (the bar).
- references/*.md (25), skills/linkedin-studio/SKILL.md, hooks/prompts/content-quality-gate.md, scripts/test-runner.sh, hooks/scripts/state-updater.mjs (recordFirstHourPlan, recordOutreachContact), scripts/analytics/ (parseOptionalCount).
Reusable code: the hardening method (brief §"Hardening method"); the four mechanically-checkable content-quality rules (hook 110–140 chars, length bands, no body links, no buzzwords); the S17 disposition record as the log model.
External tech (researched): none — the bar is frozen; no /trekresearch.
Recent git activity: S17 (2633d32) closed remediation; plugin v4.1.0, stable, sync 0/0.

Implementation Plan

Unit of work = one command (operator: "én og én kommando"). For tractable execution + per-session checkpoints, the 29 commands are grouped into 8 journey-sessions; each session-step applies the 5-step method to every command it lists and records the result in docs/hardening/log.md.

Log-entry anchor convention (closes plan-critic blockers #1/#2 — substring false-positives): every log.md entry begins with a UNIQUE, anchored header of the exact form ### /linkedin:<command-name> — <one-line intent>. Coverage and per-step checks grep the anchored header (^### /linkedin:<name>), never the bare word — so "measure" no longer collides with the "Measure journey" label, nor "post" with "post-publish".

The per-command procedure for every command in every step:

INTENT — distil the promise (frontmatter description + journey role + quality rules + algorithm signals it must honor) → one paragraph under the command's anchored log.md header.

SIMULATE — pick a concrete persona (default: the plugin's ICP — a solo AI-advisor creator, ~1048 followers, "Validation", 5 pillars; PLUS a fresh adopter with no voice samples/analytics for graceful-degradation + progressive-onboarding paths). Walk the workflow exactly as written; answer its questions; produce the CONCRETE output the command would generate from those exact mock inputs (the actual draft/report/plan/routing-decision text — reproducible from the inputs, NOT a vague paraphrase). Log every friction/dead-end/under-determined step. This concrete before output is the artifact the oracle adjudicates.

EVALUATE on 4 axes → pass/gap each:

(a) intention fidelity — does it deliver the description's promise? (judgment)

(b) algorithm bar — consistent with algorithm-signals-reference.md? (cite)

(c) MECHANICAL predicate (hard, per command — never "N/A → judgment"):

emits a post/hook (post, quick, react, carousel, video, batch, pipeline, first-post): hook 110–140 chars · length band (1,200–1,800 std / 150–500 quick) · grep no body link · grep no banned buzzword · topic maps to the 5 pillars.

routing/front-door (linkedin, create, measure): every routing target the simulation emits resolves to a real commands/<x>.md (grep).

analytics/CLI (import, report, analyze, audit, ab-test): the documented graceful-degradation path is present (grep) and the directional/saves honesty wording is intact.

guided/stateful (onboarding, setup, calendar, firsthour, monetize, outreach, strategy, competitive, profile, multiplatform, newsletter, headless-review, pivot): the command's PRIMARY promised artifact is actually produced in the simulation (draft / plan / funnel / analysis / checklist / routing) and there is no dead-end hand-off string; for agent-invoking ones, the promised subagent_type targets resolve to real agents/<x>.md.

(d) agent-wiring + graceful degradation — right subagent_type (literal line OR --type table); sensible fallback when an agent/tool/mcp/CLI is unavailable.

HARDEN — edit commands/<x>.md (surgical; agent/reference only if intention requires). Stopping rule (anti-gold-plating): harden until every axis returns pass OR a recorded deferral; do NOT add NICE-only polish beyond axis-pass. No structural redesign. Record the concrete after output for every hardened command.

VERIFY — test-runner.sh exit 0 + Failed: 0 + counts unchanged (29/19/25/6)

node --test green where touched + the before/after pair shows the failing axis now passes. Record before/after + disposition under the command's log.md anchor.

Independent oracle (closes plan-critic #3/#12 — self-grading): the per-session /trekreview code-correctness reviewer reads, in log.md, the before/after concrete output of EVERY command hardened that session (sessions are ≤5 commands — tractable) and independently judges whether each claimed gap is genuinely closed. The cold reviewer — not the author — signs off. A session is not ALLOW until every hardened command's before/after has been adjudicated.

Manifest scope (closes plan-critic #10): manifests anchor on the guaranteed artifact — the log.md entry with its unique per-command anchor (proving the command was processed). The actual command-file hardening edits are verified by the per-session /trekreview, which reads the real diffs. Manifest proves processing; review proves the edit. (A command may legitimately need zero edits if it passes all axes — so "command file modified" cannot be a manifest predicate.)

Prior-disposition rule (closes plan-critic #11): where a step relies on a prior session's disposition (S15-B1, S17 F-VIDEO, overlap-measurement, the CWD fix), spot-confirm it still holds with one grep/read before relying on it — never "do not re-verify".

Step 1: Method calibration on `quick`, then harden the Start journey

Files: commands/quick.md, commands/onboarding.md, commands/first-post.md, commands/setup.md, docs/hardening/log.md (new), docs/hardening/field-notes.local.md (read-if-present)
Changes: First run the full 5-step method on commands/quick.md ALONE as a method calibration — produce the concrete before/after output, present the diff + the log.md entry shape to the operator, and lock the method before scaling. Then apply the method to the Start journey: onboarding (spot-confirm the S15-B1 inline-draft still delivers, then re-evaluate), first-post, setup (voice-trainer wiring + 5-pillar capture). Create docs/hardening/log.md; one anchored entry per command (### /linkedin:<name> — …). Harden each .md only where an axis gaps. (log.md is a new file.)
Reuses: the hardening method (docs/hardening/brief.md); references/algorithm-signals-reference.md; the mechanical content-quality rules (hooks/prompts/content-quality-gate.md); persona basis from the state file + voice-samples; the S17 record (docs/remediation/c13-c46-triage.md) as the log model.
Verify:
- bash scripts/test-runner.sh; echo "exit=$?" → expected: output contains Failed: 0 and exit=0 (gate on Failed: 0 + exit 0 + unchanged EXPECT_* counts, NOT the literal Passed: 74 — it is incidental and set -e-fragile).
- ls commands/*.md | wc -l → 29.
- for c in quick onboarding first-post setup; do grep -qE "^### /linkedin:$c" docs/hardening/log.md || echo "MISSING $c"; done → no output.
On failure: HALT — S1 is the method-calibration gate that gates S2–S8 (hard dependency, not advisory). If calibration reveals the method itself is unsound, do NOT proceed to S2; revert command edits (git checkout -- commands/quick.md commands/onboarding.md commands/first-post.md commands/setup.md, keep log.md), and re-lock the method WITH THE OPERATOR before any later session runs.
Checkpoint: git commit -m "fix(linkedin-studio): S1 hardening — method calibration (quick) + Start journey"

Manifest:

manifest:
  expected_paths:
    - docs/hardening/log.md
  min_file_count: 1
  commit_message_pattern: "^fix\\(linkedin-studio\\): S1 hardening"
  bash_syntax_check: []
  forbidden_paths:
    - docs/linkedin-studio-persona-brief.md
    - docs/linkedin-studio-ui-brief.md
    - docs/voyage-build/progress.json
  must_contain:
    - path: docs/hardening/log.md
      pattern: "^### /linkedin:quick"

Step 2: Harden Create I (atomic short-form)

quick is COMPLETE after S1 and is NOT re-hardened here (closes plan-critic #16). Session-grouping refinement (closes plan-critic #6): the brief's session table was explicitly "to be confirmed/refined by /trekplan". multiplatform moves to S2 (atomic cross-format short-form), batch to S3 (with the visual cluster), headless-review+pivot to S4 (the long-form cluster with newsletter). Coverage preserved — 29, none dropped/doubled.

Files: commands/post.md, commands/react.md, commands/multiplatform.md, docs/hardening/log.md
Changes: Apply the 5-step method to post (content-optimizer + differentiation-checker; hook/length mechanical pass), react (URL→post; de-AI gate), multiplatform (cross-format adaptation; confirm long-form correctly routes to /linkedin:newsletter). Append anchored entries; harden where an axis gaps.
Reuses: hardening method; algorithm-signals-reference.md; content-quality gate; the differentiation-checker wiring pattern from S1.
Verify:
- bash scripts/test-runner.sh; echo "exit=$?" → Failed: 0 + exit=0 + counts unchanged.
- for c in post react multiplatform; do grep -qE "^### /linkedin:$c" docs/hardening/log.md || echo "MISSING $c"; done → no output.
On failure: revert — git checkout -- commands/post.md commands/react.md commands/multiplatform.md (keep log.md); if /trekreview ≠ ALLOW, HALT until resolved.
Checkpoint: git commit -m "fix(linkedin-studio): S2 hardening — Create I (post, react, multiplatform)"

Manifest:

manifest:
  expected_paths:
    - docs/hardening/log.md
  min_file_count: 1
  commit_message_pattern: "^fix\\(linkedin-studio\\): S2 hardening"
  bash_syntax_check: []
  forbidden_paths:
    - docs/linkedin-studio-persona-brief.md
    - docs/linkedin-studio-ui-brief.md
    - docs/voyage-build/progress.json
  must_contain:
    - path: docs/hardening/log.md
      pattern: "^### /linkedin:post"

Step 3: Harden Create II (visual + batch)

Files: commands/carousel.md, commands/video.md, commands/batch.md, docs/hardening/log.md
Changes: Apply the method to carousel (spot-confirm the S15-B3 full-deck clipboard still holds; differentiation-checker; mcp-image graceful degradation to text-only), video (video-scripter + differentiation-checker; spot-confirm the S17 F-VIDEO disposition (deliberate aspect-ratio decision) before relying on it; caption guidance; graceful degradation), batch (content-planner + trend-spotter; pillar-rotation across the week). Focus axis (d) graceful degradation when mcp-image/tools are unavailable.
Reuses: hardening method; references/linkedin-formats.md, references/video-strategy-guide.md; the S17 F-VIDEO disposition (spot-confirmed, not assumed).
Verify:
- bash scripts/test-runner.sh; echo "exit=$?" → Failed: 0 + exit=0 + counts unchanged.
- for c in carousel video batch; do grep -qE "^### /linkedin:$c" docs/hardening/log.md || echo "MISSING $c"; done → no output.
On failure: revert — git checkout -- commands/carousel.md commands/video.md commands/batch.md (keep log.md); if /trekreview ≠ ALLOW, HALT.
Checkpoint: git commit -m "fix(linkedin-studio): S3 hardening — Create II (carousel, video, batch)"

Manifest:

manifest:
  expected_paths:
    - docs/hardening/log.md
  min_file_count: 1
  commit_message_pattern: "^fix\\(linkedin-studio\\): S3 hardening"
  bash_syntax_check: []
  forbidden_paths:
    - docs/linkedin-studio-persona-brief.md
    - docs/linkedin-studio-ui-brief.md
    - docs/voyage-build/progress.json
  must_contain:
    - path: docs/hardening/log.md
      pattern: "^### /linkedin:carousel"

Step 4: Harden Create III (long-form cluster)

Files: commands/newsletter.md, commands/headless-review.md, commands/pivot.md, docs/hardening/log.md
Changes: Apply the method to the long-form cluster. newsletter = orchestration only (the 16-phase gate sequence + per-gate agents were heavily reviewed in remediation; simulate the operator's path THROUGH the phases — banner, phase transitions, lock/visual-assets/headless gates — NOT a re-review of each gate agent; spot-confirm overlap-measurement.md exists before citing "do not re-measure"). headless-review — wiring is CONFIRMED present (headless-review.md:77-79,141-142, --type → subagent_type table); the step VERIFIES that table resolves to real agents (mechanical axis-c) and simulates the consolidated-report output — no wiring fix expected. pivot (re-open-gates heuristic). May split into its own session if newsletter's orchestration simulation is large — decide at session start, log the split explicitly (not a silent cap).
Reuses: hardening method; references/newsletter-strategy-guide.md; references/longform-quality-rules.md; docs/remediation/overlap-measurement.md (spot-confirmed).
Verify:
- bash scripts/test-runner.sh; echo "exit=$?" → Failed: 0 + exit=0 + counts unchanged.
- for c in newsletter headless-review pivot; do grep -qE "^### /linkedin:$c" docs/hardening/log.md || echo "MISSING $c"; done → no output.
- grep -qE "linkedin-studio:(content|language|fact)-reviewer" commands/headless-review.md → present (wiring confirmed).
On failure: revert — git checkout -- commands/newsletter.md commands/headless-review.md commands/pivot.md (keep log.md); if /trekreview ≠ ALLOW, HALT.
Checkpoint: git commit -m "fix(linkedin-studio): S4 hardening — Create III long-form (newsletter, headless-review, pivot)"

Manifest:

manifest:
  expected_paths:
    - docs/hardening/log.md
  min_file_count: 1
  commit_message_pattern: "^fix\\(linkedin-studio\\): S4 hardening"
  bash_syntax_check: []
  forbidden_paths:
    - docs/linkedin-studio-persona-brief.md
    - docs/linkedin-studio-ui-brief.md
    - docs/voyage-build/progress.json
  must_contain:
    - path: docs/hardening/log.md
      pattern: "^### /linkedin:newsletter"

Step 5: Harden the Engage journey

Files: commands/firsthour.md, commands/calendar.md, commands/pipeline.md, docs/hardening/log.md
Changes: Apply the method to firsthour (engagement-coach + post-feedback-monitor; recordFirstHourPlan state path), calendar (queue view + publish action; spot-confirm the auto-publish honesty boundary holds; state mutation), pipeline (content-planner + trend-spotter). Focus axis (c) guided/stateful predicate + axis (d) state-mutation + honesty boundaries.
Reuses: hardening method; hooks/scripts/state-updater.mjs (recordFirstHourPlan, queue); the remediation scheduling-boundary wording (spot-confirmed).
Verify:
- bash scripts/test-runner.sh; echo "exit=$?" → Failed: 0 + exit=0 + counts unchanged.
- node --test hooks/scripts/__tests__/*.test.mjs → Failed: 0 (pass 98) if any hook touched.
- for c in firsthour calendar pipeline; do grep -qE "^### /linkedin:$c" docs/hardening/log.md || echo "MISSING $c"; done → no output.
On failure: revert — git checkout -- commands/firsthour.md commands/calendar.md commands/pipeline.md (keep log.md); if /trekreview ≠ ALLOW, HALT.
Checkpoint: git commit -m "fix(linkedin-studio): S5 hardening — Engage journey (firsthour, calendar, pipeline)"

Manifest:

manifest:
  expected_paths:
    - docs/hardening/log.md
  min_file_count: 1
  commit_message_pattern: "^fix\\(linkedin-studio\\): S5 hardening"
  bash_syntax_check: []
  forbidden_paths:
    - docs/linkedin-studio-persona-brief.md
    - docs/linkedin-studio-ui-brief.md
    - docs/voyage-build/progress.json
  must_contain:
    - path: docs/hardening/log.md
      pattern: "^### /linkedin:firsthour"

Step 6: Harden the Measure journey

Files: commands/import.md, commands/report.md, commands/analyze.md, commands/audit.md, commands/ab-test.md, docs/hardening/log.md
Changes: Apply the method to import (analytics CLI graceful degradation — spot-confirm the remediation CWD/npm install fix holds; the S16 saves manual-entry path), report/analyze (analytics-interpreter; saves + directional framing), audit (quarterly strategy audit; type-c = guided/stateful predicate), ab-test (directional-not-significant framing; content-optimizer). Focus axis (c) analytics graceful-degradation + (a) directional-A/B + saves honesty.
Reuses: hardening method; scripts/analytics/; the remediation's saves/dwell honesty wording; the S16 parseOptionalCount() saves path (spot-confirmed).
Verify:
- bash scripts/test-runner.sh; echo "exit=$?" → Failed: 0 + exit=0 + counts unchanged.
- if analytics touched: cd scripts/analytics && npm test → Failed: 0 (pass 116).
- for c in import report analyze audit ab-test; do grep -qE "^### /linkedin:$c" docs/hardening/log.md || echo "MISSING $c"; done → no output.
On failure: revert — git checkout -- commands/import.md commands/report.md commands/analyze.md commands/audit.md commands/ab-test.md (keep log.md); if /trekreview ≠ ALLOW, HALT.
Checkpoint: git commit -m "fix(linkedin-studio): S6 hardening — Measure journey (import, report, analyze, audit, ab-test)"

Manifest:

manifest:
  expected_paths:
    - docs/hardening/log.md
  min_file_count: 1
  commit_message_pattern: "^fix\\(linkedin-studio\\): S6 hardening"
  bash_syntax_check: []
  forbidden_paths:
    - docs/linkedin-studio-persona-brief.md
    - docs/linkedin-studio-ui-brief.md
    - docs/voyage-build/progress.json
  must_contain:
    - path: docs/hardening/log.md
      pattern: "^### /linkedin:ab-test"

Step 7: Harden the Grow journey

Files: commands/strategy.md, commands/competitive.md, commands/monetize.md, commands/outreach.md, commands/profile.md, docs/hardening/log.md
Changes: Apply the method to strategy (phase guidance/trajectory; strategy-advisor), competitive (niche analysis; type-c guided predicate; remediation un-gated it), monetize (~1K soft-gating; readiness scoring), outreach (collab + speaking; network-builder; tracked-pipeline via recordOutreachContact), profile (profile-SEO surface from remediation). Focus (a) intention for the ~1K-gated commands (soft-gate guidance, not dead-ends) + (c)/(d) tracked-pipeline state.
Reuses: hardening method; the remediation's profile-SEO additions, outreach pipeline tracker (state-updater.mjs), ~1K soft-gating rule.
Verify:
- bash scripts/test-runner.sh; echo "exit=$?" → Failed: 0 + exit=0 + counts unchanged.
- for c in strategy competitive monetize outreach profile; do grep -qE "^### /linkedin:$c" docs/hardening/log.md || echo "MISSING $c"; done → no output.
On failure: revert — git checkout -- commands/strategy.md commands/competitive.md commands/monetize.md commands/outreach.md commands/profile.md (keep log.md); if /trekreview ≠ ALLOW, HALT.
Checkpoint: git commit -m "fix(linkedin-studio): S7 hardening — Grow journey (strategy, competitive, monetize, outreach, profile)"

Manifest:

manifest:
  expected_paths:
    - docs/hardening/log.md
  min_file_count: 1
  commit_message_pattern: "^fix\\(linkedin-studio\\): S7 hardening"
  bash_syntax_check: []
  forbidden_paths:
    - docs/linkedin-studio-persona-brief.md
    - docs/linkedin-studio-ui-brief.md
    - docs/voyage-build/progress.json
  must_contain:
    - path: docs/hardening/log.md
      pattern: "^### /linkedin:outreach"

Step 8: Harden the front-doors + router, and close the phase

Files: commands/create.md, commands/measure.md, commands/linkedin.md, docs/hardening/log.md
Changes: Apply the method to the routing surfaces: create (Create front-door — simulate "what should I make?" and confirm axis-c routing predicate: every target it emits resolves to a real, now-hardened creation command), measure (Measure front-door — routes to the correct analytics command), linkedin (the five-journey router — every journey + command listed and reachable, no stale entries). Brand-consistency judgment (from plan-critic #5): evaluate commands/linkedin.md:4,16 ("LinkedIn thought leadership assistant/commands") — these are domain descriptors (S17 ruled them not-stale plugin-name misnomers), but the router is the brand front-door, so harden to "LinkedIn Studio" phrasing IF it improves intention clarity (a hardening judgment, not forced). Phase close: confirm log.md has all 29 anchored entries (SC-F); decide the optional v4.2.0 bump (if taken, it is the ONE feat-gated, three-doc, version-synced commit of the phase — run the full version grep — NOT a fix:).
Reuses: hardening method; docs/remediation/journey-layer-design.md (the five-journey design); the hardened commands S1–S7 as routing targets.
Verify:
- bash scripts/test-runner.sh; echo "exit=$?" → Failed: 0 + exit=0 + counts unchanged.
- SC-F coverage (anchored, closes blockers #1/#2): for c in $(ls commands/*.md | xargs -n1 basename | sed 's/.md//'); do grep -qE "^### /linkedin:$c( |—|$)" docs/hardening/log.md || echo "MISSING: $c"; done → no output (all 29 anchored entries present).
- SC-G stale-brand (broadened, closes #5): grep -rn "thought leadership plugin" commands/ skills/ references/ README.md → 0 (the plugin-NAME misnomer; domain phrases are allowed).
On failure: revert — git checkout -- commands/create.md commands/measure.md commands/linkedin.md (keep log.md); escalate if SC-F coverage is incomplete.
Checkpoint: git commit -m "fix(linkedin-studio): S8 hardening — front-doors + router; phase complete"

Manifest:

manifest:
  expected_paths:
    - docs/hardening/log.md
  min_file_count: 1
  commit_message_pattern: "^fix\\(linkedin-studio\\): S8 hardening"
  bash_syntax_check: []
  forbidden_paths:
    - docs/linkedin-studio-persona-brief.md
    - docs/linkedin-studio-ui-brief.md
    - docs/voyage-build/progress.json
  must_contain:
    - path: docs/hardening/log.md
      pattern: "^### /linkedin:linkedin"

Alternatives Considered

Approach	Pros	Cons	Why rejected
One Voyage session per command (29 sessions)	Deepest per command; granular checkpoints	~29 sessions; high startup overhead; uneven	Operator locked "per journey, large groups split" — 8 sessions
Spec-audit only (no simulation)	Faster, cheaper	Misses output-quality gaps — the phase's whole reason	Operator locked "hybrid: run + spec-audit"
Allow structural redesign (merge/split commands)	Could fix root-cause structure	Churn right after remediation; rocks the count/journey invariants; re-opens 14a	Operator locked "intention-fidelity + prompt-quality, no structural"
Full exploration swarm (6–8 agents) in planning	Canonical /trekplan rigor	Re-maps an already-fully-mapped, frozen codebase — wasted spend	Scaled down per the "adaptive / don't over-spawn" rule; kept the high-value brief-review + plan-critic + scope-guardian gates
Oracle = "≥1 command/session spot-check" (original draft)	Cheaper review	Leaves 80%+ of a session self-graded — does not break the loop	Replaced (plan-critic #3) by "reviewer adjudicates EVERY hardened command's before/after"

Test Strategy

Framework: no unit tests are written by this phase (it edits Markdown command prompts). The "test" per command is the simulation + per-type mechanical predicate + the concrete before/after output pair, recorded in docs/hardening/log.md and adjudicated by the cold /trekreview reviewer (the independent oracle). The existing suites (test-runner.sh, hooks node --test, analytics npm test) are regression guards, gated on Failed: 0 (not a literal pass-count).
Existing patterns: evidence-per-row disposition records (S17 c13-c46-triage.md).
New tests in this plan: 0 new automated tests; the concrete before/after pair + per-type mechanical predicate is the falsifiable evidence, adjudicated independently.

Tests to write

Type	File	Verifies	Model
Evidence record	`docs/hardening/log.md`	per-command INTENT→VERIFY + concrete before/after + mechanical predicate; unique anchor per entry	`docs/remediation/c13-c46-triage.md`
Regression	(existing) `scripts/test-runner.sh`	counts 29/19/25/6 + structural guards; gate on `Failed: 0` + exit 0	—
Regression	(existing) hooks/analytics `node --test`	`Failed: 0` (98 / 116) when those surfaces touched	—

Risks and Mitigations

Priority	Risk	Location	Impact	Mitigation
High	Self-graded quality axis (brief-review #1, plan-critic #3/#12)	method 3+5	"hardened" claimed without provable improvement	Concrete (reproducible) before/after output + per-type mechanical predicate + `/trekreview` reviewer adjudicates every hardened command's before/after (not a sample)
High	Coverage gate false-positive (plan-critic #1/#2)	SC-F + per-step greps	a dropped command reads as covered	Unique anchored log header `^### /linkedin:<name>`; all greps anchored, never bare-word
Medium	Gold-plating / unbounded depth (brief-review #3)	method 4	token blowup; uneven coverage	Stopping rule: axes pass-or-defer; no NICE polish; per-session split logged if scope balloons
Medium	SC-H vs deferral contradiction (brief-review #2)	brief SC-H	gate unsatisfiable, or "open finding" redefined	SC-H = "0 un-triaged findings; each fixed or recorded-deferral; `/trekreview` ALLOW with deferrals noted"
Medium	`Passed: 74` brittle under `set -e` (plan-critic #7, scope-guardian caveat)	every Verify	spurious gate failure on any added check	Gate on `Failed: 0` + exit 0 + unchanged `EXPECT_*` counts; `74` is incidental
Medium	No circuit-breaker on the S1 method gate (plan-critic #9)	S1 On-failure	headless run proceeds with an unsound method	S1 On-failure = HALT; S1→S2..S8 is a hard dependency; any non-ALLOW `/trekreview` halts the phase
Medium	Mechanical axis absent on ~13 no-output commands (plan-critic #4)	method 3(c)	the non-circular axis evaporates → pure judgment	Per-type mechanical predicate defined for every command class (routing/analytics/guided), never "N/A → judgment"
Low	Simulation ≠ live run	method 2	a gap only real testing surfaces is missed	Field-notes inbox: operator's live findings outrank simulated; checked before each journey; SC-I
Low	Prior-disposition premise misremembered (plan-critic #11)	S1/S3/S4/S6	an axis skipped on a false premise	Spot-confirm each cited disposition with one grep/read before relying
Low	Stale-brand grep mis-scoped (plan-critic #5)	SC-G	a plugin-name misnomer in commands/ slips	SC-G greps commands/+skills/+references/+README; S8 also judges the router's domain phrasing
Low	Touching a not-mine untracked file	repo `docs/`	accidental commit of operator/UI briefs	`forbidden_paths` in every manifest + explicit staging (own files only)
Low	Optional v4.2.0 bump fumbled as `fix:` (brief-review #4)	S8 phase-close	version-skew	If taken, the bump is the ONE feat-gated three-doc version-synced commit; run the full version grep

Assumptions

#	Assumption	Why unverifiable	Impact if wrong
1	`quick` is the right S1 calibration command	operator preference, not yet confirmed	trivial — swap at S1 start
2	SC-H means "0 un-triaged findings", deferrals allowed	reconciling brief SC-H with SC-D	a legitimately-deferred finding would otherwise fail the gate
3	`newsletter` orchestration fits S4 with headless-review + pivot	depends on simulation size	S4 splits into its own session (logged) — permitted
4	~~headless-review wiring may be broken~~ RESOLVED — wiring confirmed present (`headless-review.md:77-79,141-142`)	n/a	n/a — S4 verifies the table resolves, no fix expected
5	Version stays v4.1.0 through the phase; optional v4.2.0 at phase end	phase-end decision	none — pre-committed as optional

Verification

Per-step manifest verification runs automatically during execution. End-to-end / phase-level checks:

SC-F (coverage, anchored): for c in $(ls commands/*.md | xargs -n1 basename | sed 's/.md//'); do grep -qE "^### /linkedin:$c( |—|$)" docs/hardening/log.md || echo "MISSING $c"; done → no output (all 29 anchored entries; no substring false-positives).
SC-A…SC-E (per command): each anchored log.md entry has intention + simulation (persona + CONCRETE before-output) + 4-axis evaluation + (if hardened) concrete after-output + verify line.
SC-C (mechanical, hard, per type): every command records a pass/fail on its type's mechanical predicate (post-emitting → 4 content rules; routing → targets resolve; analytics → graceful-degradation present; guided → primary artifact produced) — never "N/A → judgment".
SC-G (no structural drift): ls commands/*.md | wc -l → 29 every session; bash scripts/test-runner.sh → Failed: 0 + exit 0 + EXPECT_* counts unchanged; grep -rn "thought leadership plugin" commands/ skills/ references/ README.md → 0.
SC-H (clean gate, reconciled): each session's review.md shows 0 un-triaged findings (each fixed in-session or carrying a recorded deferral rationale) and /trekreview returns ALLOW (deferrals noted, not WARN-override).
Oracle (independent): each session's /trekreview adjudicated the before/after of every command hardened that session (recorded in review.md) — not a sample.
SC-I (field-notes): where docs/hardening/field-notes.local.md has a finding for a session's commands, the log addresses or triages it; absent file → graceful no-op (does not block; the inbox is operator-owned).
Regression: node --test hooks/scripts/__tests__/*.test.mjs → Failed: 0 and cd scripts/analytics && npm test → Failed: 0 whenever those surfaces are touched.

Estimated Scope

Files to modify: up to 29 command .md files (only where an axis gaps — some pass clean with no edit) + occasional agent/reference edits where intention requires.
Files to create: 1 (docs/hardening/log.md); field-notes.local.md is operator-created (gitignored).
Complexity: medium — low risk per edit (surgical prompt changes, no code/structure), broad (29 surfaces) and judgment-heavy, de-risked by the per-type mechanical predicate + the full-session cold-reviewer oracle.

Execution Strategy

8 sessions, sequential (each depends on the method locked in S1 and shares the single log.md append target → not parallelizable without log-merge conflicts). Run one per Voyage session via /trekcontinue. Session-grouping note (plan-critic #6): the brief's session table was explicitly "to be confirmed/refined by /trekplan"; this plan refines it — multiplatform→S2, batch→S3, headless-review+pivot→S4 — with full coverage preserved (29, none dropped/doubled). quick is complete after S1.

Session 1: Method calibration + Start

Steps: 1 · Wave: 1 · Depends on: none
Scope fence: Touch: commands/{quick,onboarding,first-post,setup}.md, docs/hardening/log.md. Never touch: other commands, the 3 not-mine untracked files.
Circuit-breaker: if S1 calibration fails, HALT — no later session runs until the method is re-locked with the operator.

Session 2: Create I (atomic short-form)

Steps: 2 · Wave: 2 · Depends on: Session 1 (hard gate — method locked)
Scope fence: Touch: commands/{post,react,multiplatform}.md, log.md. Never touch: other commands.

Session 3: Create II (visual + batch)

Steps: 3 · Wave: 3 · Depends on: Session 1
Scope fence: Touch: commands/{carousel,video,batch}.md, log.md. Never touch: other commands.

Session 4: Create III (long-form cluster)

Steps: 4 · Wave: 4 · Depends on: Session 1
Scope fence: Touch: commands/{newsletter,headless-review,pivot}.md, log.md. Never touch: other commands. (May split 4a/4b if newsletter orchestration is large — logged.)

Session 5: Engage

Steps: 5 · Wave: 5 · Depends on: Session 1
Scope fence: Touch: commands/{firsthour,calendar,pipeline}.md, log.md. Never touch: other commands.

Session 6: Measure

Steps: 6 · Wave: 6 · Depends on: Session 1
Scope fence: Touch: commands/{import,report,analyze,audit,ab-test}.md, log.md. Never touch: other commands.

Session 7: Grow

Steps: 7 · Wave: 7 · Depends on: Session 1
Scope fence: Touch: commands/{strategy,competitive,monetize,outreach,profile}.md, log.md. Never touch: other commands.

Session 8: Front-doors + router (phase close)

Steps: 8 · Wave: 8 · Depends on: Sessions 1–7 (routing targets must be hardened first)
Scope fence: Touch: commands/{create,measure,linkedin}.md, log.md. Never touch: other commands.

Execution Order

Wave 1: Session 1 (calibration — hard gate; HALT-on-failure)
Waves 2–7: Sessions 2–7 (each depends only on S1; sequential because they share log.md)
Wave 8: Session 8 (after all targets hardened)

Grouping rules applied

Unit = one command; grouped by journey per the locked cadence.
Create (8 cmd) split across S2–S4; small journeys = one session each.
Sessions share the single log.md append target → sequential, not parallel.
Session 8 (routers) last — depends on its targets being hardened.

Plan Quality Score

Recomputed post-revision (the pre-review self-score was struck per plan-critic #17).

Dimension	Weight	Score	Notes
Structural integrity	0.15	90	8 steps dependency-ordered; S1 now a HALT-enforced gate; brief-vs-plan reassignment justified
Step quality	0.20	85	per-command 5-step method; VERIFY now backed by an independent oracle, not self-re-sim; headless-review resolved
Coverage completeness	0.20	90	anchored coverage predicate (no substring false-positives); all 29 mapped, none dropped/doubled
Specification quality	0.15	86	concrete files/verify; gate on `Failed: 0` not literal 74; stale-brand grep broadened; prior dispositions spot-confirmed
Risk & pre-mortem	0.15	88	brief-review + plan-critic blockers/majors folded with mitigations
Headless readiness	0.10	85	On-failure + Checkpoint per step; S1 HALT circuit-breaker; non-ALLOW halts the phase
Manifest quality	0.05	78	manifests anchor on the unique-anchored log entry; edit-correctness explicitly delegated to `/trekreview` (honest separation)
Weighted total	1.00	87	Grade: B+

Adversarial review:

Plan critic: REVISE — 2 blockers + 10 major + 5 minor (Grade C, 66). All blockers + majors addressed in this revision; see Revisions. (1 claim, #8 headless-review, found RESOLVED on verification; #5 found overstated, scope broadened anyway.)
Scope guardian: ALIGNED — 0 creep, 0 material gaps; the Failed: 0-not-74 caveat is adopted.

Revisions

Added by adversarial review (Phase 9) — plan-critic REVISE + scope-guardian caveat.

#	Finding	Severity	Resolution
1	SC-F coverage grep matches command names as substrings (dropped command reads as covered)	blocker	Unique anchored log header `### /linkedin:<name> —`; all coverage/per-step greps anchored `^### /linkedin:$c`
2	`measure`/`create` collide with the Measure/Create journey labels	blocker	Same anchored-header fix; `:measure` ≠ the word "Measure"
3	Oracle samples ≥1/session of author-written sketches (80%+ self-graded)	major	Oracle now adjudicates every hardened command's before/after per session; simulation captures CONCRETE reproducible output, not a sketch
4	Mechanical axis inapplicable to ~13 no-output commands → pure judgment	major	Per-type mechanical predicate defined for every command class (routing/analytics/guided); never "N/A → judgment"
5	Stale-brand grep scoped to skills/ only	major→adjusted	Verified: `"thought leadership plugin"` is absent everywhere; grep broadened to commands/+skills/+references/+README; S8 also judges the router's domain phrasing
6	Brief-vs-plan session reassignment unjustified	major	Explicit refinement note (brief table was "to be confirmed by /trekplan"); coverage preserved
7	`Passed: 74` brittle under `set -e`; 74 hardcoded	major	All Verify/gate predicates → `Failed: 0` + exit 0 + unchanged `EXPECT_*` counts
8	headless-review wiring assumed-broken-then-deferred	major→resolved	Verified wiring is PRESENT (`--type → subagent_type` table, lines 77-79/141-142); S4 verifies, no fix expected; Assumption #4 struck
9	No circuit-breaker; S1 failure doesn't halt S2–S8	major	S1 On-failure = HALT; S1→S2..S8 hard dependency; any non-ALLOW `/trekreview` halts the phase
10	Manifests anchor on log.md; hardened command file invisible	major	Documented honest separation: manifest proves processing (unique anchor); `/trekreview` proves the edit (reads real diffs); a clean-pass command may have 0 edits
11	Prior dispositions cited but unverified ("do not re-verify")	major	Spot-confirm each cited disposition with one grep/read before relying
12	SC-E re-simulation is circular	major	Same as #3 — independent oracle + concrete captured output
13	node --test counts hardcoded/conditional	minor	Framed as `Failed: 0` (counts incidental)
16	`quick` double-count (S1 + brief's S2 spillover)	minor	Explicit "quick complete after S1; not re-hardened in S2" note
17	Plan pre-assigned Grade A before review	minor	Score recomputed post-revision (B+ 87); header note added

43 KiB Raw Blame History Unescape Escape

LinkedIn Studio — Command Hardening Plan

Context

Architecture Diagram

Codebase Analysis

Implementation Plan

Step 1: Method calibration on quick, then harden the Start journey

Step 2: Harden Create I (atomic short-form)

Step 3: Harden Create II (visual + batch)

Step 4: Harden Create III (long-form cluster)

Step 5: Harden the Engage journey

Step 6: Harden the Measure journey

Step 7: Harden the Grow journey

Step 8: Harden the front-doors + router, and close the phase

Alternatives Considered

Test Strategy

Tests to write

Risks and Mitigations

Assumptions

Verification

Estimated Scope

Execution Strategy

Session 1: Method calibration + Start

Session 2: Create I (atomic short-form)

Session 3: Create II (visual + batch)

Session 4: Create III (long-form cluster)

Session 5: Engage

Session 6: Measure

Session 7: Grow

Session 8: Front-doors + router (phase close)

Execution Order

Grouping rules applied

Plan Quality Score

Revisions

43 KiB

Raw Blame History

Step 1: Method calibration on `quick`, then harden the Start journey