ktg-plugin-marketplace/plugins/linkedin-studio/scripts/test-runner.sh
Kjell Tore Guttormsen baca30feb1 feat(linkedin-studio): S14 — journey layer (create/measure front-doors + 5-journey router), v4.1.0
14a's cold command-rationalization found ZERO redundancy across the 27 commands
(no defensible merge/cut), so the operator reframed S14 from "merge/cut" to
"add a journey layer over the kept atomics".

- Add /linkedin:create + /linkedin:measure — delegate-only guided front-doors
  (Read/Glob/AskUserQuestion only; route to the command that owns the work)
- Re-tier commands/linkedin.md into 5 journeys (Start/Create/Engage/Measure/Grow);
  onboarding/strategy elevated as Start/Grow front-doors; Engage = calendar+firsthour tier
- 14a honesty nits: router now lists firsthour; calendar cross-links to firsthour;
  competitive confirmed UNGATED (the claimed 1K-gating inconsistency was unfounded)
- Lockstep: EXPECT_COMMANDS 27->29, v4.0.0->4.1.0 across plugin.json / README badges /
  plugin+root CLAUDE.md / README / CHANGELOG; new README commands-badge lint guard
- 14a deliverable corrected: multiplatform entry added (26/27 -> 27/27), header counts,
  competitive nit withdrawn
- Independent /trekreview (2 Opus reviewers) raised 3 MAJORs (stale commands badge,
  router competitive self-contradiction, STATE.md count) — ALL remediated in-session;
  verdict ALLOW

Gate: test-runner.sh 74/0/0; node --test 98/98; commands=29; v4.1.0 consistent.
Additive/minor — no command removed/renamed/behavior-changed; reload registers create+measure.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30 21:27:06 +02:00

476 lines
20 KiB
Bash
Executable file

#!/bin/bash
# LinkedIn Studio Plugin — Structure Validator
# Validates the REAL v3.1 layout: registration counts (derived dynamically),
# frontmatter shape, hook drift, and plugin.json fields. Counts are asserted
# against the declared contract below, which is kept in sync with the
# CLAUDE.md "## Agents (N)" / "## Commands (N)" headers (cross-checked here)
# and the STATE.md "Telling" block. Adding or removing an agent, command,
# reference, or skill breaks the count-equality and fails the lint — this is
# the registration guard that gates the remediation plan's later steps.
#
# The stat-consistency grep (one magnitude per algorithm effect across the
# tree) was added in remediation Step 3; the version-consistency grep in
# Step 21; the agent model-consistency guard (each agents/<name>.md frontmatter
# model: must match every surface declaration, and canonical rosters must list
# every agent) in S11; the render-chain propagation guard (no honesty pattern a
# command was cleaned of survives in the reference it renders from) in S12; the
# `$`-safety guard (no untrusted value reaches a String.replace replacement STRING
# in state-updater.mjs — proven behaviorally, coverage-complete, self-testing) in
# S13. All five are live below (Sections 8, 9, 10, 11 and 12).
#
# Usage: bash scripts/test-runner.sh
# bash 3.2-safe: plain arrays only, no `declare -A`, no `mapfile`/`readarray`.
set -e
PLUGIN_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
cd "$PLUGIN_ROOT"
PASS=0
FAIL=0
WARN=0
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[0;33m'
NC='\033[0m' # No Color
pass() { echo -e "${GREEN}${NC} $1"; PASS=$((PASS + 1)); }
fail() { echo -e "${RED}${NC} $1"; FAIL=$((FAIL + 1)); }
warn() { echo -e "${YELLOW}${NC} $1"; WARN=$((WARN + 1)); }
# --- Declared registration contract (the "Telling" block) ---
# Source of truth: CLAUDE.md headers + STATE.md Telling. Bump these together
# with the files when adding/removing an agent, command, reference, or skill.
EXPECT_AGENTS=19
EXPECT_COMMANDS=29
EXPECT_REFS=25
EXPECT_SKILLS=6
echo "================================================"
echo "LinkedIn Studio Plugin — Structure Validator"
echo "Plugin root: $PLUGIN_ROOT"
echo "================================================"
echo ""
# --- Section 1: Core Files ---
echo "--- Core Files ---"
for f in ".claude-plugin/plugin.json" "CLAUDE.md" "CHANGELOG.md" "README.md" "config/REMEMBER.template.md"; do
if [ -f "$f" ]; then
pass "$f exists"
else
fail "$f MISSING"
fi
done
echo ""
# --- Section 2: Registration Counts (dynamic) ---
echo "--- Registration Counts ---"
AGENTS=$(ls agents/*.md 2>/dev/null | wc -l | tr -d ' ')
COMMANDS=$(ls commands/*.md 2>/dev/null | wc -l | tr -d ' ')
REFS=$(ls references/*.md 2>/dev/null | wc -l | tr -d ' ')
SKILLS=$(ls skills/*/SKILL.md 2>/dev/null | wc -l | tr -d ' ')
assert_count() {
# $1 label, $2 actual, $3 expected
if [ "$2" -eq "$3" ]; then
pass "$1: $2 (expected $3)"
else
fail "$1: $2 (expected $3) — registration drift"
fi
}
assert_count "agents/*.md" "$AGENTS" "$EXPECT_AGENTS"
assert_count "commands/*.md" "$COMMANDS" "$EXPECT_COMMANDS"
assert_count "references/*.md" "$REFS" "$EXPECT_REFS"
assert_count "skills/*/SKILL.md" "$SKILLS" "$EXPECT_SKILLS"
# Cross-check the CLAUDE.md declared headers against the contract (doc-drift guard)
DOC_AGENTS=$(grep -oE '^## Agents \([0-9]+\)' CLAUDE.md | grep -oE '[0-9]+' | head -1)
DOC_COMMANDS=$(grep -oE '^## Commands \([0-9]+\)' CLAUDE.md | grep -oE '[0-9]+' | head -1)
if [ "$DOC_AGENTS" = "$EXPECT_AGENTS" ]; then
pass "CLAUDE.md '## Agents ($DOC_AGENTS)' matches contract"
else
fail "CLAUDE.md agents header ($DOC_AGENTS) != contract ($EXPECT_AGENTS)"
fi
if [ "$DOC_COMMANDS" = "$EXPECT_COMMANDS" ]; then
pass "CLAUDE.md '## Commands ($DOC_COMMANDS)' matches contract"
else
fail "CLAUDE.md commands header ($DOC_COMMANDS) != contract ($EXPECT_COMMANDS)"
fi
# README shields commands-count badge must match the contract too. Added after an
# S14 /trekreview found the badge stale at commands-27 while the surface shipped 29:
# the version-consistency grep (Section 9) checks only the version badge, and the
# count guards above check the CLAUDE.md header, so the README count badge slipped
# both. This closes that gap (the count-badge analogue of the version-badge check).
if grep -q "badge/commands-${EXPECT_COMMANDS}-" README.md; then
pass "README commands badge declares ${EXPECT_COMMANDS}"
else
fail "README commands badge != ${EXPECT_COMMANDS} (expected shields badge/commands-${EXPECT_COMMANDS}-)"
fi
echo ""
# --- Section 3: Agent Frontmatter ---
echo "--- Agent Frontmatter ---"
for f in agents/*.md; do
if head -1 "$f" | grep -q "^---"; then
if grep -q "^name:" "$f" && grep -q "^description:" "$f"; then
pass "$f (frontmatter OK)"
else
fail "$f (missing name:/description:)"
fi
else
fail "$f (no YAML frontmatter)"
fi
done
echo ""
# --- Section 4: Command Frontmatter ---
echo "--- Command Frontmatter ---"
for f in commands/*.md; do
if head -1 "$f" | grep -q "^---"; then
if grep -q "^name:" "$f" && grep -q "^description:" "$f"; then
pass "$f (frontmatter OK)"
else
fail "$f (missing name:/description:)"
fi
else
fail "$f (no YAML frontmatter)"
fi
done
echo ""
# --- Section 5: Hook Configuration (drift) ---
echo "--- Hook Configuration ---"
if [ -f "hooks/hooks.json" ]; then
pass "hooks/hooks.json exists"
if python3 hooks/scripts/compile-hooks.py --check >/dev/null 2>&1; then
pass "hooks.json matches compiled template (no drift)"
else
fail "hooks.json DRIFT — run: python3 hooks/scripts/compile-hooks.py"
fi
else
fail "hooks/hooks.json MISSING"
fi
echo ""
# --- Section 6: Plugin.json Validation ---
echo "--- Plugin.json Validation ---"
if python3 -c "
import json, sys
with open('.claude-plugin/plugin.json') as f:
data = json.load(f)
required = ['name', 'version', 'description']
missing = [field for field in required if field not in data]
if missing:
print('Missing fields:', missing)
sys.exit(1)
print('Version:', data['version'])
" 2>/dev/null; then
pass "plugin.json structure valid (name/version/description)"
else
fail "plugin.json structure invalid"
fi
echo ""
# --- Section 7: Analytics Source ---
echo "--- Analytics Source ---"
if [ -f "scripts/analytics/src/cli.ts" ]; then
pass "scripts/analytics/src/cli.ts exists"
else
fail "scripts/analytics/src/cli.ts MISSING"
fi
echo ""
# --- Section 8: Algorithm-Stat Consistency ---
echo "--- Algorithm-Stat Consistency ---"
# The single source of truth for algorithm magnitudes is
# references/algorithm-signals-reference.md. After the Phase-0 reconciliation,
# stale/competing magnitudes — the retired engagement-coefficient folklore, the
# unpublishable model params/brand, and the deployment date — must not reappear
# anywhere else (cite the reference, do not restate). This enforces "one magnitude
# per algorithm effect" by forbidding EVERY retired-class value from returning, so
# the same grep that defines the Phase-0 Success Criterion fails on any survivor.
#
# S9 rebuild: the S8 list forbade only the two S7-named strings and went green
# over six more survivors (the coefficient system in analytics-interpreter/
# content-optimizer/pipeline/glossary, the playbook 15x/5x, the 150B model). This
# list is rebuilt to the FULL criterion. Forbidden classes (each maps to a
# canonical statement in the reference):
# - Carousel-rate folklore: 6.6% / 6.60% / 1.92% → reference: "~7% top format"
# - Link-penalty folklore: 40-50% / 25-40% / -40-60% → reference: one ~38% correlational band
# - Comment-multiplier folklore: "15x more reach/algorithmic", "5x more effective/
# less valuable/reach than" → reference: order only, comment ≈ 2x a like
# - Video-multiplier folklore: "5x more conversations" → reference: video declining, no multiplier
# - Engagement-coefficient system: 7-9x, 2.5x, 0.2x, (10x), (8x), "10x weight"
# → reference: "never hard coefficients to optimize against"
# - Model params/brand/date: the PATTERN CLASS [0-9]+[ -]?(B|billion)?[ -]?param
# (covers 150-parameter / 150B param / 150 billion param) / 360Brew / January 2026
# → reference: "Not publishable as fact"
#
# S10: the model-precision token is now the pattern CLASS, not a literal-token
# list. S9 forbade only "150 ?B param|150 billion param"; a hyphenated
# "150-parameter" (no "B") slipped both the discovery grep and the lint, surviving
# in glossary.md:10. The criterion is "no asserted model precision in ANY surface
# form", so the lint now enforces the shape (a number adjacent to "param"), not an
# enumeration. An adjacent digit is REQUIRED, so legitimate "param" uses with no
# leading number — "Language parameter", "parameterized", "different parameters",
# "«parametere»", "175-milliarders parametermodell" — do not match.
# Bare "10x"/"15x"/"5x" are deliberately NOT forbidden — they carry legitimate
# uses (collaboration "10x your reach" hyperbole, "5x5x5", posting cadence, pixel
# dims like 1080x1350), so each token targets the retired *phrasing*, not the bare
# number.
#
# Scope covers every dir the criterion's grep covers, including assets/checklists/
# (the 360Brew survivor lived there, outside the S8 scan), assets/templates/, and
# CHANGELOG.md (S10: the 360Brew/January-2026 survivor lived there, outside the S9
# scope). assets/{templates,checklists}/ — not all of assets/ — keeps the scan off
# gitignored runtime data (assets/analytics/, assets/drafts/, voice-samples/).
STALE_STATS='40-50%|25-40%|6\.6%|6\.60%|1\.92%|15x more reach|15x more algorithmic|5x more effective|5x less valuable|5x more reach than|5x more conversations|7-9x|2\.5x|0\.2x|\(10x\)|\(8x\)|10x weight|-40-60%|[0-9]+[ -]?(B|billion)?[ -]?param|360Brew|January 2026'
# Non-vacuity self-test (S10). A grep criterion is only meaningful if it actually
# MATCHES the forbidden forms and does NOT match legitimate ones. S7→S9 each
# shipped a lint that passed green while a survivor slipped, because the proof was
# run once by hand and never committed — so a hyphenated "150-parameter" form was
# never re-checked. This makes the proof PERMANENT: it runs on every invocation
# BEFORE the real scan, so narrowing STALE_STATS back to a literal-token list fails
# the suite instead of silently certifying an unenforced criterion. The positive
# set covers all three model-precision surface forms (incl. the exact S10
# "150-parameter" survivor); the negative set covers the legitimate "param"/x uses
# that live in the tree today.
SELFTEST_OK=1
while IFS= read -r probe; do
[ -z "$probe" ] && continue
if ! echo "$probe" | grep -qE "$STALE_STATS"; then
SELFTEST_OK=0; echo " non-vacuity FAIL: forbidden form not caught -> $probe"
fi
done <<'POSITIVE'
40-50% link penalty
6.6% carousel rate
1.92% reach
15x more reach
5x more conversations
7-9x weight
(10x) coefficient
10x weight
150-parameter foundation model
150B parameter foundation model
150 billion parameter model
360Brew
January 2026 algorithm update
POSITIVE
while IFS= read -r probe; do
[ -z "$probe" ] && continue
if echo "$probe" | grep -qE "$STALE_STATS"; then
SELFTEST_OK=0; echo " false-positive FAIL: legitimate form caught -> $probe"
fi
done <<'NEGATIVE'
5x5x5 pre-posting method
post 3x per week
1080x1350 pixels
10x your reach
Language parameter (configurable)
parameterized content-gatekeeper
Start over with different parameters
175-milliarders parametermodell
NEGATIVE
if [ "$SELFTEST_OK" -eq 1 ]; then
pass "STALE_STATS self-test: 13 forbidden forms caught, 8 legitimate forms ignored"
else
fail "STALE_STATS self-test failed — the lint no longer enforces the full criterion"
fi
STAT_HITS=$(grep -rnE "$STALE_STATS" references/ commands/ skills/ hooks/prompts/ agents/ assets/templates/ assets/checklists/ CLAUDE.md README.md CHANGELOG.md .claude-plugin/plugin.json 2>/dev/null | grep -v 'algorithm-signals-reference' || true)
if [ -z "$STAT_HITS" ]; then
pass "no stale algorithm magnitudes / model brand outside the canonical reference"
else
fail "stale algorithm stat(s) reintroduced — cite algorithm-signals-reference.md instead:"
echo "$STAT_HITS"
fi
echo ""
# --- Section 9: Version Consistency ---
echo "--- Version Consistency ---"
# Single source of truth for the plugin version: .claude-plugin/plugin.json.
# Its value must be declared identically in the README badge, the CLAUDE.md
# header, and the CHANGELOG top entry. Historical references to older versions
# (CHANGELOG history, the README version-history table, "vX added Y" prose) are
# NOT checked here — only the current-version DECLARATIONS must agree.
VERSION=$(python3 -c "import json; print(json.load(open('.claude-plugin/plugin.json'))['version'])" 2>/dev/null)
if [ -z "$VERSION" ]; then
fail "could not read version from plugin.json"
else
pass "plugin.json version: $VERSION"
if grep -q "version-${VERSION}-blue" README.md; then
pass "README badge declares v$VERSION"
else
fail "README badge does not declare v$VERSION (expected version-${VERSION}-blue)"
fi
if grep -q "LinkedIn Studio Plugin (v${VERSION})" CLAUDE.md; then
pass "CLAUDE.md header declares v$VERSION"
else
fail "CLAUDE.md header does not declare (v$VERSION)"
fi
if grep -q "^## \[${VERSION}\]" CHANGELOG.md; then
pass "CHANGELOG has a [$VERSION] entry"
else
fail "CHANGELOG missing a [$VERSION] entry"
fi
fi
echo ""
# --- Section 10: Agent Model-Consistency ---
echo "--- Agent Model-Consistency ---"
# Each agents/<name>.md frontmatter `model:` is the source of truth; every
# surface that DECLARES an agent's model (README, CLAUDE.md, the capability
# matrix, the SKILL rosters) must match it, and the canonical rosters must list
# every agent. Added in S11 after a cold full-brief review found
# post-feedback-monitor published as Haiku across four surfaces while the agent
# runs Opus — declaration drift the version/count/stat guards could not see. The
# checker self-tests its own non-vacuity on every run (see the .mjs header):
# a deliberately-mismatched probe must be caught and a correct one ignored, else
# the suite fails instead of certifying an unenforced criterion.
if node scripts/check-model-consistency.mjs; then
pass "agent model-consistency: all surface declarations match frontmatter + canonical rosters complete"
else
fail "agent model-consistency drift — see check-model-consistency.mjs output above"
fi
echo ""
# --- Section 11: Render-Chain Propagation ---
echo "--- Render-Chain Propagation ---"
# Commands render from the references they inline via
# ${CLAUDE_PLUGIN_ROOT}/references/…. An honesty pattern removed from a command
# surface must NOT survive in the reference that command renders from — otherwise
# the user still hits it. Added in S12 after a cold full-brief review found the
# banned A/B significance-verdict column (`Significant? (>20%)` with Yes/No cells)
# still shipping in references/ab-testing-framework.md while commands/ab-test.md had
# already been cleaned to the honest "Directional?" framing. The command-level fix
# never propagated to its render-source, and Section 8's STALE_STATS grep targets
# magnitudes, not this construct, so the survivor passed green. This generalizes the
# fix from "clean the command" to "the banned construct is forbidden across the
# WHOLE render chain (commands AND references)". Future propagation-class patterns
# get appended to PROP_FORBIDDEN, mirroring how Section 8's STALE_STATS grew to the
# full criterion rather than the single named token.
#
# Forbidden: the significance-VERDICT column — `Significant?` adjacent to a `(` (the
# `(>20%)` verdict parenthetical) or a table pipe (`| Significant?`). The defect is a
# column steering users to record a statistical-significance call that organic
# personal-post volume never reaches; "directional" is the honest frame. Legitimate
# descriptive prose ("Significantly higher", "Significant capability", "statistical
# significance", a bare sentence-final "significant?") carries no `(`/`|`-adjacency
# and is left alone.
PROP_FORBIDDEN='Significant\?[[:space:]]*\(|\|[[:space:]]*Significant\?'
# Non-vacuity self-test (mirrors Section 8): the criterion is only meaningful if it
# MATCHES the verdict-column forms and IGNORES legitimate prose. Runs on every
# invocation BEFORE the real scan, so weakening the pattern fails the suite instead
# of silently certifying an unenforced guard. The positive set covers the exact S12
# survivor + its bare-column variant; the negative set covers the honest
# "Directional?" fix and every legitimate "Significant"/"significance" string the
# tree carries today.
PROP_SELFTEST_OK=1
while IFS= read -r probe; do
[ -z "$probe" ] && continue
if ! echo "$probe" | grep -qE "$PROP_FORBIDDEN"; then
PROP_SELFTEST_OK=0; echo " non-vacuity FAIL: forbidden form not caught -> $probe"
fi
done <<'PROP_POSITIVE'
| Difference | Significant? (>20%) |
Significant? (>20%)
| Significant? |
PROP_POSITIVE
while IFS= read -r probe; do
[ -z "$probe" ] && continue
if echo "$probe" | grep -qE "$PROP_FORBIDDEN"; then
PROP_SELFTEST_OK=0; echo " false-positive FAIL: legitimate form caught -> $probe"
fi
done <<'PROP_NEGATIVE'
| Difference | Directional? (>20% gap) |
Significantly higher weight than generic responses
Significant capability breakthroughs
Significantly Behind (<50%)
LinkedIn analytics does not support statistical significance tests
Is the difference significant? Probably not.
PROP_NEGATIVE
if [ "$PROP_SELFTEST_OK" -eq 1 ]; then
pass "render-chain propagation self-test: 3 verdict-column forms caught, 6 legitimate forms ignored"
else
fail "render-chain propagation self-test failed — the guard no longer enforces the criterion"
fi
# Real scan across the whole user-facing render chain (commands + every reference
# they inline) plus the adjacent surfaces a copy could migrate the table into.
PROP_HITS=$(grep -rnE "$PROP_FORBIDDEN" references/ commands/ skills/ hooks/prompts/ agents/ assets/templates/ assets/checklists/ 2>/dev/null || true)
if [ -z "$PROP_HITS" ]; then
pass "no significance-verdict column survives in any command or its render-source reference"
else
fail "significance-verdict column reintroduced — use the honest 'Directional?' framing (see commands/ab-test.md):"
echo "$PROP_HITS"
fi
echo ""
# --- Section 12: `$`-Safety (String.replace replacement) ---
echo "--- \$-Safety (String.replace replacement) ---"
# state-updater.mjs mutates the state file from untrusted user content (post
# topics, hooks, targets, partners, …). In a JS replacement *string*, `$&`/`` $` ``/
# `$'`/`$$`/`$n` are special, so a `$`-bearing value rewrites the field; a
# replacement *function* inserts its return verbatim. Added in S13 after a cold
# full-brief review found the LAST member of this class: S12 converted the 5
# section-append sites to functions but left `replaceField` (the scalar writer) on a
# replacement string, and the S12 `$`-test asserted only the section entry — never
# the `last_post_topic` scalar — so the corruption shipped green. This is the S9→S12
# "close the class, not the line" lesson on the `$`-axis: rather than grep a
# syntactic proxy (which cannot tell a replacement-position template literal from a
# RegExp-pattern one across multi-line calls), check-replace-safety.mjs drives EVERY
# exported mutator with an adversarial payload of every special token in every
# free-text + date field and asserts verbatim survival. Two structural backstops run
# inside it on every invocation: COVERAGE-COMPLETENESS (a new export without
# `$`-coverage fails) and a NON-VACUITY SELF-TEST (a naive string-replace MUST
# corrupt the payload, a function MUST preserve it — else a PASS is meaningless),
# mirroring Section 8/10/11.
if node scripts/check-replace-safety.mjs; then
pass "\$-safety: no untrusted value reaches a String.replace replacement string (behavioral, coverage-complete, self-testing)"
else
fail "\$-safety guard failed — a state-updater String.replace replacement is \$-unsafe; see check-replace-safety.mjs output above"
fi
echo ""
# --- Summary ---
echo "================================================"
echo "RESULTS"
echo "================================================"
echo -e "${GREEN}Passed: $PASS${NC}"
echo -e "${RED}Failed: $FAIL${NC}"
echo -e "${YELLOW}Warnings: $WARN${NC}"
echo ""
if [ $FAIL -eq 0 ]; then
echo -e "${GREEN}All structural checks passed!${NC}"
exit 0
else
echo -e "${RED}$FAIL check(s) failed. Review above.${NC}"
exit 1
fi