fix(linkedin-studio): downgrade A/B significance claim to directional

Wave 2 / Step 6 of the remediation plan.

Organic personal-post A/B tests gather a handful of posts per variant — far below
the volume a significance test needs — so the tool must not imply statistical
significance:
- Rename the results-table "Significant?" column to "Directional?" and define it
  as "clears the ~20% minimum-meaningful-difference AND points the same way across
  most posts" — a direction to test further, not a significant result.
- Reword the "20% significance rule" to a minimum-meaningful-difference effect-size
  heuristic (explicitly NOT statistical significance).
- Replace the "3 = Medium, 5+ = High" confidence ladder with a directional-only
  confidence section: treat every result as directional (not significant), since
  realistic volume is well under ~50 conversions/variant; name a direction, not a
  winner.

The 20% minimum-meaningful-difference threshold itself stays — it is a legitimate
effect-size heuristic; only the significance framing was the false claim.

Verify: no "Significant?"/"20% significance" remain; "directional" present;
structural lint 0 failed.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Kjell Tore Guttormsen 2026-05-30 00:25:26 +02:00
commit 55bfe309eb

@@ -274,7 +274,7 @@ Read each file and check if both variants have 3+ posts logged. Present only tes
Read the test file. For each variant:
- Calculate average for each metric (impressions, engagement rate, comments, reposts)
- Calculate percentage difference: ((B_avg - A_avg) / A_avg) * 100
- Apply the 20% significance rule from the framework
- Apply the framework's minimum-meaningful-difference threshold (default 20%). This is an effect-size heuristic for "is the gap worth acting on" — NOT a test of statistical significance (organic personal-post volume rarely reaches it)
### 2c.3: Cross-Reference Analytics Data
@@ -298,13 +298,15 @@ Output the analysis in this format:
**Posts per variant:** A: [X], B: [Y]
### Results Comparison
| Metric | Variant A (Avg) | Variant B (Avg) | Difference | Significant? |
| Metric | Variant A (Avg) | Variant B (Avg) | Difference | Directional? |
|--------|----------------|----------------|------------|--------------|
| Impressions | X | X | +X% | Yes/No |
| Engagement Rate | X% | X% | +X% | Yes/No |
| Comments | X | X | +X% | Yes/No |
| Reposts | X | X | +X% | Yes/No |
_"Directional?" = the gap clears the ~20% minimum-meaningful-difference AND points the same way across most posts. It is a direction to test further, not a statistically significant result._
### Verdict
[Clear recommendation based on the data:]
- **Adopt B:** If B wins with >20% difference on primary metric
@@ -312,11 +314,17 @@ Output the analysis in this format:
- **Inconclusive:** If results are mixed or inconsistent across posts
- **Extend test:** If sample size is borderline or results are close to 20% threshold
### Confidence Level
**[High/Medium/Low]**
- Based on sample size (3 = Medium, 5+ = High)
- Based on consistency across individual posts
- Based on alignment with secondary metrics
### Confidence Level (directional only)
**[Directional signal: weak / moderate / strong]**
Organic personal-post volume rarely reaches statistical significance: with the
handful of posts per variant a creator realistically gathers (well under the
~50 conversions/variant a significance test would need), treat every result as
**directional, not significant**. Do not declare a statistically confident
"winner" — name a direction to test further. Judge the strength of that signal on:
- Consistency across individual posts (did B beat A on most posts, or one outlier?)
- Size of the gap relative to the ~20% minimum-meaningful-difference threshold
- Alignment with secondary metrics
### Key Insight
[One sentence capturing the most important learning for their content strategy]
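
Review note: the "Directional?" rule introduced by this change (percentage difference of the averages, the ~20% minimum-meaningful-difference threshold, and agreement across most posts) could be sketched roughly as below. This is a minimal illustration, not the skill's actual logic. The function names `pct_diff` and `is_directional`, the `min_share` parameter, and the reading of "most posts" as "more than half of B's posts land on the same side of A's average" are all assumptions made for the example.

```python
from statistics import mean


def pct_diff(a_values, b_values):
    """Percentage difference of averages: ((B_avg - A_avg) / A_avg) * 100."""
    a_avg = mean(a_values)
    if a_avg == 0:
        raise ValueError("Variant A average is zero; percentage difference is undefined")
    return (mean(b_values) - a_avg) / a_avg * 100


def is_directional(a_values, b_values, threshold_pct=20.0, min_share=0.5):
    """Return True when the gap is directional in the sense of the note above.

    Assumptions (hypothetical, not taken from the framework):
    - "clears the threshold" means the absolute gap is strictly above threshold_pct
    - "most posts" means more than min_share of B's posts fall on the same
      side of A's average as the overall gap
    """
    diff = pct_diff(a_values, b_values)
    if abs(diff) <= threshold_pct:
        return False  # gap too small to be worth acting on

    a_avg = mean(a_values)
    if diff > 0:
        agreeing = sum(1 for b in b_values if b > a_avg)
    else:
        agreeing = sum(1 for b in b_values if b < a_avg)
    return agreeing / len(b_values) > min_share


if __name__ == "__main__":
    baseline = [100, 110, 90]                        # impressions, variant A
    print(is_directional(baseline, [130, 140, 120]))  # consistent lift
    print(is_directional(baseline, [90, 95, 250]))    # one outlier drives the average
```

Neither check is a significance test. A `True` result only says the gap is large and consistent enough to be worth testing further, which matches the framing in the commit message.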