fix(linkedin-studio): downgrade A/B significance claim to directional
Wave 2 / Step 6 of the remediation plan. Organic personal-post A/B tests gather a handful of posts per variant, far below the volume a significance test needs, so the tool must not imply statistical significance:

- Rename the results-table "Significant?" column to "Directional?" and define it as "clears the ~20% minimum-meaningful-difference AND points the same way across most posts": a direction to test further, not a significant result.
- Reword the "20% significance rule" to a minimum-meaningful-difference effect-size heuristic (explicitly NOT statistical significance).
- Replace the "3 = Medium, 5+ = High" confidence ladder with a directional-only confidence section: treat every result as directional (not significant) given realistic volume is well under ~50 conversions/variant; name a direction, not a winner.

The 20% minimum-meaningful-difference threshold itself stays: it is a legitimate effect-size heuristic; only the significance framing was the false claim.

Verify: no "Significant?"/"20% significance" remain; "directional" present; structural lint 0 failed.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
parent 911871ff53
commit 55bfe309eb
1 changed file with 15 additions and 7 deletions
@@ -274,7 +274,7 @@ Read each file and check if both variants have 3+ posts logged. Present only tes
 Read the test file. For each variant:
 - Calculate average for each metric (impressions, engagement rate, comments, reposts)
 - Calculate percentage difference: ((B_avg - A_avg) / A_avg) * 100
-- Apply the 20% significance rule from the framework
+- Apply the framework's minimum-meaningful-difference threshold (default 20%). This is an effect-size heuristic for "is the gap worth acting on" — NOT a test of statistical significance (organic personal-post volume rarely reaches it)
 
 ### 2c.3: Cross-Reference Analytics Data
 
@@ -298,13 +298,15 @@ Output the analysis in this format:
 **Posts per variant:** A: [X], B: [Y]
 
 ### Results Comparison
-| Metric | Variant A (Avg) | Variant B (Avg) | Difference | Significant? |
+| Metric | Variant A (Avg) | Variant B (Avg) | Difference | Directional? |
 |--------|----------------|----------------|------------|--------------|
 | Impressions | X | X | +X% | Yes/No |
 | Engagement Rate | X% | X% | +X% | Yes/No |
 | Comments | X | X | +X% | Yes/No |
 | Reposts | X | X | +X% | Yes/No |
 
+_"Directional?" = the gap clears the ~20% minimum-meaningful-difference AND points the same way across most posts. It is a direction to test further, not a statistically significant result._
+
 ### Verdict
 [Clear recommendation based on the data:]
 - **Adopt B:** If B wins with >20% difference on primary metric
@@ -312,11 +314,17 @@ Output the analysis in this format:
 - **Inconclusive:** If results are mixed or inconsistent across posts
 - **Extend test:** If sample size is borderline or results are close to 20% threshold
 
-### Confidence Level
-**[High/Medium/Low]**
-- Based on sample size (3 = Medium, 5+ = High)
-- Based on consistency across individual posts
-- Based on alignment with secondary metrics
+### Confidence Level (directional only)
+**[Directional signal: weak / moderate / strong]**
+
+Organic personal-post volume rarely reaches statistical significance: with the
+handful of posts per variant a creator realistically gathers (well under the
+~50 conversions/variant a significance test would need), treat every result as
+**directional, not significant**. Do not declare a statistically confident
+"winner" — name a direction to test further. Judge the strength of that signal on:
+- Consistency across individual posts (did B beat A on most posts, or one outlier?)
+- Size of the gap relative to the ~20% minimum-meaningful-difference threshold
+- Alignment with secondary metrics
 
 ### Key Insight
 [One sentence capturing the most important learning for their content strategy]
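
The calculation step and the "Directional?" definition in the hunks above can be sketched in a few lines of Python. This is only an illustration and is not part of the commit. The function names, the strict `>` comparison against the threshold, and the reading of "most posts" as "more than half of the leading variant's posts beat the other variant's average" are all assumptions; the diff defines neither how posts are compared nor what counts as "most".

```python
from statistics import mean

# Default threshold named in the diff (an effect-size heuristic, not a significance test).
MIN_MEANINGFUL_DIFF_PCT = 20.0


def percent_difference(a_values, b_values):
    """Return ((B_avg - A_avg) / A_avg) * 100, the formula from the calculation step."""
    a_avg = mean(a_values)
    if a_avg == 0:
        raise ValueError("Variant A average is zero; percentage difference is undefined")
    return (mean(b_values) - a_avg) / a_avg * 100


def is_directional(a_values, b_values, threshold_pct=MIN_MEANINGFUL_DIFF_PCT):
    """Assumed reading of "Directional?": the gap exceeds the threshold AND more
    than half of the leading variant's posts beat the other variant's average."""
    diff = percent_difference(a_values, b_values)
    if abs(diff) <= threshold_pct:
        return False  # gap does not clear the minimum-meaningful-difference
    # Pick the variant that is ahead on average, then check it is not one outlier.
    leader, trailing = (b_values, a_values) if diff > 0 else (a_values, b_values)
    baseline = mean(trailing)
    agreeing = sum(1 for value in leader if value > baseline)
    return agreeing > len(leader) / 2


# Hypothetical impressions per post for each variant.
consistent = is_directional([100, 110, 90], [130, 140, 120])
one_outlier = is_directional([100, 100, 100], [90, 95, 250])
```

With these made-up numbers, `consistent` is `True` because every B post beats A's average, while `one_outlier` is `False`: the 45% average gap comes from a single post. Either way the output is a direction to test further, not a statistically significant result.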