Adds 6 files in tests/synthetic/ exercising the determinism pipeline at the SC7 brief floor (Jaccard >= 0.833). Plan fixture pair: 40 step titles each, 38 shared (Jaccard 0.905). Review fixture pair: 30 finding-IDs each, 28 shared (Jaccard 0.875). Reuses lib/parsers/jaccard.mjs + lib/parsers/finding-id.mjs. The new fixtures coexist with tests/lib/review-determinism.test.mjs, which holds the older SC4 (0.70) floor against tests/fixtures/ultrareview/. The lower floor guards against pipeline regressions; the higher floor anchors the speedup brief's determinism aspiration. [skip-docs]
2.9 KiB
| type | plan_version | created | task | slug | run_id | steps |
|---|---|---|---|---|---|---|
| ultraplan-synthetic | 1.7 | 2026-05-04 | Add --verbose flag to CLI | verbose-flag | B | |
Synthetic plan run B — Add --verbose flag to CLI
This fixture represents a second synthesized run of /ultraplan-local against
the same hand-calibrated brief used for plan-run-A.md. The two runs differ
on 2 step titles (modeling realistic LLM variation).
How this fixture is used
See plan-run-A.md for the determinism contract.
Fixture math
- A has 40 unique step titles
- B has 40 unique step titles
- Intersection (shared titles): 38
- Union: 42
- Jaccard: 38/42 ≈ 0.9048 (above the 0.833 floor)
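The arithmetic above can be reproduced with a small sketch. The `jaccard` function here is a hypothetical stand-in written for illustration; it is not the API of lib/parsers/jaccard.mjs, whose signature is not shown in this fixture. The step titles are placeholders, not the real titles from either run.

```javascript
// Hypothetical helper: Jaccard similarity of two lists of strings.
// This is an illustrative stand-in, not lib/parsers/jaccard.mjs.
function jaccard(a, b) {
  const setA = new Set(a);
  const setB = new Set(b);
  let shared = 0;
  for (const item of setA) {
    if (setB.has(item)) shared += 1;
  }
  const union = setA.size + setB.size - shared;
  // Assumption: two empty sets count as identical.
  return union === 0 ? 1 : shared / union;
}

// Placeholder titles: 38 shared, plus 2 unique to each run.
const sharedTitles = Array.from({ length: 38 }, (_, i) => `shared step ${i + 1}`);
const runA = [...sharedTitles, "only in A 1", "only in A 2"];
const runB = [...sharedTitles, "only in B 1", "only in B 2"];

const score = jaccard(runA, runB);
console.log(score.toFixed(4)); // 0.9048
console.log(score >= 0.833);   // true
```

The same function applied to the review pair described in the commit message (30 IDs each, 28 shared) would give 28/32 = 0.875.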
Differences from run A
- A includes "Add benchmark for verbose log emission cost" → B replaces it with "Add benchmark for verbose log capture overhead"
- A includes "Document benchmark methodology in PERF.md" → B replaces it with "Document overhead methodology in PERF.md"
These represent the kind of paraphrase variation a stochastic planner may produce on consecutive runs against an identical brief.
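When a pair falls below the floor, it helps to see which titles differ. A minimal sketch, assuming titles are compared as exact strings; `onlyIn` is a hypothetical helper, and the single shared title is a placeholder standing in for the 38 shared ones.

```javascript
// Hypothetical helper: titles present in `a` but not in `b`.
function onlyIn(a, b) {
  const other = new Set(b);
  return [...new Set(a)].filter((title) => !other.has(title));
}

// One placeholder shared title plus the two differing titles from each run.
const planA = [
  "Shared step (placeholder)",
  "Add benchmark for verbose log emission cost",
  "Document benchmark methodology in PERF.md",
];
const planB = [
  "Shared step (placeholder)",
  "Add benchmark for verbose log capture overhead",
  "Document overhead methodology in PERF.md",
];

console.log(onlyIn(planA, planB)); // the two titles unique to run A
console.log(onlyIn(planB, planA)); // the two titles unique to run B
```

Exact string comparison treats each paraphrase as a distinct title, which is why two reworded steps cost the pair four union entries.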