test(ultraplan-local): add plan-determinism + review-determinism synthetic fixtures (SC7 floor)
Adds 6 files in tests/synthetic/ that exercise the determinism pipeline at the SC7 brief floor (Jaccard >= 0.833). Plan fixture pair: 40 step titles each, 38 shared (Jaccard 38/42 ≈ 0.905). Review fixture pair: 30 finding IDs each, 28 shared (Jaccard 28/32 = 0.875). Reuses lib/parsers/jaccard.mjs and lib/parsers/finding-id.mjs. The new pair coexists with tests/lib/review-determinism.test.mjs, which holds the older SC4 floor (0.70) against tests/fixtures/ultrareview/. The lower floor guards against pipeline regressions; the higher floor anchors the speedup brief's determinism aspiration. [skip-docs]
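As a quick sanity check of the arithmetic in the message above, here is a minimal sketch of the Jaccard computation for both fixture pairs. The `jaccard` and `makePair` helpers and the `SC7_FLOOR` constant are illustrative assumptions; the real lib/parsers/jaccard.mjs is not shown in this view and may behave differently (for example, in how it treats two empty sets).

```javascript
// Hypothetical helper; the repository's lib/parsers/jaccard.mjs may differ.
function jaccard(a, b) {
  const setA = new Set(a);
  const setB = new Set(b);
  // Assumption: two empty sets are treated as identical.
  if (setA.size === 0 && setB.size === 0) return 1;
  let shared = 0;
  for (const item of setA) {
    if (setB.has(item)) shared += 1;
  }
  // |A ∪ B| = |A| + |B| - |A ∩ B|
  return shared / (setA.size + setB.size - shared);
}

// Build two synthetic runs of `size` items each that share `shared` items.
function makePair(size, shared) {
  const common = Array.from({ length: shared }, (_, i) => `shared-${i}`);
  const onlyA = Array.from({ length: size - shared }, (_, i) => `a-${i}`);
  const onlyB = Array.from({ length: size - shared }, (_, i) => `b-${i}`);
  return [[...common, ...onlyA], [...common, ...onlyB]];
}

const SC7_FLOOR = 0.833; // floor quoted in the commit message

const planScore = jaccard(...makePair(40, 38));   // 38 / 42
const reviewScore = jaccard(...makePair(30, 28)); // 28 / 32

console.log(planScore.toFixed(3), planScore >= SC7_FLOOR);     // 0.905 true
console.log(reviewScore.toFixed(3), reviewScore >= SC7_FLOOR); // 0.875 true
```

Both pairs clear the 0.833 floor with room to spare: with 40 titles per run, the plan pair could differ in up to 3 titles (37/43 ≈ 0.860) and still pass.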
parent b1738b419c
commit 0c0a87e709
6 changed files with 425 additions and 0 deletions
77
plugins/ultraplan-local/tests/synthetic/plan-run-B.md
Normal file
@@ -0,0 +1,77 @@
---
type: ultraplan-synthetic
plan_version: "1.7"
created: 2026-05-04
task: "Add --verbose flag to CLI"
slug: verbose-flag
run_id: B
steps:
- "Add config entry for verbose flag in package.json"
- "Define types for verbose mode in types.ts"
- "Update parseArgs to recognize --verbose flag"
- "Pass verbose context through main entry point"
- "Add log level enum (silent, normal, verbose)"
- "Wire log level into logger module"
- "Replace console.log with logger.info in handler.ts"
- "Add tests for parseArgs --verbose recognition"
- "Add tests for log level enum mapping"
- "Update README with --verbose flag documentation"
- "Add CHANGELOG entry for verbose flag"
- "Bump package.json minor version"
- "Add lint rule blocking direct console usage"
- "Run lint and fix new violations"
- "Add CLI integration test for --verbose end-to-end"
- "Add fixture file for verbose log capture"
- "Document verbose output format in docs/cli.md"
- "Add jsdoc for new logger API"
- "Verify all existing tests pass with verbose disabled"
- "Add backward-compat test for legacy quiet behavior"
- "Add edge-case test for repeated --verbose flags"
- "Add edge-case test for --verbose with --silent collision"
- "Update help text to list --verbose flag"
- "Add usage example to docs/quickstart.md"
- "Verify CI matrix runs on Node 18 and 20"
- "Add npm script for verbose mode debugging"
- "Run security audit on logger dependency tree"
- "Verify no PII leaks in verbose log output"
- "Add manual test checklist to CONTRIBUTING.md"
- "Update .gitignore for verbose log dump files"
- "Add cleanup logic for stale verbose logs"
- "Add unit test for cleanup logic"
- "Verify exit code on verbose mode error"
- "Add stderr routing for warnings in verbose"
- "Add timestamp prefix in verbose log lines"
- "Add test for timestamp format"
- "Update troubleshooting guide with verbose flag"
- "Verify version sync across all docs"
- "Add benchmark for verbose log capture overhead"
- "Document overhead methodology in PERF.md"
---

# Synthetic plan run B — Add --verbose flag to CLI

This fixture represents a second synthesized run of `/ultraplan-local` against
the same hand-calibrated brief used for `plan-run-A.md`. The two runs differ
on 2 step titles (modeling realistic LLM variation).

## How this fixture is used

See `plan-run-A.md` for the determinism contract.

## Fixture math

- A has 40 unique step titles
- B has 40 unique step titles
- Intersection (shared titles): 38
- Union: 42
- Jaccard: 38/42 ≈ 0.9047 (well above 0.833 floor)

## Differences from run A

- A includes "Add benchmark for verbose log emission cost" → B replaces with
"Add benchmark for verbose log capture overhead"
- A includes "Document benchmark methodology in PERF.md" → B replaces with
"Document overhead methodology in PERF.md"

These represent the kind of paraphrase variation a stochastic planner may
produce on consecutive runs against an identical brief.
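
The "Differences from run A" list in the fixture can be reproduced mechanically by reading the `steps:` entries from each run's front matter and comparing the two title sets. The sketch below is only an illustration under stated assumptions: `extractStepTitles` and `onlyIn` are hypothetical helpers (not the repository's parsers), the line-based matching handles only simple double-quoted list items, and `runA` / `runB` are abbreviated stand-ins holding 3 of the 40 steps, using titles quoted in the fixture.

```javascript
// Hypothetical helper: pull the quoted entries of a top-level `steps:` list
// out of a fixture's front matter. Not the repository's actual parser.
function extractStepTitles(markdown) {
  const frontMatter = markdown.match(/^---\n([\s\S]*?)\n---/);
  if (!frontMatter) return [];
  const titles = [];
  let inSteps = false;
  for (const line of frontMatter[1].split("\n")) {
    if (/^steps:\s*$/.test(line)) {
      inSteps = true;
      continue;
    }
    if (!inSteps) continue;
    const item = line.match(/^\s*-\s+"(.*)"\s*$/);
    if (item) {
      titles.push(item[1]);
    } else if (/^\S/.test(line)) {
      inSteps = false; // any other top-level key ends the list
    }
  }
  return titles;
}

// Titles present in `a` but missing from `b`.
function onlyIn(a, b) {
  const other = new Set(b);
  return a.filter((title) => !other.has(title));
}

// Abbreviated stand-ins for plan-run-A.md and plan-run-B.md (3 of 40 steps).
const runA = [
  "---",
  "run_id: A",
  "steps:",
  '  - "Verify version sync across all docs"',
  '  - "Add benchmark for verbose log emission cost"',
  '  - "Document benchmark methodology in PERF.md"',
  "---",
  "",
].join("\n");

const runB = [
  "---",
  "run_id: B",
  "steps:",
  '  - "Verify version sync across all docs"',
  '  - "Add benchmark for verbose log capture overhead"',
  '  - "Document overhead methodology in PERF.md"',
  "---",
  "",
].join("\n");

const titlesA = extractStepTitles(runA);
const titlesB = extractStepTitles(runB);

console.log(onlyIn(titlesA, titlesB)); // the two titles unique to run A
console.log(onlyIn(titlesB, titlesA)); // the two titles unique to run B
```

Exact string comparison is deliberate here: the fixture counts a paraphrased title as a different title, which is what pushes the Jaccard score below 1.0.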