Step 17 of v4.1: escalate-handler invoked. The live LLM budget ($60-120 for
4 plan runs of /trekplan --profile {economy,premium} on
examples/01-add-verbose-flag/brief.md) was not authorized for the
v4.1-execute-4b session.
Per Step 17 escalate-fallback (and NEXT-SESSION-PROMPT.local.md
fallback-strategy): document economy-Plan as parked, use balanced as
low-threshold profile, defer empirical calibration to v4.2.
Files:
tests/synthetic/profile-plan-run-economy-1.md — 30 steps, parked-synthetic
tests/synthetic/profile-plan-run-economy-2.md — 30 steps, parked-synthetic
tests/synthetic/profile-plan-run-premium-1.md — 40 steps, parked-synthetic
tests/synthetic/profile-plan-run-premium-2.md — 40 steps, parked-synthetic
tests/synthetic/profile-jaccard-calibration.md — threshold 0.55 pinned per
research/02 conservative starting value
Replacement procedure documented in calibration.md "How to replace"
section. Trigger conditions for empirical re-run:
1. Cross-tier smoke-test (Step 18) flips red on a real run
2. v4.2 LLM-budget approval
3. New profile tier added
73 lines
3.1 KiB
Markdown
---
type: trekplan-synthetic
plan_version: "1.7"
created: 2026-05-09
task: "Add --verbose flag to CLI"
slug: verbose-flag
run_id: premium-2
profile_used: premium
status: parked-synthetic
steps:
- "Add config entry for verbose flag in package.json"
- "Define types for verbose mode in types.ts"
- "Update parseArgs to recognize --verbose flag"
- "Pass verbose context through main entry point"
- "Add log level enum (silent, normal, verbose)"
- "Wire log level into logger module"
- "Replace console.log with logger.info in handler.ts"
- "Add tests for parseArgs --verbose recognition"
- "Add tests for log level enum mapping"
- "Update README with --verbose flag documentation"
- "Add CHANGELOG entry for verbose flag"
- "Bump package.json minor version"
- "Add lint rule blocking direct console usage"
- "Run lint and fix new violations"
- "Add CLI integration test for --verbose end-to-end"
- "Add fixture file for verbose log capture"
- "Document verbose output format in docs/cli.md"
- "Add jsdoc for new logger API"
- "Verify all existing tests pass with verbose disabled"
- "Add backward-compat test for legacy quiet behavior"
- "Add edge-case test for repeated --verbose flags"
- "Add edge-case test for --verbose with --silent collision"
- "Update help text to list --verbose flag"
- "Add usage example to docs/quickstart.md"
- "Verify CI matrix runs on Node 18 and 20"
- "Add npm script for verbose mode debugging"
- "Run security audit on logger dependency tree"
- "Verify no PII leaks in verbose log output"
- "Add manual test checklist to CONTRIBUTING.md"
- "Update .gitignore for verbose log dump files"
- "Add cleanup logic for stale verbose logs"
- "Add unit test for cleanup logic"
- "Verify exit code on verbose mode error"
- "Add stderr routing for warnings in verbose"
- "Add timestamp prefix in verbose log lines"
- "Add test for timestamp format"
- "Update troubleshooting guide with verbose flag"
- "Verify version sync across all docs"
- "Add benchmark for verbose log capture overhead"
- "Document overhead methodology in PERF.md"
---

# Synthetic plan run premium-2 — Add --verbose flag to CLI (PARKED)

Companion to `profile-plan-run-premium-1.md`. Same `premium` profile,
simulated as a second run with two terminal steps replaced
(emission cost / benchmark methodology → capture overhead / overhead
methodology) to model intra-tier variance.

## Intra-tier Jaccard

Premium-1 and premium-2 share 38 of their 40 step titles, so the union is
40 + 40 - 38 = 42 and Jaccard = 38/42 ≈ 0.905. That clears the existing
baseline plan-run-A vs plan-run-B floor (≥ 0.833 in
plan-determinism.test.mjs).
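
The intra-tier arithmetic can be reproduced with a short sketch. This is
illustrative only: the `jaccard` helper and the placeholder titles are
assumptions made for the example, not the code in
plan-determinism.test.mjs or the contents of the premium fixtures.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B|; treated as 1.0 for two empty sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Placeholder titles (not the real fixture contents): 38 shared steps plus
# 2 steps unique to each run, mirroring the counts quoted above.
shared = {f"shared step {i}" for i in range(38)}
premium_1 = shared | {"premium-1 only a", "premium-1 only b"}
premium_2 = shared | {"premium-2 only a", "premium-2 only b"}

score = jaccard(premium_1, premium_2)
print(f"{score:.3f}")  # 38/42 -> 0.905
```

Comparing sets rather than ordered lists means step reordering does not
lower the score; only added or removed titles do.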

## Cross-tier Jaccard rationale

Pairing premium fixtures (40 steps) against economy fixtures (30 steps)
yields ~30 shared titles (after string normalization), with union ~40.
That gives a cross-tier Jaccard of ≈ 30/40 = 0.75 in this synthetic, but
the calibration file pins a *more conservative* floor (0.55) per
research/02 to absorb empirical variance once real runs replace these
fixtures. See `profile-jaccard-calibration.md` for threshold derivation.
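
A minimal sketch of the cross-tier comparison follows, under stated
assumptions. The `normalize` rule (lowercase, punctuation stripped,
whitespace collapsed) is hypothetical, since the actual normalization is
not specified here, and the titles are placeholders rather than fixture
contents. Only the 0.55 floor and the 30/40 step counts come from the text
above.

```python
import re

CROSS_TIER_FLOOR = 0.55  # pinned floor quoted from the calibration file


def normalize(title: str) -> str:
    """Hypothetical normalization: lowercase, drop punctuation, collapse spaces."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def cross_tier_jaccard(premium: list[str], economy: list[str]) -> float:
    """Jaccard similarity over normalized step titles."""
    a = {normalize(t) for t in premium}
    b = {normalize(t) for t in economy}
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Placeholder titles: all 30 economy steps reappear in premium with different
# casing and punctuation, and premium adds 10 steps of its own.
economy = [f"Step {i}: do thing" for i in range(30)]
premium = [f"step {i} do thing." for i in range(30)]
premium += [f"Extra step {i}" for i in range(10)]

score = cross_tier_jaccard(premium, economy)
print(score, score >= CROSS_TIER_FLOOR)  # 0.75 True
```

The gap between 0.75 and the 0.55 floor is the margin left for real runs
whose titles diverge more than these synthetic ones.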