ktg-plugin-marketplace/plugins/voyage/tests/synthetic/profile-plan-run-economy-2.md

---
type: trekplan-synthetic
plan_version: "1.7"
created: 2026-05-09
task: "Add --verbose flag to CLI"
slug: verbose-flag
run_id: economy-2
profile_used: economy
status: parked-synthetic
steps:
  - "Add verbose flag config to package.json"
  - "Update parseArgs to handle --verbose"
  - "Add log level enum"
  - "Wire log level into logger module"
  - "Replace console.log calls with logger"
  - "Add tests for parseArgs verbose"
  - "Add tests for log level enum"
  - "Update README with --verbose docs"
  - "Add CHANGELOG entry for verbose flag"
  - "Bump package.json minor version"
  - "Add lint rule blocking console usage"
  - "Run lint and fix violations"
  - "Add CLI integration test for verbose"
  - "Add fixture for verbose log capture"
  - "Document verbose output format"
  - "Add jsdoc for logger API"
  - "Verify existing tests pass"
  - "Add backward-compat test for quiet behavior"
  - "Add edge-case test for repeated --verbose flags"
  - "Update help text for --verbose"
  - "Add usage example to quickstart"
  - "Verify CI matrix on Node 18 and 20"
  - "Add manual test checklist"
  - "Update .gitignore for log dumps"
  - "Add cleanup logic for stale logs"
  - "Verify exit code on verbose error"
  - "Add stderr routing for warnings"
  - "Update troubleshooting guide"
  - "Verify version sync across docs"
  - "Add timestamp prefix to verbose lines"
---

# Synthetic plan run economy-2 — Add --verbose flag to CLI (PARKED)

Companion fixture to `profile-plan-run-economy-1.md`. Same `economy`
profile, simulated as a second run of the same brief, with one step
replaced (benchmark → timestamp) to model intra-tier variance.

See `profile-plan-run-economy-1.md` for full parked-synthetic rationale.

## Intra-tier Jaccard

Economy-1 and economy-2 share 29 of their 30 step titles. Each run has one
title the other lacks, so the union holds 31 titles.
Jaccard = 29/31 ≈ 0.935, well above any reasonable cross-tier floor.
This is the expected intra-tier band: variance is small because the same
profile produces near-identical plans modulo language drift.
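
The arithmetic above can be reproduced with a minimal sketch. It assumes step titles are compared as exact strings with no normalization, which may not match how the real harness compares them. The `jaccard` helper and the placeholder titles are hypothetical and are not taken from either fixture, apart from the timestamp title listed in this file's frontmatter.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Return |a ∩ b| / |a ∪ b|. Two empty sets are treated as identical (1.0)."""
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


# Placeholder titles standing in for the 29 shared steps (assumption:
# the real comparison would load the `steps` lists from both fixtures).
shared = {f"shared step {i}" for i in range(29)}

# Each run adds one title the other lacks. The benchmark title is a
# hypothetical stand-in; the real title lives in the economy-1 fixture.
economy_1 = shared | {"benchmark step (hypothetical title)"}
economy_2 = shared | {"Add timestamp prefix to verbose lines"}

score = jaccard(economy_1, economy_2)
print(f"{len(economy_1 & economy_2)}/{len(economy_1 | economy_2)} = {score:.3f}")
# prints "29/31 = 0.935"
```

Because a single swapped step adds one title to each side, the intersection drops by one while the union grows by one, which is why one change in 30 steps gives 29/31 rather than 29/30.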

When real LLM-budget runs replace this synthetic fixture, the empirical
intra-tier Jaccard is expected to land in the 0.85–0.95 band per
research/02. The cross-tier comparison (economy vs premium) is the
discriminating measurement and is documented in
`profile-jaccard-calibration.md`.