generateHealthScorecard signature: 2-arg → 3-arg (areaScores, opportunityCount,
options = {}). options.humanized=true renders a friendlier title, a
grade-context line for the overall grade, and a rephrased opportunity line.
options.humanized=false (or a 2-arg call) preserves v5.0.0 verbatim output
for backwards compatibility.
topActions also gains an optional options.humanized that swaps
recommendations via the humanizeFinding lookup.
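A minimal sketch of the dispatch, assuming the option semantics above
(the render helpers and their prose are stand-ins, not the shipped output):

    // Sketch only: renderVerbatim/renderHumanized stand in for the real paths.
    export function generateHealthScorecard(areaScores, opportunityCount, options = {}) {
      return options.humanized
        ? renderHumanized(areaScores, opportunityCount)  // friendlier title + grade context
        : renderVerbatim(areaScores, opportunityCount);  // v5.0.0 output, byte-for-byte
    }
    function renderVerbatim(areaScores, opportunityCount) {
      return `Health: ${areaScores.length} areas, ${opportunityCount} opportunities`; // placeholder
    }
    function renderHumanized(areaScores, opportunityCount) {
      return `Your setup: ${areaScores.length} areas scored, ${opportunityCount} things worth a look`; // placeholder
    }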
posture.mjs main():
--json → write JSON to stdout, suppress stderr scorecard
--raw → write JSON to stdout (byte-identical to --json), write v5.0.0
verbatim scorecard to stderr
default → humanized scorecard to stderr, no stdout
posture.test.mjs scorecard-prose assertions are re-anchored to --raw mode (the
explicit v5.0.0 path). The Wave 0 audit only covered finding-title strings;
scorecard prose surfaces here for the first time.
Wave 3 / Step 6 of v5.1.0 humanizer.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds --json and --raw flags to scan-orchestrator CLI main(). Default mode
runs humanizeEnvelope(env) before serialization; --json and --raw bypass
the humanizer for v5.0.0 byte-equal output (SC-6 / SC-7 paths).
Save-baseline path always writes the raw v5.0.0-shape envelope so future
humanizer-data updates do not trigger false-positive drift findings.
runAllScanners() unchanged — it remains the v5.0.0-shape source of truth
for in-process callers (posture, scoring, drift, etc.).
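A sketch of the mode split described above (flag parsing simplified; import
paths assumed):

    import { runAllScanners } from '../scan-orchestrator.mjs'; // path assumed
    import { humanizeEnvelope } from './lib/humanizer.mjs';    // path assumed

    const bypass = process.argv.includes('--json') || process.argv.includes('--raw');
    const env = await runAllScanners();               // always v5.0.0-shape
    const out = bypass ? env : humanizeEnvelope(env); // SC-6/SC-7: bypass is byte-equal
    // save-baseline always persists `env` (raw shape), never `out`.
    process.stdout.write(JSON.stringify(out, null, 2) + '\n');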
Wave 3 / Step 5 of v5.1.0 humanizer.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Synthetic plan.md fixture with source_findings: block-style YAML list of 3
40-char hex IDs in frontmatter, plus minimal plan structure (Title +
Implementation Plan + 1 Step + Manifest). 3 tests verify:
1. plan-validator accepts a plan with source_findings (additive optional field)
2. frontmatter parser extracts source_findings as array of strings
3. each ID matches the 40-char lowercase hex format from finding-id.mjs
Closes the SC3(b) gap flagged by adversarial review (scope-guardian Gap 2).
LLM-level behavior (planner emitting source_findings) remains non-testable
without live invocation; this covers the structural contract.
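A sketch of the structural contract those tests pin (IDs below are
illustrative SHA1s, not harvested from a real run):

    // Block-style YAML list as it appears in the fixture frontmatter:
    const raw = [
      'source_findings:',
      '  - 3f786850e387550fdab836ed7e6dc881de23001b',
      '  - 89e6c98d92887913cadf06b2adb97f26cde4849b',
      '  - 2b66fd261ee5c6cfc8de7fa466bab600bcfe4f69',
    ].join('\n');
    const ids = raw.split('\n').slice(1).map((l) => l.replace(/^\s*-\s*/, ''));
    const HEX40 = /^[0-9a-f]{40}$/;   // lib/parsers/finding-id.mjs format
    console.assert(ids.length === 3 && ids.every((id) => HEX40.test(id)));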
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Wave 2 / Step 4 of v5.1.0 plain-language UX humanizer rollout. Re-anchors
34 title-string assertions across 4 test files so they survive Wave 3's
title/description/recommendation rewriting at the CLI layer.
Anchoring strategy per scanner:
- GAP findings: scanner + category + recommendation substring (humanizer
preserves stable identifiers like CLAUDE.md, .mcp.json, hook in rec).
Hardcoded CA-GAP-NNN IDs for positive checks.
- HKV findings: scanner + evidence regex (evidence preserved verbatim).
- SET findings: scanner + evidence regex (evidence preserved verbatim).
- PLH findings: scanner + hardcoded CA-PLH-NNN IDs (most PLH findings carry
  no evidence, so the ID is the only stable anchor for specific cases;
  negative checks use scanner + a title substring that matches both raw and
  humanized output).
Per the docs/v5.1.0-test-audit.md classification, only (b) WILL BREAK
assertions are modified. (a) shape-only assertions (error-message formatting,
pure existence checks) are untouched. tests/lib/output.test.mjs,
tests/lib/diff-engine.test.mjs, and tests/scanners/fix-engine.test.mjs are
unchanged (synthetic test inputs, not scanner output).
Test count unchanged: 689/689 pass. IDs harvested via deterministic
runtime dump per fixture (resetCounter + scan).
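A minimal sketch of the anchoring styles above (finding values illustrative):

    import assert from 'node:assert';

    function assertAnchors(findings) {
      // HKV/SET style: evidence survives humanization verbatim.
      assert.ok(findings.some((f) => f.scanner === 'HKV' && /PreToolUse/.test(f.evidence ?? '')));
      // GAP style: a stable identifier inside the preserved recommendation.
      assert.ok(findings.some((f) => f.scanner === 'GAP' && (f.recommendation ?? '').includes('CLAUDE.md')));
      // PLH style: hardcoded ID, the only stable anchor without evidence (ID illustrative).
      assert.ok(findings.some((f) => f.id === 'CA-PLH-003'));
    }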
3 integration tests using the run-A/run-B fixtures:
- Jaccard(A, B) ≥ 0.70 (SC4 brief threshold)
- IDs match 40-char hex shape (lib/parsers/finding-id.mjs format)
- no duplicate IDs within a single run
Tests the Jaccard PIPELINE; real-LLM determinism deferred to v1.1.
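A minimal Jaccard over finding-ID sets, matching the SC4 check:

    function jaccard(idsA, idsB) {
      const A = new Set(idsA);
      const B = new Set(idsB);
      const inter = [...A].filter((id) => B.has(id)).length;
      const union = new Set([...A, ...B]).size;
      return union === 0 ? 1 : inter / union;
    }
    // run-A/run-B pair: |A ∩ B| = 5, |A ∪ B| = 6 -> 5/6 ≈ 0.833 ≥ 0.70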
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
review-run-A.md (5 findings) and review-run-B.md (6 findings, A ⊂ B) form a
known-Jaccard fixture pair: |A ∩ B| = 5, |A ∪ B| = 6, Jaccard = 5/6 = 0.833,
above the SC4 threshold of 0.70. IDs are real 40-char SHA1s computed via
lib/parsers/finding-id.mjs from realistic (file, line, rule_key) triplets.
Both fixtures pass review-validator --strict (frontmatter + body sections +
findings shape). Real-LLM determinism measurement deferred to v1.1.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Modify "all four pipeline commands" → "all five" (adds /ultrareview-local).
Add 3 new pins: Handover 6 section in HANDOVER-CONTRACTS.md,
review-validator CLI shim, rule-catalogue 12-key size invariant.
11/11 doc-consistency tests pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Closes the iteration loop: review.md → plan via source_findings audit trail.
Adds versioning row, validator-map entry, full Handover 6 section, and
stability summary row mirroring the shape of Handovers 1-5.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Wave 1 / Step 3 of v5.1.0 plain-language UX humanizer.
scanners/lib/humanizer.mjs exports three pure functions:
- humanizeFinding(f) -> new finding object with translated
title/description/recommendation + three new fields
(userImpactCategory, userActionLanguage, relevanceContext).
- humanizeFindings(findings) -> mapped array.
- humanizeEnvelope(env) -> walks env.scanners[].findings.
Plus computeRelevanceContext(filePath) as a named export for
unit testing.
Field semantics:
- userImpactCategory: from scanner prefix per research/02 line 124
(Configuration mistake / Conflict / Wasted tokens / Dead config /
Missed opportunity / Other).
- userActionLanguage: from severity per research/02 line 134
(Fix this now / Fix soon / Fix when convenient / Optional cleanup
/ FYI).
- relevanceContext: deterministic file-path heuristic — looks for
/tests/fixtures/ or /test/fixtures/ substring (test-fixture-no-impact),
*.local.* basename (affects-this-machine-only), defaults to
affects-everyone. No subprocess, no network.
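A sketch of the heuristic, assuming the return values match the kebab-case
labels above:

    import { basename } from 'node:path';

    export function computeRelevanceContext(filePath) {
      if (/\/tests?\/fixtures\//.test(filePath)) return 'test-fixture-no-impact';
      if (/\.local\./.test(basename(filePath))) return 'affects-this-machine-only';
      return 'affects-everyone';
    }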
Lookup order per scanner: static[title] -> patterns regex match ->
_default -> fall through to original strings (when scanner prefix
absent).
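A sketch of that chain (TRANSLATIONS shape per Step 2's humanizer-data.mjs):

    function lookupTranslation(finding, TRANSLATIONS) {
      const table = TRANSLATIONS[finding.scanner];
      if (!table) return null;               // unknown prefix: caller keeps original strings
      if (table.static?.[finding.title]) return table.static[finding.title];
      const hit = (table.patterns ?? []).find((p) => p.regex.test(finding.title));
      return hit?.translation ?? table._default ?? null;
    }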
Original id, scanner, severity, file, line, evidence, category,
autoFixable, and optional details are preserved exactly. Pure —
verified by deepEqual of input before/after.
Test (32 cases): purity, field preservation across all paths,
known/unknown scanner handling, all 5 severities, all 6 categories,
relevance heuristic for 4 path types, envelope walking, ANSI-free
guarantee. All pass.
Regression: 689/689 tests (657 + 32 new = 54 new across Wave 1).
Project: .claude/projects/2026-05-01-config-audit-ux-redesign/
Wave 1 / Step 2 of v5.1.0 plain-language UX humanizer.
scanners/lib/humanizer-data.mjs exports TRANSLATIONS keyed by
scanner prefix (CML, SET, HKV, RUL, MCP, IMP, CNF, GAP, TOK, CPS,
DIS, COL, PLH). Each scanner has:
- static: exact-title -> {title, description, recommendation}
- patterns: array of {regex, translation} for template-literal titles
- _default: graceful fallback for unknown findings
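An illustrative entry (scanner title and prose invented for shape only):

    export const TRANSLATIONS = {
      GAP: {
        static: {
          'No CLAUDE.md found': {            // exact scanner title (invented here)
            title: 'Your project has no `CLAUDE.md`',
            description: 'Claude reads `CLAUDE.md` for project context on every run.',
            recommendation: 'Create a `CLAUDE.md` at the project root.',
          },
        },
        patterns: [/* {regex, translation} for template-literal titles */],
        _default: {
          title: 'Configuration gap',
          description: 'A recommended configuration piece is missing.',
          recommendation: 'Review the evidence and add the missing piece.',
        },
      },
      // ...CML, SET, HKV, RUL, MCP, IMP, CNF, TOK, CPS, DIS, COL, PLH
    };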
Architectural change vs. plan: translations are keyed by exact scanner
title (not finding ID). Reason: finding IDs are sequence-based (global
counter in lib/output.mjs:34), not stable per finding type; two runs can
produce different IDs for the same logical issue. Title strings ARE stable
(defined as string literals or template patterns in the scanner source).
Translations follow research/03 SR-1..SR-17:
- active voice, second person, present tense
- sentences <= 25 words
- tier1 absolute prohibitions and tier3 domain jargon are kept out
of prose
- tier1/tier3 terms are permitted inside `backtick spans` (code
references like filenames and field names) — established
technical-doc convention
Test (12 cases): all 13 scanners covered; every static and pattern
entry has the 3 required fields; tier1 and tier3 forbidden-word
checks pass (with backtick-span exclusion); reference-stable
imports. All pass.
Regression: 657/657 tests (645 + 12 new).
Project: .claude/projects/2026-05-01-config-audit-ux-redesign/
Wave 1 / Step 1 of v5.1.0 plain-language UX humanizer.
tests/lint-forbidden-words.json defines the SC-3 forbidden-words
vocabulary used by the lint runner (Wave 4 / Step 8) and the
humanizer-data translation guard (Wave 1 / Step 2).
- Tier 1: 19 absolute prohibitions (failure if matched in default
output) — sourced from Microsoft Writing Style Guide, Federal
Plain Language, GOV.UK, Google Developer Style, Apple HIG.
- Tier 2: 24 strong-avoidance terms (warning if matched) — same
sources plus Mailchimp.
- Tier 3: 12 domain-specific jargon terms (failure if matched in
default output, allowed in --raw and --json paths) — sourced
from research/03 jargon table.
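A sketch of the consuming guard (top-level file shape assumed: one string
array per tier):

    import { readFile } from 'node:fs/promises';

    const vocab = JSON.parse(
      await readFile('tests/lint-forbidden-words.json', 'utf8'),
    );
    console.assert(vocab.tier1.length === 19);  // absolute prohibitions
    console.assert(vocab.tier2.length === 24);  // strong avoidance
    console.assert(vocab.tier3.length === 12);  // domain jargon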
Counts diverge from plan.md (18/21/11): the JSON tracks the brief's
verbatim lists at research/03 lines 200-202 plus the tier3 hook entry from
the brief's table. The plan revision is noted in the audit doc.
Test: 10 cases verifying parse, count, schema completeness, spot
checks per tier, no cross-tier duplicates. All pass.
Regression: 645/645 tests (635 + 10 new).
Project: .claude/projects/2026-05-01-config-audit-ux-redesign/
Wave 0 / Step 0 of the v5.1.0 plain-language UX humanizer plan.
Captures v5.0.0 baseline output for all 8 CLIs at
tests/snapshots/v5.0.0/ — these snapshots are immutable references
for SC-6 (--json byte-equal) and SC-7 (--raw byte-equal) tests in
later waves.
- 5 CLIs captured via --output-file: scan-orchestrator, posture,
token-hotspots-cli, manifest, whats-active
- 3 CLIs captured via stdout redirect (no --output-file support):
drift-cli (after baseline seed), fix-cli, plugin-health-scanner
- Posture stderr scorecard captured separately for SC-7 stderr-mode
comparison
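A sketch of the SC-6 check these snapshots back (CLI path and snapshot
filename assumed):

    import { readFile } from 'node:fs/promises';
    import { execFile } from 'node:child_process';
    import { promisify } from 'node:util';
    import assert from 'node:assert';

    const run = promisify(execFile);
    const expected = await readFile('tests/snapshots/v5.0.0/posture.json', 'utf8');
    const { stdout } = await run('node', ['scanners/posture.mjs', '--json']); // path assumed
    assert.strictEqual(stdout, expected);   // byte-for-byte, not deep-equal (SC-6)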
docs/v5.1.0-test-audit.md classifies all 42 .title references in
7 known test files: 34 will break under humanization (literal
string equality / substring), 8 are safe (test fixtures or error
formatting). This document is the change list for Step 4.
Project: .claude/projects/2026-05-01-config-audit-ux-redesign/
These were committed in b37b938 by mistake — KTG's convention is that
planning docs in plugins/ultraplan-local/docs/ are local working files
and never pushed to the public marketplace.
- git rm --cached on both files (kept on disk, just untracked)
- .gitignore extended with explicit entries for the two filenames
Existing tracked docs in plugins/ultraplan-local/docs/ predate this rule
and are left alone (separate decision).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds two sibling files in plugins/ultraplan-local/docs/ that together
specify a new /ultracontinue command for zero-friction multi-session
resumption — drafted from design dialogue at the end of the config-audit
v5.0.0 release session (5 sessions, ~10 manual NEXT-SESSION-PROMPT context
handovers, the friction this work removes).
ultracontinue-brief.md (159 lines):
- Follows the /ultrabrief-local template (frontmatter brief_version: 2.0)
so /ultraplan-local can consume it directly
- Defines the per-project state-file convention .claude/projects/<project>/
  .session-state.local.json as the contract; /ultracontinue is read-only,
  while multiple writers may update the file
- 10 falsifiable success criteria including cross-project consistency,
no-new-deps, validator + helper command, docs sweep across plugin
README + CLAUDE.md + marketplace root README
- 3 research topics: ultraexecute end-of-session integration depth,
graceful-handoff alignment (no hard dep), Claude Code slash-command
conventions for read+execute commands
- Explicit non-goals: not replacing /ultraexecute-local --resume, not
replacing graceful-handoff, not auto-orchestrating N sessions
- Open questions and assumptions flagged for plan-critic / scope-guardian
ultracontinue-design-notes.md (117 lines):
- Captures the dialogue rationale that shaped the brief, so the
implementing session has full context without needing to read this
conversation's transcript
- Origin (config-audit v5 release pain point), key design insight
  ("the state file IS the contract, not the tool"), 6 design decisions with
  alternatives considered, anti-patterns from KTG auto-memory to respect,
  recommended reading order, expected scope (1-2 execution sessions)
No code changes. Brief is ready for /ultraplan-local --brief
plugins/ultraplan-local/docs/ultracontinue-brief.md (light path) or
/ultraresearch-local for full research path.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
v5.0.0 SHIPPED 2026-05-01. Tag config-audit/v5.0.0 pushed to Forgejo.
SC-6b release-gate PASS at -0.85% delta (CLAUDE.md actual 589 vs
estimated 594, well within ±5% gate).
Per-step:
- Step 28: README/CLAUDE.md straggler-sweep + self-audit counter alignment
- Step 29: version bump 4.0.0 → 5.0.0 + consolidated CHANGELOG
- Step 30: full audit + live SC-6b gate + tag (incl. one in-step bug fix
for hotspot.path exposure, required to make calibration measurable)
635 tests still green throughout. No blockers carried forward.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The v5.0.0-rc.1 N5 implementation looked up hotspot.path in
calibrateAgainstApi() but token-hotspots.mjs only emitted hotspot.source —
calibration silently produced 0 actual_tokens because every iteration hit
the `if (!hotspot?.path) continue` guard.
Fix: file-backed hotspots now expose `path: h.absPath` in the JSON output.
MCP-server hotspots intentionally leave path unset — their tokens are
runtime tool-schema (formula-based: 500 + toolCount × 200), not file
content readable by count_tokens.
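A sketch of the emit-side change (the internal hotspot shape `h` is assumed):

    function toHotspotJson(h) {
      return {
        source: h.source,
        // The fix: file-backed hotspots expose an absolute path so
        // calibrateAgainstApi() can read the file for count_tokens.
        ...(h.absPath ? { path: h.absPath } : {}),   // MCP servers: path stays unset
        tokens: h.isMcpServer
          ? 500 + h.toolCount * 200                  // runtime tool-schema formula
          : h.estimatedTokens,
      };
    }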
SC-6b release-gate verified against tests/fixtures/marketplace-large:
- Actual (count_tokens, claude-opus-4-7): 589 tokens for CLAUDE.md
- Estimated (4-bytes/token byte heuristic): 594 tokens
- Delta: -5 tokens / -0.85% — well within ±5% gate. PASS.
CHANGELOG: documented the fix + SC-6b result inline under [5.0.0].
All 635 tests still green. No estimateTokens tuning required for v5.0.0.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The Stop hook fallback assumed a 200K context window. On Opus 4.7
(actually 1M), auto-handoff could fire 5-7x too early: an estimated 70%
usage when real usage was ~14%. Replaces the simple fallback with a 4-step
resolution chain:
1. payload.context_window.used_percentage (authoritative)
2. payload.context_window.context_window_size + transcript estimate
3. MODEL_WINDOWS[payload.model.id] + estimate
4. FALLBACK_WINDOW=1_000_000 + estimate (2026 default)
additionalContext messages now include [kilde: <source>] for transparency.
The brief is kept as a source artifact in docs/brief-context-window-detection.md.
6 new tests (57 total). No regressions.
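A sketch of the chain (payload field names from this message; MODEL_WINDOWS
contents omitted):

    const FALLBACK_WINDOW = 1_000_000;             // 2026 default
    const MODEL_WINDOWS = { /* model id -> window size */ };

    function resolveUsage(payload, transcriptEstimate) {
      const cw = payload.context_window ?? {};
      if (cw.used_percentage != null) {
        return { pct: cw.used_percentage, source: 'used_percentage' }; // step 1
      }
      const window =
        cw.context_window_size                     // step 2
        ?? MODEL_WINDOWS[payload.model?.id]        // step 3
        ?? FALLBACK_WINDOW;                        // step 4
      return { pct: (transcriptEstimate / window) * 100, source: `window=${window}` };
    }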
beta.1 wrap entry covering N1-N4 + N6 (Steps 18-22b). Includes an explicit
Known breaking changes section on CA-TOK-* glob suppression matching
CA-TOK-005, and notes that plugin-vs-built-in collision detection is
deferred to v5.0.1.
Tests: 586 → 625 (+39).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
New COL scanner detects skill-name collisions across plugins and
between user-level skills (~/.claude/skills/) and plugin-bundled
skills. Skill identity is the directory basename — matches how
enumerateSkills resolves names.
Detection rules (per docs/v5-namespace-research.md, confidence: medium):
- Plugin-vs-plugin same skill name → severity low (CA-COL-001)
- User-vs-plugin same skill name → severity medium (CA-COL-001)
- Plugin-vs-built-in collisions: out of scope for v5.0.0 (insufficient
verification — recorded for v5.0.1 follow-up).
Findings carry details.namespaces array with {source, name, path} for
every conflicting source — supports per-collision reporting downstream.
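An illustrative finding (paths and names invented; field names from this
message):

    const colFinding = {
      id: 'CA-COL-001',
      scanner: 'COL',
      severity: 'medium',                     // user-vs-plugin collision
      title: 'Skill name collision: review',  // wording illustrative
      details: {
        namespaces: [
          { source: 'user',     name: 'review', path: '~/.claude/skills/review' },
          { source: 'plugin-a', name: 'review', path: 'plugins/plugin-a/skills/review' },
          { source: 'plugin-b', name: 'review', path: 'plugins/plugin-b/skills/review' },
        ],
      },
    };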
output.mjs: finding() helper now passes through optional `details`
field (scanner-specific structured payload).
scoring.mjs: COL → "Plugin Hygiene" (new area, 10 total). Posture test
updated from 9 → 10 area scores.
.gitignore: docs/v5-namespace-research.md is local-only (Step 22a
research output, gitignored per plan).
Fixture collision-plugins/fake-home/ has user skill `review` colliding
with plugin-a + plugin-b's `review` (medium severity), plus plugin-c's
unique `summarize` (no collision).
[skip-docs] reason: v5 plan fences off README/CLAUDE.md badge updates
to Session 5; Forgejo pre-commit-docs-gate hook requires this tag.
Tests: 617 → 625 (+8).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>