v4.1.0 → v4.2.0. Two-file change per Step 14 manifest (package.json +
.claude-plugin/plugin.json). Description tagline expanded from
"brief, research, plan, execute, review, continue" to include "revise"
and "+ first marketplace playground".
Out-of-scope under Step 14 forbidden_paths (left at 4.1.0 intentionally):
- lib/exporters/otlp-format.mjs (VOYAGE_SCOPE_VERSION constant)
- hooks/scripts/otel-export.mjs (User-Agent header)
These constants are touched on the next bump where the constants directory
is in scope; keeping them stale for one release is acceptable since
otel/otlp telemetry is opt-in and the version field is informational.
Verification:
- node -e "import('./package.json',{with:{type:'json'}}).then(m=>console.log(m.default.version))" → 4.2.0
- jq -r .version .claude-plugin/plugin.json → 4.2.0
- npm test: 610 pass / 0 fail / 2 skipped (Docker)
- SC11 pipeline-self-eat gate: render-artifact.mjs renders own brief.md + plan.md to non-empty HTML
Pure module computing deterministic 16-char SHA-256 prefix for annotation set.
Canonicalization: sort by id, fixed field order (id|target_artifact|target_anchor|intent|comment|timestamp), \n-join, sha256, take first 16 hex.
Brief SC4 specifies sha256-prefix; research-05 said sha1 — brief wins per Hard Rule "Brief-driven".
6 tests pass: empty digest, order-independence, intent-sensitivity, format invariant, golden value, undefined-vs-empty equivalence.
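A minimal sketch of the canonicalization described above, not the
shipped module. The function name, the '|' separator between fields, and
the plain string comparison used for sorting are assumptions; only the
field order, the \n-join, and the 16-hex SHA-256 prefix come from the
notes above.

```javascript
// Sketch only: name, '|' field separator, and comparator are assumptions.
const FIELD_ORDER = ['id', 'target_artifact', 'target_anchor', 'intent', 'comment', 'timestamp'];

async function annotationDigest(annotations) {
  // Lazy import so the snippet runs as either an ES module or a CommonJS script.
  const { createHash } = await import('node:crypto');
  const rows = [...(annotations ?? [])]
    // Plain code-unit comparison keeps the ordering locale-independent.
    .sort((a, b) => {
      const x = String(a.id);
      const y = String(b.id);
      return x < y ? -1 : x > y ? 1 : 0;
    })
    // Missing fields serialize as '' so undefined and empty are equivalent.
    .map((a) => FIELD_ORDER.map((field) => a[field] ?? '').join('|'));
  // Join rows with \n, hash, and keep the first 16 hex characters.
  return createHash('sha256').update(rows.join('\n'), 'utf8').digest('hex').slice(0, 16);
}
```

Sorting a copy before hashing is what makes the digest independent of
input order; the fixed field order keeps it independent of object key
order.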
3 new test files, 24 cases (8 per validator):
- baseline (no annotation fields) still valid
- revision: 0 / revision: 5 accepted
- source_annotations list-of-dict accepted
- annotation_digest string accepted
- revision_reason accepted
- all 4 fields together accepted
- unrecognized future field tolerated (forward-compat policy)
Pin against future strict-key refactors. No production code change — pure regression pin.
Found by a simulated v4.1 smoke test: doc/code drift in the v4.1 release.
docs/observability.md claims "Cloud metadata endpoints (169.254.169.254)
are permanently blocked" but the validator allowed them when
VOYAGE_OTEL_ALLOW_PRIVATE=1. Cloud metadata services expose IAM
credentials and instance secrets — operator-trust extended to
RFC-1918 home-lab access does NOT extend here, because the
blast-radius (cloud-account compromise) is qualitatively different.
New HARD_BLOCKED_HOSTS set checked BEFORE the link-local opt-in path:
- 169.254.169.254 (AWS / GCP / Azure metadata)
- 100.100.100.200 (AliCloud metadata)
- metadata.google.internal
- metadata.azure.com
New error code ENDPOINT_HARD_BLOCKED. Existing test for
ENDPOINT_LINK_LOCAL_REJECTED on 169.254.169.254 updated to assert
the new code; 3 new tests verify the hard-block holds even with
VOYAGE_OTEL_ALLOW_PRIVATE=1, plus AliCloud + GCP-hostname coverage.
Tests: 487 → 490 pass + 2 skipped.
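The ordering matters more than the list itself, so here is a hedged
sketch of it. checkEndpoint and its return shape are hypothetical, and
the link-local test is reduced to a prefix match; the host list, the two
error codes, and the VOYAGE_OTEL_ALLOW_PRIVATE opt-in are from the notes
above.

```javascript
// Sketch only: checkEndpoint and its return shape are hypothetical.
const HARD_BLOCKED_HOSTS = new Set([
  '169.254.169.254',
  '100.100.100.200',
  'metadata.google.internal',
  'metadata.azure.com',
]);

function checkEndpoint(endpoint, env = {}) {
  const host = new URL(endpoint).hostname.toLowerCase();
  // Hard block runs first, so no opt-in below can re-enable these hosts.
  if (HARD_BLOCKED_HOSTS.has(host)) {
    return { ok: false, code: 'ENDPOINT_HARD_BLOCKED' };
  }
  const allowPrivate = env.VOYAGE_OTEL_ALLOW_PRIVATE === '1';
  // Simplified link-local check (169.254.0.0/16) behind the operator opt-in.
  if (host.startsWith('169.254.') && !allowPrivate) {
    return { ok: false, code: 'ENDPOINT_LINK_LOCAL_REJECTED' };
  }
  return { ok: true };
}
```

With this ordering, checkEndpoint('http://169.254.169.254/', {
VOYAGE_OTEL_ALLOW_PRIVATE: '1' }) still reports ENDPOINT_HARD_BLOCKED,
while other link-local addresses remain subject to the opt-in.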
Step 21 of v4.1 — extend-in-place per Plan-critic Blocker 2 split:
commands-only assertions land here; CLAUDE.md / README.md pinning is
deferred to Step 22 (post-write).
Changes:
1. CLAUDE.md command coverage loop now spans all SIX pipeline commands
(added /trekcontinue — was 5 of 6 pre-v4.1 per HIGH risk-assessor).
2. New: every pipeline command-file (trekbrief/research/plan/execute/
review/continue.md) must document the --profile flag.
3. New: forbidden-alias check — no command-file may use the legacy
names model_per_phase / phase_to_model / profile_phase_models.
Canonical name is "phase_models" (locked in brief).
4. New: at least one command-file must mention "phase_models" by name
so the regression detects total removal of the canonical-name
reference.
Tests: 482 pass + 2 skipped (Docker not installed).
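The three new assertions could be expressed roughly as below. This is an
illustrative sketch: auditCommandDocs and its input shape (file name
mapped to file text) are hypothetical, and the real test may be
structured differently. The forbidden names and the canonical name are
taken from the list above.

```javascript
// Sketch only: auditCommandDocs and its input shape are hypothetical.
const FORBIDDEN_ALIASES = ['model_per_phase', 'phase_to_model', 'profile_phase_models'];

function auditCommandDocs(docs) {
  const problems = [];
  for (const [name, text] of Object.entries(docs)) {
    if (!text.includes('--profile')) {
      problems.push(`${name}: --profile flag not documented`);
    }
    for (const alias of FORBIDDEN_ALIASES) {
      if (text.includes(alias)) problems.push(`${name}: forbidden alias ${alias}`);
    }
  }
  // \b keeps "profile_phase_models" from counting as a canonical mention,
  // because "_" is a word character and no boundary precedes "phase".
  const mentionsCanonical = Object.values(docs).some((text) => /\bphase_models\b/.test(text));
  if (!mentionsCanonical) problems.push('no command file mentions phase_models');
  return problems;
}
```

The word-boundary match is the one subtle point: a plain substring
search for "phase_models" would be satisfied by the forbidden alias
profile_phase_models.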
Step 20 of v4.1: implements drift detection in plan-validator.mjs per
brief Assumptions block 7 (translated): "A mismatch (e.g. a corrupt manual
edit) emits a MANIFEST_PROFILE_DRIFT warning from plan-validator in
--strict mode."
Logic (after validateAllManifests in validatePlanContent):
1. Strict-mode only — soft mode never emits drift warnings.
2. Plan frontmatter must declare 'profile: <name>' to establish baseline.
3. For each step manifest, if profile_used is set AND differs from plan
profile, emit warning (NOT error) with code MANIFEST_PROFILE_DRIFT
and location 'step N: profile_used = X, plan profile = Y'.
Forward-compat preserved: drift is a warning, plan remains valid:true.
Operators see the drift in --strict mode without parsing breaking.
New files:
tests/validators/plan-validator-profile-drift.test.mjs — 4 tests
tests/fixtures/plan-profile-drift.md — drift fixture
Tests verify:
1. drift detected in strict mode → MANIFEST_PROFILE_DRIFT in warnings
2. drift NOT detected in soft mode → strict gate honored
3. matching profile → no drift warning
4. no plan-level profile → drift detection silent (no baseline)
Tests: 479 pass + 2 skipped (Docker not installed).
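As a rough illustration of the three-step logic above, assuming the
frontmatter and manifests are already parsed into plain objects.
detectProfileDrift and its parameter shapes are hypothetical; the
warning code and location format are quoted from this commit.

```javascript
// Sketch only: function name and parameter shapes are hypothetical.
function detectProfileDrift(planFrontmatter, manifests, { strict = false } = {}) {
  // 1. Soft mode never emits drift warnings.
  if (!strict) return [];
  // 2. Without a plan-level profile there is no baseline to compare against.
  const planProfile = planFrontmatter.profile;
  if (typeof planProfile !== 'string' || planProfile === '') return [];
  // 3. Warn (never fail) for each step whose profile_used differs.
  const warnings = [];
  manifests.forEach((manifest, index) => {
    const used = manifest.profile_used;
    if (used !== undefined && used !== planProfile) {
      warnings.push({
        code: 'MANIFEST_PROFILE_DRIFT',
        location: `step ${index + 1}: profile_used = ${used}, plan profile = ${planProfile}`,
      });
    }
  });
  return warnings;
}
```

Returning warnings instead of throwing is what keeps the plan at
valid:true while still surfacing the mismatch to operators.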
Step 19 of v4.1 — extend-in-place per brief Preferences. Three new test
blocks asserting forward-compat:
1. Legacy fixtures (plan-run-A.md, plan-run-B.md) — without profile_used
in frontmatter — still parse cleanly after manifest-yaml.mjs added
OPTIONAL_STRING_KEYS.
2. New fixtures (profile-plan-run-{economy,premium}-*.md) — with
profile_used in frontmatter — parse cleanly with correct profile
value extracted.
3. Real v4.1 plan (.claude/projects/2026-05-08-voyage-v4.1-modellprofiler/plan.md)
validates strict, emits no PLAN_VERSION_MISMATCH warning.
Tests: 475 pass + 2 skipped (Docker not installed).
Step 17 of v4.1: escalate-handler invoked. Live LLM budget ($60-120 for
4 plan runs of /trekplan --profile {economy,premium} on
examples/01-add-verbose-flag/brief.md) was not authorized for the
v4.1-execute-4b session.
Per Step 17 escalate-fallback (and NEXT-SESSION-PROMPT.local.md
fallback-strategy): document economy-Plan as parked, use balanced as
low-threshold profile, defer empirical calibration to v4.2.
Files:
tests/synthetic/profile-plan-run-economy-1.md — 30 steps, parked-synthetic
tests/synthetic/profile-plan-run-economy-2.md — 30 steps, parked-synthetic
tests/synthetic/profile-plan-run-premium-1.md — 40 steps, parked-synthetic
tests/synthetic/profile-plan-run-premium-2.md — 40 steps, parked-synthetic
tests/synthetic/profile-jaccard-calibration.md — threshold 0.55 pinned per
research/02 conservative starting value
Replacement procedure documented in calibration.md "How to replace"
section. Trigger conditions for empirical re-run:
1. Cross-tier smoke-test (Step 18) flips red on a real run
2. v4.2 LLM-budget approval
3. New profile tier added
Step 16 of v4.1 — first test in tests/integration/, establishes the
skip-on-missing-tool pattern voyage will reuse for environment-dependent
integration tests. Two tests:
1. compose config parses and contains expected services
2. compose config pins required image versions
Both skip cleanly when 'docker info' fails (no Docker installed). On a
machine with Docker, both tests run docker compose config and assert the
4 services + 3 version pins are present.
Tests: 468 pass + 2 skipped (Docker not installed in dev env).
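One way to express the skip-on-missing-tool pattern, as a sketch rather
than the actual test file. toolAvailable and skipUnless are hypothetical
helper names, and the 10 s timeout is an assumption; the 'docker info'
probe is the one named above. The skip option is the standard one from
Node's built-in test runner.

```javascript
// Sketch only: helper names and the 10 s timeout are assumptions.
async function toolAvailable(command, args = []) {
  // Lazy import so the snippet runs as either an ES module or a CommonJS script.
  const { spawnSync } = await import('node:child_process');
  const result = spawnSync(command, args, { stdio: 'ignore', timeout: 10000 });
  // A missing binary sets result.error (ENOENT); a failing probe exits non-zero.
  return !result.error && result.status === 0;
}

// Builds the options object accepted by node:test's test(name, options, fn).
function skipUnless(available, reason) {
  return { skip: available ? false : reason };
}

// Usage inside a node:test file (not executed here):
//   const hasDocker = await toolAvailable('docker', ['info']);
//   test('compose config parses', skipUnless(hasDocker, 'docker not available'), () => {
//     /* run `docker compose config` and assert on services */
//   });
```

Passing a string as the skip reason makes the runner report why the
test was skipped, so a skipped run stays distinguishable from a pass.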
Step 14 follow-up — VOYAGE_OTEL_ENDPOINT (not VOYAGE_OTLP_ENDPOINT) per
hooks/scripts/otel-export.mjs and lib/exporters/endpoint-validator.mjs.
Adds VOYAGE_OTEL_ALLOW_PRIVATE=1 for localhost since 127.0.0.1 is
loopback and rejected by default.
Step 13 of v4.1 — adds Stop hook entry pointing to
hooks/scripts/otel-export.mjs (added in Step 12 / commit c5fb745).
Mounts the orchestrator on Claude Code's Stop event so OTel/Prometheus
export runs at session-end when VOYAGE_EXPORT_MODE is set.
HIGH-risk mitigation: tests/hooks/hooks-json-stop-wired.test.mjs
asserts that the Stop key exists, references otel-export.mjs, uses
${CLAUDE_PLUGIN_ROOT} substitution, and has type:command.
Tests: 464 → 468 (4 new). All green.
Step 4 of v4.1-execute (Wave 2, Session 2).
Three built-in model profiles match the brief's profile-assignment matrix:
- economy: all 6 phase_models = sonnet, parallel 2-3, external_research=false,
iter-cap=1. ~$1-3 per pipeline session.
- balanced: brief/research/execute/continue=sonnet, plan=opus, review=opus,
parallel 4-6, external_research=false (operator override deferred
to v4.2 per NEXT-SESSION-PROMPT scope limits), iter-cap=2.
~$5-15 per pipeline session.
- premium: all 6 phase_models = opus, parallel 6-8, external_research=true,
iter-cap=3. ~$20-60 per pipeline session (default, same as v4.0).
Uses list-of-dicts for phase_models (parser-compatible with
lib/util/frontmatter.mjs:79-105). Verified: all 3 files parse without errors
and return an array of 6 entries (phase + model per entry).
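For illustration, the list-of-dicts shape can be derived from the matrix
above as follows. The phaseModels helper and the OPUS_PHASES table are
hypothetical; the shipped profiles are static files, not generated code.

```javascript
// Sketch only: helper and table names are hypothetical.
const PHASES = ['brief', 'research', 'plan', 'execute', 'review', 'continue'];

// Phases that run on opus in each built-in profile; all others use sonnet.
const OPUS_PHASES = {
  economy: [],
  balanced: ['plan', 'review'],
  premium: PHASES,
};

function phaseModels(profile) {
  const opus = OPUS_PHASES[profile];
  if (!opus) throw new Error(`unknown profile: ${profile}`);
  // One { phase, model } entry per phase, in pipeline order.
  return PHASES.map((phase) => ({ phase, model: opus.includes(phase) ? 'opus' : 'sonnet' }));
}
```

Each call returns six entries, one per pipeline phase, which is the
array shape the verification step checks for.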
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Step 3 of v4.1-execute (Wave 1, Session 1).
Add a new exported const OPTIONAL_STRING_KEYS = ['profile_used'] alongside
the existing OPTIONAL_KEYS. Extend parseManifest with a new dispatch loop
after OPTIONAL_BOOLEAN_KEYS. Returns MANIFEST_OPTIONAL_TYPE if
profile_used is present but is not a string.
Difference from OPTIONAL_BOOLEAN_KEYS: absence == not-present (NOT defaulted
to false, unlike boolean). Downstream consumers can therefore distinguish
between unset and empty string.
Tests (5 new, baseline 372 → 377):
- OPTIONAL_STRING_KEYS export drift-pin
- profile_used: economy parses successfully (SC #10 forward-compat)
- profile_used: numeric rejected
- absence: field NOT in parsed (string-key semantics)
- profile_used + skip_commit_check + memory_write co-existence
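The boolean-versus-string semantics can be sketched like this. It is a
simplified stand-in for the parseManifest dispatch loops, not the real
code: the function name, the return shape, and the assumption that
skip_commit_check and memory_write are the boolean keys are all
illustrative.

```javascript
// Sketch only: which keys are boolean, and the return shape, are assumptions.
const OPTIONAL_BOOLEAN_KEYS = ['skip_commit_check', 'memory_write'];
const OPTIONAL_STRING_KEYS = ['profile_used'];

function parseOptionalKeys(raw) {
  const parsed = {};
  // Boolean keys: absence is defaulted to false.
  for (const key of OPTIONAL_BOOLEAN_KEYS) {
    parsed[key] = raw[key] === true;
  }
  // String keys: absence leaves the field out of the result entirely.
  for (const key of OPTIONAL_STRING_KEYS) {
    if (!(key in raw)) continue;
    if (typeof raw[key] !== 'string') {
      return { ok: false, code: 'MANIFEST_OPTIONAL_TYPE', key };
    }
    parsed[key] = raw[key];
  }
  return { ok: true, parsed };
}
```

Because an absent string key is never written, 'profile_used' in parsed
tells a consumer whether the manifest set it at all, even to ''.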
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Step 2 of v4.1-execute (Wave 1, Session 1).
Add --profile to the valued array for all 6 voyage commands (trekbrief,
trekresearch, trekplan, trekexecute, trekreview, trekcontinue). Pattern
identical to the existing --project/--brief valued handling. No change
to parseArgs logic; only the schema is extended.
Tests (11 new, baseline 361 → 372):
- 6 happy-path tests (one per command)
- ARG_MISSING_VALUE for --profile without a value
- --profile + --quick combination
- --profile + --gates edge case (--gates is parsed inline, not in FLAG_SCHEMA)
- --profile + --project combination
- trekcontinue --profile (validates that the previously empty valued[] is now extended)
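A schema-driven parser of this kind might look like the sketch below.
Only the FLAG_SCHEMA / valued naming and the ARG_MISSING_VALUE code come
from this commit; the schema contents shown, the boolean list, and the
return shape are assumptions, and the real parseArgs is untouched by the
change.

```javascript
// Sketch only: schema contents and return shape are assumptions.
const FLAG_SCHEMA = {
  trekplan: { valued: ['--project', '--brief', '--profile'], boolean: ['--quick'] },
  trekcontinue: { valued: ['--profile'], boolean: [] },
};

function parseArgs(command, argv) {
  const schema = FLAG_SCHEMA[command];
  if (!schema) throw new Error(`unknown command: ${command}`);
  const flags = {};
  const positional = [];
  for (let i = 0; i < argv.length; i += 1) {
    const token = argv[i];
    if (schema.valued.includes(token)) {
      const value = argv[i + 1];
      // A valued flag must be followed by something that is not another flag.
      if (value === undefined || value.startsWith('--')) {
        return { error: { code: 'ARG_MISSING_VALUE', flag: token } };
      }
      flags[token.slice(2)] = value;
      i += 1; // consume the value
    } else if (schema.boolean.includes(token)) {
      flags[token.slice(2)] = true;
    } else {
      positional.push(token);
    }
  }
  return { flags, positional };
}
```

Since the loop reads everything from the schema, adding '--profile' to a
command's valued list is enough to enable it, which is why the commit
needs no parser change.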
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The slash-command parser matches !`...` even inside ```bash markdown fences,
which caused the Phase 8 NEXT-SESSION-PROMPT template to execute at skill load
with the literal strings {project_dir}/{next_session_brief_path}/{next_session_label}/
{status} as argv. That produced ENOENT on .session-state.local.json.tmp
and blocked the entire /trekexecute skill load.
Remove the !`...` wrapper and mark the block explicitly as a runtime template.
The pattern now matches the convention used elsewhere in the same file
(lines 202-208), where ```bash is used for orchestrator instructions without
auto-execution.
Wave 0 of v4.1-execute: prerequisite for unlocking /trekexecute
skill invocation against .claude/projects/2026-05-08-voyage-v4.1-modellprofiler/
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- renderCost: FIX. KEY_STATS_CONFIG['cost-distribution'] and inferVerdict('cost-distribution') displayed "[object Object]" / always returned 'go' because the parser output has p50/p90 = {monthly, yearly} objects, not numbers. Both now extract .monthly, with a fallback for flat fixtures.
- renderLicense: PASS, no code change. Capability-matrix status is correctly derived (met/partial/missing) via parseCapabilityMatrix. Visual QA remains for session 6.
- renderCompare: FIX. The firstWord heuristic failed when both subjects shared a first word (e.g. "Azure AI Foundry" vs "Azure ML + AKS" both gave fw='azure', which collapsed win attribution). Replaced with distinctive-token matching: full-subject substring first, then words that are unique to one subject. Diff-cell coloring updated to use the same matchSubject() helper.
- renderUtredning: MINOR. Dropped the misleading role="tab"/role="tablist" since we render an anchor-jump TOC (all panels visible), not a real tab toggle. Kept aria-current="true" as the visual active marker (the DS CSS hooks onto it). Real tab toggle deferred to v1.15.0.
validate-plugin.sh: 219 PASS, unchanged
run-e2e.sh --playground: 272 PASS, unchanged
test-playground-migrations.sh: 7 PASS, unchanged
Refs V1.14.0-AUDIT.local.md sub-batch E (session 5b).
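The distinctive-token idea from the renderCompare fix can be sketched as
below. Only the matchSubject name and the two-stage order (full subject
first, then unique words) come from the note above; the signature, the
tokenizer, and the null result for ambiguous text are assumptions.

```javascript
// Sketch only: signature, tokenizer, and null-on-ambiguity are assumptions.
function tokenize(text) {
  // Split on anything that is not a letter or digit (Unicode-aware).
  return text.toLowerCase().split(/[^\p{L}\p{N}]+/u).filter(Boolean);
}

function matchSubject(text, subjects) {
  const haystack = text.toLowerCase();
  // Stage 1: the full subject name appears verbatim in the text.
  const full = subjects.filter((s) => haystack.includes(s.toLowerCase()));
  if (full.length === 1) return full[0];
  // Stage 2: count how many subjects use each word; a count of 1 is distinctive.
  const usage = new Map();
  for (const subject of subjects) {
    for (const word of new Set(tokenize(subject))) {
      usage.set(word, (usage.get(word) ?? 0) + 1);
    }
  }
  const words = new Set(tokenize(text));
  const hits = subjects.filter((s) =>
    tokenize(s).some((word) => usage.get(word) === 1 && words.has(word)));
  return hits.length === 1 ? hits[0] : null;
}
```

With the subjects "Azure AI Foundry" and "Azure ML + AKS", the shared
word "azure" has a usage count of 2 and is ignored, so it can no longer
attribute a win to both sides.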
- renderMigrate: <section class="phase-detail"> per phase replaced with a
<div class="expansion"> list (DS supplement). Collapsed by default, clickable
header (Fase N: name + duration), body = milestones + success criteria.
Keeps cycle-ribbon + mat-ladder + phases-summary table + risks table.
- renderPoc: mirrors renderMigrate. Traffic light moved into the expansion body
(ul.traffic-list per phase, with status from the phase's stepState).
- renderSummary: KEY_STATS_CONFIG['verdict'] patched. parseTable returns
rows with header-based keys (Metric/Verdi/Mal), not canonical
{label,value,unit}. New logic uses metrics_headers + heuristic matching for
the label/value/unit columns, with fallback to the canonical fields.
Backward compatible.
- renderAdr: verified PASS, no change (.adr-meta + critique cards
render cleanly without extra work).
- ACTIONS['phase-expand']: new handler registered as an alias for
requirement-expand (same toggle pattern, separate action name for later
divergence).
- Local CSS: the entire .phase-detail block (~10 lines) deleted. Defensive
comment condensed to a 5-line history note.
- Style block effective lines: 147 (was 178 after session 4).
Smoke tests:
- validate-plugin.sh: 219 PASS
- run-e2e.sh --playground: 272 PASS (202 static + 70 parser)
- test-playground-migrations.sh: 7 PASS
Refs V1.14.0-AUDIT.local.md sub-batch D.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Bug fixes (B-DS-1, B-DS-2, B-DS-3 from V1.14.0-AUDIT):
- .kanban-card__name (tier3-supplement): word-break: break-all → break-word
+ overflow-wrap: anywhere. It broke in the middle of words ("Tekn isk dokumen tasjon").
- .expansion__title-main, .expansion__title-sub (tier3-supplement): add
display: block. Both are <span> elements that flow inline by default;
result: "dokumentertKilde: Art. 9" on the same line.
- .matrix__bubble (components.css): add cursor: pointer, hover scale
and focus-visible. Assumed to be rendered as <button> in consumers; gives
visual + keyboard-focus feedback.
Re-synced to plugins/ms-ai-architect/playground/vendor/ via
sync-design-system.mjs. Deleted 3 local overrides in the playground HTML
(matrix-bubble, expansion-title, kanban-card-name). Style block:
191 → 182 lines.
Smoke tests: validate-plugin 219 PASS, e2e --playground 272 PASS,
static structure 202 PASS.
Other plugins (llm-security, voyage, okr, config-audit) are NOT affected;
they keep the old vendored DS until they re-sync themselves.
Session 2 of 6 in the v1.14.0 root-cause multi-session run.
ms-ai-architect plugin version not bumped (session 6 ships v1.14.0).
[skip-docs]: docs are updated in session 6 at the v1.14.0 plugin ship.
Refs V1.14.0-AUDIT.local.md sub-batch 1 + 4.
10 visual bugs identified by the maintainer in the browser after v1.13.0
shipped. Patch package addressing mismatches between the playground renderers
and DS conventions that v1.13.0 did not catch.
- B7: classify "Forpliktelser" indent: local .report-meta CSS reset
(DL grid max-content+1fr, h4 uppercase+bold, ul padding-left space-5)
for consistent left alignment regardless of nesting.
- B8a: requirement-expand handler missing: renderRequirements markup
had data-action="requirement-expand" on every expansion__head, but
no ACTIONS handler was registered. The R-01..R-09 rows in the AI Act
requirements were therefore not clickable. Fix: register
ACTIONS['requirement-expand'].
- B8b: expansion title-main + title-sub ran together: the DS spans were
inline. Local display:block so they stack vertically.
- B10: kanban-card character breaking: the DS word-break:break-all breaks
in the middle of words. Local override with break-word.
- B11: DPIA matrix bubbles not responding: the v1.13.0 click handler
matched only against the first column of the Trusler table. DPIA fixtures
have a full-text label in matrix_cells but a T-001 id in the threats table,
so nothing matched. Extended to (Pass 1) exact first-cell + (Pass 2)
substring match against any cell with 40-character prefix tolerance.
- B12, B13, B15: defensive layout for top-risks/suppressed-panel/
phase-detail/aiact-timeline: explicit display:block; clear:both;
width:100% against grid leak from small-multiples/kanban-board/mat-ladder.
- B14: Migrate "skal vel være tabell" ("should probably be a table"):
phases-summary table above the phase-detail sections (Fase, Varighet,
Milepæler-count, Suksesskriterier-count, Status). Same table mirrored in
renderPoc for consistency.
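The two-pass lookup from B11 could be approximated as below, operating
on table rows already extracted as arrays of cell text. findThreatRow is
a hypothetical name, and reading "40-character prefix tolerance" as
"compare on the first 40 characters of the label" is an assumption about
the actual handler.

```javascript
// Sketch only: name and the prefix interpretation are assumptions.
function findThreatRow(label, rows) {
  const needle = label.trim().toLowerCase();
  if (needle === '') return -1;
  // Pass 1: exact match against the first cell (the id column).
  const exact = rows.findIndex((cells) => (cells[0] ?? '').trim().toLowerCase() === needle);
  if (exact !== -1) return exact;
  // Pass 2: any cell containing the first 40 characters of the label.
  const prefix = needle.slice(0, 40);
  return rows.findIndex((cells) =>
    cells.some((cell) => cell.trim().toLowerCase().includes(prefix)));
}
```

The function returns the row index, or -1 when neither pass matches; a
click handler would then scroll to and highlight that row.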
Verification:
- 23/23 smoke tests PASS (B7-B15 + 5 v1.13.0 regressions)
- 271/271 playground E2E PASS
- 219 plugin validation PASS
- 42 KB-update PASS
Version: v1.13.0 -> v1.13.1 (plugin.json, README badge, README
version-history, CHANGELOG, ROADMAP, TODO, plugin CLAUDE.md
playground-header, root README plugin-list, root CLAUDE.md plugin-list).
Touches only local CSS in the <style> block, ACTIONS handler registration,
the click-handler extension, and two renderer functions. No modification
of playground/vendor/. The vendored DS rule .kanban-card__name { word-break:
break-all } stays; it is overridden locally.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Fix package mirroring llm-security v7.6.1 (commit f9b555a). Same class of
visual bugs, identified via parallel DS analysis of the playground renderers.
- B1: renderFindingsBlock + renderRequirements switch the outer
<div class="findings"> (the DS grid 360px+1fr squeezed the inner structure
into the 360px column and left the 1fr detail-panel column empty) to
<section class="report-meta">. The BEM structure
findings__list > findings__group > findings__items is unchanged.
- B2: local .report-table CSS for 6+ reports (Trusler, Kostnadsoversikt,
TCO, Risiko-tabell, Key Metrics) that lacked styling; the DS does not
implement the class. Mirrors the local styling from llm-security v7.6.1.
- B3: ROS matrix bubbles switch from <span> to <button type="button"
data-threat-id="..." aria-label="..."> with a document-level click handler
that scrolls smoothly to the corresponding row in the Trusler table and
highlights the row for 1.6 s. Local CSS for cursor:pointer, hover
scale(1.15), :focus-visible outline.
- B4: renderRadarSvg bumped from 300x300 to 380x380, R from 100 to 125,
label offset from R+25 to R+28, dynamic text-anchor based on
horizontal position to keep bottom labels from overlapping each other
at 6+ axes (typical for a ROS report with 7 risk dimensions).
- B5: local .recommendation-card__body { overflow-wrap: anywhere;
word-break: break-word } to prevent long single-line text
(URLs, owner tags, dates) from pushing content out of the viewport in the
grid cell.
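The dynamic text-anchor rule in B4 can be illustrated with the label
geometry below. The 380 canvas, R = 125, and R+28 offset are the values
from B4; the radarLabels name, the first axis pointing straight up, and
the 1 px dead zone for centered labels are assumptions.

```javascript
// Sketch only: name, start angle, and the 1 px dead zone are assumptions.
function radarLabels(axisCount, { size = 380, radius = 125, offset = 28 } = {}) {
  const center = size / 2;
  return Array.from({ length: axisCount }, (_, i) => {
    // Axis 0 points up; the rest follow clockwise in equal steps.
    const angle = -Math.PI / 2 + (2 * Math.PI * i) / axisCount;
    const x = center + (radius + offset) * Math.cos(angle);
    const y = center + (radius + offset) * Math.sin(angle);
    const dx = x - center;
    // Right-side labels grow rightwards, left-side labels leftwards,
    // and labels on the vertical axis stay centered.
    const anchor = Math.abs(dx) < 1 ? 'middle' : dx > 0 ? 'start' : 'end';
    return { x, y, anchor };
  });
}
```

Anchoring labels away from the center means neighbouring bottom labels
extend in opposite directions instead of sharing the same horizontal
span.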
tests/test-playground-v3.sh: DS class assertion updated from .findings
to .findings__list (the BEM list is still in use; the outer grid container
was deliberately removed in B1).
Verification:
- 22/22 smoke tests PASS (B1-B5 grep asserts)
- 271/271 playground E2E PASS (201 static-structure + 70 parser fixtures)
- 219 plugin validation PASS
- 42 KB-update test PASS
Version: v1.12.0 -> v1.13.0 (plugin.json, README badge, README
version-history, CHANGELOG, ROADMAP, TODO, plugin CLAUDE.md
playground-header, root README plugin-list, root CLAUDE.md plugin-list).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
CLAUDE.md MANDATORY (OBLIGATORISK) rule: any feature change pushed to
Forgejo MUST update all three doc levels in the SAME commit or immediately
after. The v7.6.1 fix commit (f9b555a) bumped only the version badge; this
follow-up commit closes the doc gap.
- plugins/llm-security/README.md: new [7.6.1] history-table row
- plugins/llm-security/CLAUDE.md: header bumped v7.6.0 → v7.6.1 +
new v7.6.1 blurb (all 6 fix details)
- README.md (root): llm-security version row bumped v7.6.0 → v7.6.1 +
v7.6.1 history bullet above the v7.6.0 bullet
No code changes.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>