Found by simulated v4.1 smoke test (doc/code drift in the v4.1 ship):
docs/observability.md claims "Cloud metadata endpoints (169.254.169.254)
are permanently blocked" but the validator allowed them when
VOYAGE_OTEL_ALLOW_PRIVATE=1. Cloud metadata services expose IAM
credentials and instance secrets — operator-trust extended to
RFC-1918 home-lab access does NOT extend here, because the
blast-radius (cloud-account compromise) is qualitatively different.
New HARD_BLOCKED_HOSTS set checked BEFORE the link-local opt-in path:
- 169.254.169.254 (AWS / GCP / Azure metadata)
- 100.100.100.200 (AliCloud metadata)
- metadata.google.internal
- metadata.azure.com
New error code ENDPOINT_HARD_BLOCKED. Existing test for
ENDPOINT_LINK_LOCAL_REJECTED on 169.254.169.254 updated to assert
the new code; 3 new tests verify the hard-block holds even with
VOYAGE_OTEL_ALLOW_PRIVATE=1, plus AliCloud + GCP-hostname coverage.
Tests: 487 → 490 pass + 2 skipped.
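
A minimal sketch of the check ordering described above, assuming a validator that returns a `{ valid, code }` object. The function name `validateEndpoint`, its return shape, the `ENDPOINT_INVALID_URL` code, and the simplified link-local test are hypothetical and not taken from lib/exporters/endpoint-validator.mjs; only the host list and the two error codes named in this message come from the commit.

```javascript
// Hosts that stay blocked regardless of any opt-in (list from this commit).
const HARD_BLOCKED_HOSTS = new Set([
  '169.254.169.254',
  '100.100.100.200',
  'metadata.google.internal',
  'metadata.azure.com',
]);

// Hypothetical validator shape; the real function may differ.
function validateEndpoint(endpoint, env = {}) {
  let host;
  try {
    host = new URL(endpoint).hostname.toLowerCase();
  } catch {
    // Hypothetical error code, used here only to keep the sketch total.
    return { valid: false, code: 'ENDPOINT_INVALID_URL' };
  }
  // 1. Hard block runs first, so no environment variable can bypass it.
  if (HARD_BLOCKED_HOSTS.has(host)) {
    return { valid: false, code: 'ENDPOINT_HARD_BLOCKED' };
  }
  // 2. Link-local range (simplified prefix test), gated by operator opt-in.
  const allowPrivate = env.VOYAGE_OTEL_ALLOW_PRIVATE === '1';
  if (host.startsWith('169.254.') && !allowPrivate) {
    return { valid: false, code: 'ENDPOINT_LINK_LOCAL_REJECTED' };
  }
  return { valid: true };
}
```

Because the hard-block lookup precedes the opt-in branch, `validateEndpoint('http://169.254.169.254/', { VOYAGE_OTEL_ALLOW_PRIVATE: '1' })` is rejected in this sketch, while other link-local addresses still follow the opt-in.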
Step 21 of v4.1 — extend-in-place per Plan-critic Blocker 2 split:
commands-only assertions land here; CLAUDE.md / README.md pinning is
deferred to Step 22 (post-write).
Changes:
1. CLAUDE.md command coverage loop now spans all SIX pipeline commands
(added /trekcontinue — was 5 of 6 pre-v4.1 per HIGH risk-assessor).
2. New: every pipeline command-file (trekbrief/research/plan/execute/
review/continue.md) must document the --profile flag.
3. New: forbidden-alias check — no command-file may use the legacy
names model_per_phase / phase_to_model / profile_phase_models.
Canonical name is "phase_models" (locked in brief).
4. New: at least one command-file must mention "phase_models" by name
so the regression detects total removal of the canonical-name
reference.
Tests: 482 pass + 2 skipped (Docker not installed).
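
The three new assertions can be sketched roughly as follows. This is an illustration only: `checkCommandFiles` and its input (a map of file name to file text) are hypothetical, and the real tests may read the command files from disk and assert differently.

```javascript
const FORBIDDEN_ALIASES = ['model_per_phase', 'phase_to_model', 'profile_phase_models'];

// files: hypothetical map of command-file name -> file text.
function checkCommandFiles(files) {
  const problems = [];
  let canonicalSeen = false;
  for (const [name, text] of Object.entries(files)) {
    // Change 2: every command file documents --profile.
    if (!text.includes('--profile')) problems.push(`${name}: missing --profile`);
    // Change 3: no legacy alias may appear.
    for (const alias of FORBIDDEN_ALIASES) {
      if (text.includes(alias)) problems.push(`${name}: forbidden alias ${alias}`);
    }
    // "_" is a word character, so \b does not match inside
    // "profile_phase_models"; only the standalone canonical name counts.
    if (/\bphase_models\b/.test(text)) canonicalSeen = true;
  }
  // Change 4: at least one file mentions the canonical name.
  if (!canonicalSeen) problems.push('no command file mentions phase_models');
  return problems;
}
```

The word-boundary match matters here because one of the forbidden aliases contains the canonical name as a substring.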
Step 20 of v4.1 — implements drift detection in plan-validator.mjs per
brief Assumptions block 7: "Mismatch (e.g. a corrupt manual edit)
emits a MANIFEST_PROFILE_DRIFT warning from plan-validator in --strict mode."
Logic (after validateAllManifests in validatePlanContent):
1. Strict-mode only — soft mode never emits drift warnings.
2. Plan frontmatter must declare 'profile: <name>' to establish baseline.
3. For each step manifest, if profile_used is set AND differs from plan
profile, emit warning (NOT error) with code MANIFEST_PROFILE_DRIFT
and location 'step N: profile_used = X, plan profile = Y'.
Forward-compat preserved: drift is a warning, plan remains valid:true.
Operators see the drift in --strict mode without parsing breaking.
New files:
tests/validators/plan-validator-profile-drift.test.mjs — 4 tests
tests/fixtures/plan-profile-drift.md — drift fixture
Tests verify:
1. drift detected in strict mode → MANIFEST_PROFILE_DRIFT in warnings
2. drift NOT detected in soft mode → strict gate honored
3. matching profile → no drift warning
4. no plan-level profile → drift detection silent (no baseline)
Tests: 479 pass + 2 skipped (Docker not installed).
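
A minimal sketch of the three-step logic above, assuming each manifest is a plain object with a `step` number and an optional `profile_used`. The function name `detectProfileDrift` and the argument shapes are hypothetical; the actual code lives inside validatePlanContent and may be structured differently.

```javascript
// Returns warnings only; the caller keeps the plan valid regardless.
function detectProfileDrift(planProfile, manifests, { strict = false } = {}) {
  const warnings = [];
  // 1. Soft mode never emits drift warnings.
  if (!strict) return warnings;
  // 2. Without a plan-level profile there is no baseline to compare against.
  if (typeof planProfile !== 'string' || planProfile === '') return warnings;
  // 3. Warn for each manifest whose profile_used is set and differs.
  for (const m of manifests) {
    if (m.profile_used !== undefined && m.profile_used !== planProfile) {
      warnings.push({
        code: 'MANIFEST_PROFILE_DRIFT',
        location: `step ${m.step}: profile_used = ${m.profile_used}, plan profile = ${planProfile}`,
      });
    }
  }
  return warnings;
}
```

Returning a list of warnings rather than throwing mirrors the forward-compat intent: drift is reported, but validity is decided elsewhere.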
Step 19 of v4.1 — extend-in-place per brief Preferences. Three new test
blocks asserting forward-compat:
1. Legacy fixtures (plan-run-A.md, plan-run-B.md) — without profile_used
in frontmatter — still parse cleanly after manifest-yaml.mjs added
OPTIONAL_STRING_KEYS.
2. New fixtures (profile-plan-run-{economy,premium}-*.md) — with
profile_used in frontmatter — parse cleanly with correct profile
value extracted.
3. Real v4.1 plan (.claude/projects/2026-05-08-voyage-v4.1-modellprofiler/plan.md)
validates strict, emits no PLAN_VERSION_MISMATCH warning.
Tests: 475 pass + 2 skipped (Docker not installed).
Step 17 of v4.1 — escalate-handler invoked. Live LLM-budget ($60-120 for
4 plan-runs of /trekplan --profile {economy,premium} on
examples/01-add-verbose-flag/brief.md) was not authorized for the
v4.1-execute-4b session.
Per Step 17 escalate-fallback (and NEXT-SESSION-PROMPT.local.md
fallback-strategy): document economy-Plan as parked, use balanced as
low-threshold profile, defer empirical calibration to v4.2.
Files:
tests/synthetic/profile-plan-run-economy-1.md — 30 steps, parked-synthetic
tests/synthetic/profile-plan-run-economy-2.md — 30 steps, parked-synthetic
tests/synthetic/profile-plan-run-premium-1.md — 40 steps, parked-synthetic
tests/synthetic/profile-plan-run-premium-2.md — 40 steps, parked-synthetic
tests/synthetic/profile-jaccard-calibration.md — threshold 0.55 pinned per
research/02 conservative starting value
Replacement procedure documented in calibration.md "How to replace"
section. Trigger conditions for empirical re-run:
1. Cross-tier smoke-test (Step 18) flips red on a real run
2. v4.2 LLM-budget approval
3. New profile tier added
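
For reference, a Jaccard index compared against the pinned 0.55 threshold could be computed as sketched below. This message does not say which sets are compared or which side of the threshold counts as a pass, so the step-identifier inputs and the `meetsThreshold` helper are assumptions made purely for illustration.

```javascript
const JACCARD_THRESHOLD = 0.55; // pinned starting value from this commit

// Jaccard index: |A ∩ B| / |A ∪ B|, defined here as 1 for two empty sets.
function jaccard(a, b) {
  const setA = new Set(a);
  const setB = new Set(b);
  if (setA.size === 0 && setB.size === 0) return 1;
  let intersection = 0;
  for (const item of setA) {
    if (setB.has(item)) intersection++;
  }
  return intersection / (setA.size + setB.size - intersection);
}

// Hypothetical helper: treats "at or above threshold" as similar enough.
function meetsThreshold(a, b, threshold = JACCARD_THRESHOLD) {
  return jaccard(a, b) >= threshold;
}
```

With inputs `['a', 'b', 'c']` and `['b', 'c', 'd']` the index is 2/4 = 0.5, which falls below 0.55 under this sketch's convention.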
Step 16 of v4.1 — first test in tests/integration/, establishes the
skip-on-missing-tool pattern voyage will reuse for environment-dependent
integration tests. Two tests:
1. compose config parses and contains expected services
2. compose config pins required image versions
Both skip cleanly when 'docker info' fails (no Docker installed). On a
machine with Docker, both tests run docker compose config and assert the
4 services + 3 version pins are present.
Tests: 468 pass + 2 skipped (Docker not installed in dev env).
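
The skip-on-missing-tool pattern might look like the sketch below. The helper name `skipUnlessTool` is hypothetical and the real tests may probe Docker differently; the only behavior taken from this message is that a failing 'docker info' leads to a clean skip.

```javascript
// Hypothetical helper: resolves to a skip-reason string when the probe
// command is missing or exits non-zero, and to false when it succeeds.
async function skipUnlessTool(command, args) {
  // Dynamic import() is available from both ES modules and CommonJS.
  const { spawnSync } = await import('node:child_process');
  const probe = spawnSync(command, args, { stdio: 'ignore' });
  // spawnSync does not throw for a missing binary; it sets `error`
  // (ENOENT) and leaves `status` as null.
  if (probe.error || probe.status !== 0) {
    return `${command} ${args.join(' ')} failed; skipping`;
  }
  return false;
}
```

The resolved value is shaped to fit the `skip` option of the node:test runner, which accepts either a boolean or a reason string, so a test could be registered with `{ skip: await skipUnlessTool('docker', ['info']) }`.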
Step 14 follow-up — VOYAGE_OTEL_ENDPOINT (not VOYAGE_OTLP_ENDPOINT) per
hooks/scripts/otel-export.mjs and lib/exporters/endpoint-validator.mjs.
Adds VOYAGE_OTEL_ALLOW_PRIVATE=1 for localhost since 127.0.0.1 is
loopback and rejected by default.
Step 13 of v4.1 — adds Stop hook entry pointing to
hooks/scripts/otel-export.mjs (added in Step 12 / commit c5fb745).
Mounts the orchestrator on Claude Code's Stop event so OTel/Prometheus
export runs at session-end when VOYAGE_EXPORT_MODE is set.
HIGH-risk mitigation: tests/hooks/hooks-json-stop-wired.test.mjs
asserts that the Stop key exists, references otel-export.mjs, uses
${CLAUDE_PLUGIN_ROOT} substitution, and has type:command.
Tests: 464 → 468 (4 new). All green.
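
The four assertions can be illustrated with the sketch below. The hooks.json object shown is an assumed shape and the exact command string is a guess; `stopHookProblems` is a hypothetical helper, not the code in the test file.

```javascript
// Assumed hooks.json content; the real command string may differ.
const hooksJson = {
  hooks: {
    Stop: [
      {
        hooks: [
          {
            type: 'command',
            // Single quotes keep ${CLAUDE_PLUGIN_ROOT} literal in JavaScript.
            command: 'node ${CLAUDE_PLUGIN_ROOT}/hooks/scripts/otel-export.mjs',
          },
        ],
      },
    ],
  },
};

// Returns a list of problems; an empty list means the wiring looks right.
function stopHookProblems(config) {
  const entries = config?.hooks?.Stop;
  if (!Array.isArray(entries) || entries.length === 0) return ['Stop key missing'];
  const commands = entries.flatMap((entry) => entry.hooks ?? []);
  const hook = commands.find((h) => String(h.command).includes('otel-export.mjs'));
  if (!hook) return ['otel-export.mjs not referenced'];
  const problems = [];
  if (!hook.command.includes('${CLAUDE_PLUGIN_ROOT}')) {
    problems.push('command does not use ${CLAUDE_PLUGIN_ROOT}');
  }
  if (hook.type !== 'command') problems.push('type is not command');
  return problems;
}
```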
Step 4 of v4.1-execute (Wave 2, Session 2).
Three built-in model profiles match the brief's profile-assignment matrix:
- economy: all 6 phase_models = sonnet, parallel 2-3, external_research=false,
iter-cap=1. ~$1-3 per pipeline session.
- balanced: brief/research/execute/continue=sonnet, plan=opus, review=opus,
parallel 4-6, external_research=false (operator override deferred
to v4.2 per NEXT-SESSION-PROMPT scope limits), iter-cap=2.
~$5-15 per pipeline session.
- premium: all 6 phase_models = opus, parallel 6-8, external_research=true,
iter-cap=3. ~$20-60 per pipeline session (default, same as v4.0).
Uses list-of-dicts for phase_models (parser-compatible with
lib/util/frontmatter.mjs:79-105). Verified: all 3 files parse without errors
and return an array with 6 entries (phase+model per entry).
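
As an illustration of the list-of-dicts shape, the three profiles could be represented in memory as sketched below. The `iter_cap` key name, the `buildPhaseModels` helper, and the overall object layout are assumptions; the actual profile files are frontmatter documents and may name their fields differently.

```javascript
const PHASES = ['brief', 'research', 'plan', 'execute', 'review', 'continue'];

// Builds one { phase, model } entry per phase, applying per-phase overrides.
function buildPhaseModels(defaultModel, overrides = {}) {
  return PHASES.map((phase) => ({ phase, model: overrides[phase] ?? defaultModel }));
}

// Hypothetical in-memory view of the three built-in profiles.
const profiles = {
  economy: {
    phase_models: buildPhaseModels('sonnet'),
    external_research: false,
    iter_cap: 1,
  },
  balanced: {
    phase_models: buildPhaseModels('sonnet', { plan: 'opus', review: 'opus' }),
    external_research: false,
    iter_cap: 2,
  },
  premium: {
    phase_models: buildPhaseModels('opus'),
    external_research: true,
    iter_cap: 3,
  },
};
```

Each `phase_models` value is an array with 6 entries, one phase+model pair per pipeline phase, matching the verification described above.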
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Step 3 of v4.1-execute (Wave 1, Session 1).
Add a new exported const OPTIONAL_STRING_KEYS = ['profile_used'] parallel
to the existing OPTIONAL_KEYS. Extend parseManifest with a new dispatch loop
after OPTIONAL_BOOLEAN_KEYS. Returns MANIFEST_OPTIONAL_TYPE if
profile_used is present but is not a string.
Difference from OPTIONAL_BOOLEAN_KEYS: absence == not-present (NOT defaulted
to false, unlike boolean). Downstream consumers can therefore distinguish
unset from empty-string.
Tests (5 new, baseline 372 → 377):
- OPTIONAL_STRING_KEYS export drift-pin
- profile_used: economy parses successfully (SC #10 forward-compat)
- profile_used: numeric rejected
- absence: field NOT in parsed (string-key semantics)
- profile_used + skip_commit_check + memory_write co-existence
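
The difference between the two dispatch loops can be sketched as follows. This is not the parseManifest source: `parseOptionalKeys` is a hypothetical stand-in, and the membership of OPTIONAL_BOOLEAN_KEYS as well as the error code used for boolean type mismatches are assumptions.

```javascript
// Assumed membership; only OPTIONAL_STRING_KEYS is stated in this commit.
const OPTIONAL_BOOLEAN_KEYS = ['skip_commit_check', 'memory_write'];
const OPTIONAL_STRING_KEYS = ['profile_used'];

// raw: already-parsed frontmatter object (hypothetical input shape).
function parseOptionalKeys(raw) {
  const parsed = {};
  for (const key of OPTIONAL_BOOLEAN_KEYS) {
    if (key in raw && typeof raw[key] !== 'boolean') {
      return { error: 'MANIFEST_OPTIONAL_TYPE', key };
    }
    // Boolean keys default to false when absent.
    parsed[key] = key in raw ? raw[key] : false;
  }
  for (const key of OPTIONAL_STRING_KEYS) {
    // String keys are left out entirely when absent, so consumers can
    // tell "unset" apart from an explicit empty string.
    if (!(key in raw)) continue;
    if (typeof raw[key] !== 'string') {
      return { error: 'MANIFEST_OPTIONAL_TYPE', key };
    }
    parsed[key] = raw[key];
  }
  return { parsed };
}
```

With this shape, `'profile_used' in result.parsed` is false for a legacy manifest and true for one that sets `profile_used: ''`.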
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Step 2 of v4.1-execute (Wave 1, Session 1).
Add --profile to the valued array for all 6 voyage commands (trekbrief,
trekresearch, trekplan, trekexecute, trekreview, trekcontinue). The pattern is
identical to the existing --project/--brief valued handling. No change
to parseArgs logic; this only extends the schema.
Tests (11 new, baseline 361 → 372):
- 6 happy-path tests (one per command)
- ARG_MISSING_VALUE for --profile without a value
- --profile + --quick combo
- --profile + --gates edge case (--gates is parsed inline, not in FLAG_SCHEMA)
- --profile + --project combo
- trekcontinue --profile (validates that the previously empty valued[] is now extended)
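
A schema-only change of this kind can be sketched as below. The FLAG_SCHEMA layout, the `boolean` list, and this `parseArgs` body are hypothetical reconstructions for illustration; the names `valued`, `FLAG_SCHEMA`, `parseArgs`, and `ARG_MISSING_VALUE` come from this message, but their real shapes are not shown in it.

```javascript
// Hypothetical schema: two commands shown, each listing its flags.
const FLAG_SCHEMA = {
  trekplan: { boolean: ['--quick'], valued: ['--project', '--brief', '--profile'] },
  trekcontinue: { boolean: [], valued: ['--profile'] },
};

// Generic parser: adding a flag to `valued` requires no logic change here.
function parseArgs(command, argv) {
  const schema = FLAG_SCHEMA[command];
  const flags = {};
  const positional = [];
  for (let i = 0; i < argv.length; i++) {
    const token = argv[i];
    if (schema.valued.includes(token)) {
      const value = argv[i + 1];
      // A valued flag at the end, or followed by another flag, has no value.
      if (value === undefined || value.startsWith('--')) {
        return { error: 'ARG_MISSING_VALUE', flag: token };
      }
      flags[token] = value;
      i++; // consume the value token
    } else if (schema.boolean.includes(token)) {
      flags[token] = true;
    } else {
      positional.push(token);
    }
  }
  return { flags, positional };
}
```

In this sketch, `parseArgs('trekplan', ['--profile', 'economy', '--quick'])` yields both flags, while `parseArgs('trekcontinue', ['--profile'])` returns the ARG_MISSING_VALUE error.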
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The slash-command parser matches !`...` even inside ```bash markdown fences,
which caused the Phase 8 NEXT-SESSION-PROMPT template to execute at skill load
with the literal {project_dir}/{next_session_brief_path}/{next_session_label}/
{status} strings as argv. That produced ENOENT on .session-state.local.json.tmp
and blocked the entire /trekexecute skill load.
Remove the !`...` wrapper and mark the block explicitly as a runtime template.
The pattern now matches the convention used elsewhere in the same file
(lines 202-208), where ```bash is used for orchestrator instructions without
auto-execution.
Wave 0 of v4.1-execute: prerequisite for unlocking /trekexecute
skill invocation against .claude/projects/2026-05-08-voyage-v4.1-modellprofiler/
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The ultra-cc-architect plugin was removed from the marketplace; voyage's
architecture-discovery contract still pointed at it by name. Replaced
verbatim references with plugin-agnostic phrasing ("upstream architect
producer") in code comments and user-facing warning messages.
CHANGELOG entries and config-audit v5.0.0 snapshots intentionally
preserved as historical records.