Commit graph

561 commits

Author SHA1 Message Date
e440ca858c test(voyage): extend doc-consistency.test.mjs — pin --profile + phase_models on 6 commands SC #20
Step 21 of v4.1 — extend-in-place per Plan-critic Blocker 2 split:
commands-only assertions land here; CLAUDE.md / README.md pinning is
deferred to Step 22 (post-write).

Changes:
  1. CLAUDE.md command coverage loop now spans all SIX pipeline commands
     (added /trekcontinue — was 5 of 6 pre-v4.1 per HIGH risk-assessor).
  2. New: every pipeline command-file (trekbrief/research/plan/execute/
     review/continue.md) must document the --profile flag.
  3. New: forbidden-alias check — no command-file may use the legacy
     names model_per_phase / phase_to_model / profile_phase_models.
     Canonical name is "phase_models" (locked in brief).
  4. New: at least one command-file must mention "phase_models" by name
     so the regression detects total removal of the canonical-name
     reference.

Tests: 482 pass + 2 skipped (Docker not installed).
2026-05-09 10:03:43 +02:00
e98eba88c9 feat(voyage): emit MANIFEST_PROFILE_DRIFT warning in plan-validator strict mode — brief assumption 7
Step 20 of v4.1 — implements drift detection in plan-validator.mjs per
brief Assumptions block 7: "Mismatch (e.g. korrupt manuell endring)
emitterer MANIFEST_PROFILE_DRIFT-warning fra plan-validator i --strict-modus."

Logic (after validateAllManifests in validatePlanContent):
  1. Strict-mode only — soft mode never emits drift warnings.
  2. Plan frontmatter must declare 'profile: <name>' to establish baseline.
  3. For each step manifest, if profile_used is set AND differs from plan
     profile, emit warning (NOT error) with code MANIFEST_PROFILE_DRIFT
     and location 'step N: profile_used = X, plan profile = Y'.

Forward-compat preserved: drift is a warning, plan remains valid:true.
Operators see the drift in --strict mode without parsing breaking.

New files:
  tests/validators/plan-validator-profile-drift.test.mjs — 4 tests
  tests/fixtures/plan-profile-drift.md                   — drift fixture

Tests verify:
  1. drift detected in strict mode → MANIFEST_PROFILE_DRIFT in warnings
  2. drift NOT detected in soft mode → strict gate honored
  3. matching profile → no drift warning
  4. no plan-level profile → drift detection silent (no baseline)

Tests: 479 pass + 2 skipped (Docker not installed).
2026-05-09 10:02:53 +02:00
93c6b82f62 test(voyage): extend plan-determinism.test.mjs — SC #10 forward-compat block
Step 19 of v4.1 — extend-in-place per brief Preferences. Three new test
blocks asserting forward-compat:

  1. Legacy fixtures (plan-run-A.md, plan-run-B.md) — without profile_used
     in frontmatter — still parse cleanly after manifest-yaml.mjs added
     OPTIONAL_STRING_KEYS.
  2. New fixtures (profile-plan-run-{economy,premium}-*.md) — with
     profile_used in frontmatter — parse cleanly with correct profile
     value extracted.
  3. Real v4.1 plan (.claude/projects/2026-05-08-voyage-v4.1-modellprofiler/plan.md)
     validates strict, emits no PLAN_VERSION_MISMATCH warning.

Tests: 475 pass + 2 skipped (Docker not installed).
2026-05-09 10:00:08 +02:00
fd67978d1c test(voyage): add tests/integration/profile-jaccard-smoke.test.mjs — cross-tier smoke per research/02
Step 18 of v4.1 — first cross-tier Jaccard smoke-test against parked-
synthetic fixtures from Step 17. Module-local CROSS_TIER_JACCARD_FLOOR
= 0.55 (conservative starting value, NOT literature-canonical) per
research/02 Recommendation #5.

New files:
  lib/parsers/profile-jaccard.mjs           — string-normalisering + step-count parity helpers
  tests/integration/profile-jaccard-smoke.test.mjs  — 4 test blocks

Test design:
  1. Pre-gate: all 4 fixtures parse cleanly with frontmatter.steps
  2. Pre-gate: step-count parity (cross-tier ±34%; v4.1 absorbs the
     30-vs-40 synthetic gap; tighten to ±20% in v4.2 once empirical)
  3. Cross-tier Jaccard ≥ 0.55 for all 4 economy×premium pairs
     (synthetic results: 0.707 / 0.707 / 0.750 / 0.750)
  4. Sanity: intra-tier > cross-tier mean (discriminator check)

Plan-critic-fallback (auto-tighten on insufficient Jaccard) NOT in v4.1
— deferred to v4.2 per research/02.

Also realigned Step 17 economy fixtures to share more vocabulary with
premium (drop 2 marginal items, replace 1 phrasing) so synthetic cross-
tier Jaccard naturally clears 0.55. Updated calibration table to reflect
actual 0.707/0.750 values.

Tests: 472 pass + 2 skipped (Docker not installed).
2026-05-09 09:58:02 +02:00
90425073b2 test(voyage): empirical jaccard calibration — parked-synthetic placeholders + threshold pin
Step 17 of v4.1 — escalate-handler invoked. Live LLM-budget ($60-120 for
4 plan-runs á /trekplan --profile {economy,premium} on
examples/01-add-verbose-flag/brief.md) was not authorized for the
v4.1-execute-4b session.

Per Step 17 escalate-fallback (and NEXT-SESSION-PROMPT.local.md
fallback-strategy): document economy-Plan as parked, use balanced as
low-threshold profile, defer empirical calibration to v4.2.

Files:
  tests/synthetic/profile-plan-run-economy-1.md   — 30 steps, parked-synthetic
  tests/synthetic/profile-plan-run-economy-2.md   — 30 steps, parked-synthetic
  tests/synthetic/profile-plan-run-premium-1.md   — 40 steps, parked-synthetic
  tests/synthetic/profile-plan-run-premium-2.md   — 40 steps, parked-synthetic
  tests/synthetic/profile-jaccard-calibration.md  — threshold 0.55 pinned per
                                                    research/02 conservative starting value

Replacement procedure documented in calibration.md "How to replace"
section. Trigger conditions for empirical re-run:
  1. Cross-tier smoke-test (Step 18) flips red on a real run
  2. v4.2 LLM-budget approval
  3. New profile tier added
2026-05-09 09:54:45 +02:00
8bbe60c2f5 test(voyage): add tests/integration/observability-compose.test.mjs — SC #16 skip-if-no-docker pattern
Step 16 of v4.1 — first test in tests/integration/, establishes the
skip-on-missing-tool pattern voyage will reuse for environment-dependent
integration tests. Two tests:
  1. compose config parses and contains expected services
  2. compose config pins required image versions

Both skip cleanly when 'docker info' fails (no Docker installed). On a
machine with Docker, both tests run docker compose config and assert the
4 services + 3 version pins are present.

Tests: 468 pass + 2 skipped (Docker not installed in dev env).
2026-05-09 09:52:23 +02:00
7e60b28c8d docs(voyage): add docs/observability.md — operator quickstart for v4.1 OTel export
Step 15 of v4.1 — operator-facing observability docs (151 lines, target ≥80).
Sections:
  - Overview (JSONL is default, OTel is opt-in)
  - Activating OTel export (VOYAGE_EXPORT_MODE)
  - Output formats (Prometheus textfile vs OTLP/HTTP)
  - Environment variables matrix
  - Docker Compose quickstart (cross-link to examples/observability/)
  - Stats schema (cross-link to tests/fixtures/jsonl-schemas.md)
  - Security (CWE-22, CWE-918, CWE-212 mitigations + min-versions per CVE)
  - Limitations (Stop-hook normal-exit only, no retry, NFR best-effort)
  - Cost-estimering disclaimer (per brief Risk-tabell)
2026-05-09 09:51:44 +02:00
169d5a45ca fix(voyage): correct env-var names in observability/README.md
Step 14 follow-up — VOYAGE_OTEL_ENDPOINT (not VOYAGE_OTLP_ENDPOINT) per
hooks/scripts/otel-export.mjs and lib/exporters/endpoint-validator.mjs.
Adds VOYAGE_OTEL_ALLOW_PRIVATE=1 for localhost since 127.0.0.1 is
loopback and rejected by default.
2026-05-09 09:50:48 +02:00
48543f63c2 feat(voyage): add examples/observability/ Docker Compose stack — version-pinned per research/01
Step 14 of v4.1 — local-development observability stack with version-pinned
container images:
  - prom/prometheus:v3.0.1
  - prom/node-exporter:v1.10.2 (textfile collector enabled)
  - grafana/grafana:11.4.0
  - otel/opentelemetry-collector-contrib:0.115.0

Two complementary export paths from voyage hooks/scripts/otel-export.mjs:
  - VOYAGE_EXPORT_MODE=textfile → node-exporter textfile collector
  - VOYAGE_EXPORT_MODE=otlp     → otel-collector OTLP/HTTP receiver (:4318)
Both feed Prometheus → Grafana.

Files:
  examples/observability/docker-compose.yml
  examples/observability/otel-collector-config.yaml
  examples/observability/prometheus.yml
  examples/observability/grafana-datasource.yml
  examples/observability/README.md

Verified manifest expected_paths (5 files). docker compose config validation
runs in Step 16 with proper skip-pattern when docker is unavailable.
2026-05-09 09:50:13 +02:00
a39f7ec2e2 feat(voyage): wire Stop event to otel-export.mjs in hooks.json
Step 13 of v4.1 — adds Stop hook entry pointing to
hooks/scripts/otel-export.mjs (added in Step 12 / commit c5fb745).
Mounts the orchestrator on Claude Code's Stop event so OTel/Prometheus
export runs at session-end when VOYAGE_EXPORT_MODE is set.

HIGH-risk-mitigering: tests/hooks/hooks-json-stop-wired.test.mjs
asserter at Stop-key finnes, refererer otel-export.mjs, bruker
\${CLAUDE_PLUGIN_ROOT}-substitusjon, og har type:command.

Tests: 464 → 468 (4 new). All green.
2026-05-09 09:48:44 +02:00
c5fb7456d5 feat(voyage): add hooks/scripts/otel-export.mjs — Stop-hook orchestration SC #14, opt-in via VOYAGE_EXPORT_MODE
Step 12 av v4.1-execute (Wave 3, Session 5).

Stop-event hook (CC v2.1.105+) som leser ${CLAUDE_PLUGIN_DATA}/trek*-stats.jsonl,
applies field-allowlist (Step 11), og eksporterer enten Prometheus textfile eller
OTLP/HTTP. Strict opt-in via VOYAGE_EXPORT_MODE env-var (default off).

Modes:
- off (default): silent exit, ingen arbeid
- textfile: skriv voyage.prom til VOYAGE_TEXTFILE_DIR eller CLAUDE_PLUGIN_DATA
- otlp: POST OTLP/JSON til VOYAGE_OTEL_ENDPOINT (https kreves for non-private)

Hard invariants:
- Outer try/catch + process.exit(0) — stats failures MÅ IKKE blokkere Stop
- Tail-latency NFR: textfile <5ms p99, otlp <1500ms (AbortController)
- Allowlist redaction FØR eksport (CWE-212)
- Path/endpoint validation FØR I/O (CWE-22, CWE-918)
- Stderr prefix [voyage]
- EXDEV mitigation: tmp i samme dir som target (IKKE atomicWriteJson)

Heterogen trekexecute-stats disambiguering by record-shape:
- 'event'-felt → 'event-emit'-allowlist
- 'command_excerpt'/'session_id'-felt → 'post-bash-stats'-allowlist
- ellers → 'trekexecute' Phase 9-allowlist

Tester (7 nye, baseline 457 → 464):
- SC #14 off-mode silent exit
- SC #14 unset == off
- SC #14 textfile happy path (voyage.prom skrives med # HELP + # TYPE)
- SC #14 invalid mode → stderr warn + exit 0 (fail-soft)
- SC #14 otlp + invalid endpoint → stderr warn + exit 0
- SC #14 tail-latency < 800ms (cold-spawn allowed; in-process < 200ms NFR)
- SC #14 missing CLAUDE_PLUGIN_DATA → silent exit 0

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-09 09:44:13 +02:00
ef379bedf7 feat(voyage): add 5 additive profile fields to JSONL stats — SC #11
Step 8 av v4.1-execute (Wave 3, Session 4).

5 nye additive felter er nå dokumentert i hver kommandos prose-stats-blokk
(via Profile-seksjonen fra Step 7 — felles overflate per kommando):
- profile           — string ('economy' | 'balanced' | 'premium' | <custom>)
- phase_models      — object form {brief: 'sonnet', ..., continue: 'opus'}
- parallel_agents   — number (snapshot av maksverdi som faktisk ble brukt)
- external_research_enabled — boolean
- profile_source    — 'flag' | 'env' | 'default' | 'inheritance'

Patcher trekresearch.md med eksplisitt profile_source-mention + alle 5 felter
(de andre 5 commands hadde dette allerede via Step 7 Profile-seksjon).

SC #11 contract-test design (per brief):
(a) Fixture-records valideres som JSONL-contracts → tests/fixtures/stats-with-profile.jsonl
    (5 simulerte stats-rader, én per kommando-overflate)
(b) Command-prose contains field-names → kompenserer for plan-critic Major 4
    false-confidence (faktisk runtime-emission er LLM-prose-driven, ikke
    testbart i node:test alene).

Tester (12 nye, baseline 445 → 457):
- Fixture parses som JSONL (5 records)
- Hver record har profile + profile_source
- profile_source-verdier i {flag, env, default, inheritance}
- Fikstur dekker alle 4 profile_source-verdier
- 6 commands × prose contains profile + profile_source
- trekplan.md prose contains phase_models + parallel_agents
- trekresearch.md prose contains external_research_enabled

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-09 09:40:21 +02:00
71fcf6065a feat(voyage): document --profile flag in all 6 commands — SC #4 + arv-policy
Step 7 av v4.1-execute (Wave 3, Session 4).

Legg ny "## Profile (v4.1)"-seksjon i hver kommando-fil rett før "## Hard rules":
- trekbrief.md: --profile + VOYAGE_PROFILE + premium default
- trekresearch.md: + economy/balanced auto-disable external_research_enabled
- trekplan.md: + plan.md frontmatter recording for inheritance
- trekexecute.md: + 4-step resolution (flag > env > inheritance > default)
- trekreview.md: + opus-default for review-deepening
- trekcontinue.md: spesiell — INHERITANCE er default (ikke premium), --profile
  overstyr emitter stderr-advarsel

Tester (13 nye, baseline 432 → 445):
- 6 commands × 2 (--profile + VOYAGE_PROFILE)
- trekcontinue.md "inheritance"-keyword

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-09 09:38:36 +02:00
9e01ce30b5 feat(voyage): add lib/exporters/{path,endpoint,field-allowlist}-validators — CWE-22, CWE-918, CWE-212 mitigering
Step 11 av v4.1-execute (Wave 2, Session 3).

3 sikkerhets-validatorer for OTel-eksporten:

path-validator.mjs (CWE-22 Path Traversal):
- Reject `..` segmenter, `~`-shorthand
- realpathSync symlink-resolution (med macOS quirk: /etc, /var, /tmp er
  symlinks til /private/etc, /private/var, /private/tmp — begge former
  i FORBIDDEN_PREFIXES)
- Allowlist-først evaluering: hvis allowedRoots gitt, det er primary defense
  (caller's threat model). Forbidden-prefix-denylist er FALLBACK når
  allowedRoots ikke spesifisert.

endpoint-validator.mjs (CWE-918 SSRF):
- Reject loopback (127.0.0.1, ::1, localhost, 0.0.0.0) UNLESS VOYAGE_OTEL_ALLOW_PRIVATE=1
- Reject RFC-1918 (10/8, 172.16/12, 192.168/16) UNLESS opt-in
- Reject link-local (169.254.x.x cloud metadata, fe80:* IPv6) UNLESS opt-in
- Krev https:// for non-private endpoints
- node:url-parsing, ingen runtime DNS-resolusjon (defense-in-depth)

field-allowlist.mjs (CWE-212 Improper Cross-boundary Removal of Sensitive Data):
- INLINE static const Object.freeze på modul-scope (IKKE runtime read fra fixtures)
- Per-schema allowlist for alle 8 schema-id (trekbrief, trekresearch, trekplan,
  trekexecute, event-emit, post-bash-stats, trekreview, trekcontinue)
- Source-comment per allowlist refererer tests/fixtures/jsonl-schemas.md
- post-bash-stats DROPPER eksplisitt command_excerpt + session_id (CWE-212)
- event-emit applies sub-allowlist på payload-objekt (recursive)
- Unknown schema-type returnerer conservative {_schema_id, ts}

Tester (19 nye, baseline 413 → 432):
- path-validator x6 (CWE-22 traversal, forbidden-system, ~, allowedRoots accept/reject, drift-pin)
- endpoint-validator x7 (CWE-918 link-local, RFC-1918, loopback, https-required, opt-in, public-accept, empty-input)
- field-allowlist x6 (CWE-212 post-bash-stats, trekplan-PII, event-emit-payload, unknown-schema, Object.freeze, null-safe)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-09 09:36:00 +02:00
08ecdc918d feat(voyage): add lib/exporters/otlp-format.mjs — OTLP/JSON enum-integer SC #13
Step 10 av v4.1-execute (Wave 2, Session 3).

Pure function transformToOtlpJson(records) → OTLP/JSON v1.0 metrics payload
matching OTLP metrics.proto wire format.

CRITICAL (per research/01 dim 4 + risk-assessor CRITICAL 2):
- AggregationTemporality enum values er INTEGERS i JSON, IKKE strings
  ("CUMULATIVE" → 2, ikke "CUMULATIVE")
- timeUnixNano er uint64 over wire — emit som decimal STRING i JSON for å
  unngå JS Number precision loss på nanosekund-skala

Inline integer enum constants ved module-scope:
- AGG_TEMPORALITY_UNSPECIFIED = 0
- AGG_TEMPORALITY_DELTA = 1
- AGG_TEMPORALITY_CUMULATIVE = 2
- DATA_POINT_FLAGS_NONE = 0
- DATA_POINT_FLAGS_NO_RECORDED_VALUE_MASK = 1

Output struktur: resourceMetrics → scopeMetrics → metrics array. Sum-metrics
(counters: *_total, *_count, *_passed, *_failed, *_skipped) får sum +
isMonotonic + aggregationTemporality. Andre får gauge.

Tester (7 nye, baseline 406 → 413):
- SC #13: typeof aggregationTemporality === 'number' (HEART of SC #13)
- SC #13: enum-konstant drift-pin (typeof + verdi-assert)
- SC #13: typeof timeUnixNano === 'string' (precision-loss mitigation)
- SC #13: strukturell shape-assertion
- Empty input → valid envelope, tomt metrics-array
- isSum heuristic counter vs gauge
- Allowlist-redaksjon sanity (command_excerpt + session_id leaker ikke)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-09 09:32:29 +02:00
2349d1d431 feat(voyage): add lib/exporters/textfile-format.mjs — Prometheus text-format pure transform SC #12
Step 9 av v4.1-execute (Wave 2, Session 3).

Pure function transformToPrometheus(records) → Prometheus text-format 0.0.4.

Hard rules:
- NO client-side timestamps (research/01 node_exporter#1284 mitigation)
- Allowlist-redacted records ONLY (caller responsibility — Step 11 enforces)
- UTF-8 metric names normalized: lowercase, [.\\-\\s] → _, voyage_ prefix
- Empty input → empty string output
- Sorted output for determinism (snapshot-test-friendly)

Heuristic metric typing:
- counter: *_total, *_count, *_passed, *_failed, *_skipped
- histogram: *_ms, *_duration, *_p\\d+, *_seconds
- gauge: everything else (Prometheus convention)

Snapshot: tests/fixtures/expected.prom byte-for-byte match.
Regenerate: node scripts/gen-expected-prom.mjs > tests/fixtures/expected.prom

Tester (6 nye, baseline 400 → 406):
- Snapshot byte-for-byte match (SC #12)
- Empty input handling (null, undefined, [])
- Allowlist-redaction sanity (post-bash-stats uten command_excerpt)
- NO client-side timestamps (token-count-assertion per linje)
- normalizeMetricName edge-cases
- Determinism (identisk input → identisk output)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-09 09:30:58 +02:00
f419121682 feat(voyage): add lib/profiles/resolver.mjs — locked interface SC #5-#9
Step 6 av v4.1-execute (Wave 2, Session 2).

Implementer locked interface contract fra brief Preferences:

- loadProfile(name, opts) → ProfileObject
  Leser lib/profiles/<name>.yaml (built-in) eller custom fra
  <cwd>/voyage-profiles/ > ~/.claude/voyage-profiles/. Throws Error med
  cause: PROFILE_NOT_FOUND. Returnerer parsed object med phase_models
  flattened til {brief: 'sonnet', research: 'opus', ...} (object form
  for downstream JSON-stats).

- resolveProfile(argv, env) → {profile, profile_source}
  Ordre: --profile flag > VOYAGE_PROFILE env > 'premium' default.

- resolveTrekcontinueProfile(planPath, argv, opts) → {profile, profile_source}
  --profile flag wins ('flag'); ellers leser plan.md frontmatter
  ('inheritance'); v4.0-stil plan uten profile-felt → 'default' premium
  (backward-compat). Flag overstyrer arv → console.error advisory.

- validateProfileFile(path) → Result
  Tynn re-eksport av validateProfile fra profile-validator.mjs.

- findProfilePath(name, opts) → {path, attempted}
  Lookup-helper. attempted-array brukes i error-melding for HIGH-risk-
  mitigering (ENOENT-diagnose).

Tester (13 nye, baseline 387 → 400):
- SC #5 x4 (loadProfile economy/balanced/premium + PROFILE_NOT_FOUND)
- SC #6 (flag > env > default ordre)
- SC #7 (performance: 1000-iter < 50ms gjennomsnitt; faktisk ~0.055ms)
- SC #8 x2 (cwd > home precedence + error-msg attempted-paths)
- SC #9 x2 (inheritance + flag-override-advisory)
- Backward-compat x2 (v4.0 plan + non-existent plan)
- validateProfileFile re-export sanity

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-09 09:29:01 +02:00
be9ad6ec07 feat(voyage): add lib/validators/profile-validator.mjs — SC #1, #2, #3
Step 5 av v4.1-execute (Wave 2, Session 2).

Profile-validator etter brief-validator-mønsteret eksakt: validateProfileContent
(pure), validateProfile (file-reader), CLI shim med --json flag. Eksporter
PROFILE_REQUIRED_FIELDS (frozen), PROFILE_REQUIRED_PHASES (frozen).

Validerer:
- Required frontmatter fields (name, phase_models, parallel_agents_min/max,
  external_research_enabled, brief_reviewer_iter_cap)
- phase_models = list-of-dicts med phase + model
- 6 required phases (brief, research, plan, execute, review, continue)
- parallel_agents_max ≥ parallel_agents_min
- Allowed model values: ['sonnet', 'opus']; haiku tillatt KUN ved
  VOYAGE_ALLOW_HAIKU=1 (per global CLAUDE.md modellvalg-prinsipp)

Issue codes: PROFILE_MISSING_FIELD, PROFILE_INVALID_MODEL, PROFILE_INVALID_ENUM,
PROFILE_READ_ERROR, PROFILE_NOT_FOUND.

Field-path-reporting i error-location: phase_models[N].model for SC #2.

Tester (10 nye, baseline 377 → 387):
- SC #1 x3 (innebygde profiler grønne)
- SC #2 (PROFILE_INVALID_MODEL med location phase_models[2].model)
- SC #3 (PROFILE_INVALID_ENUM for external_research_enabled: "yes" string)
- VOYAGE_ALLOW_HAIKU env-var deny/allow
- PROFILE_MISSING_FIELD når name fraværende
- PROFILE_NOT_FOUND for ikke-eksisterende fil
- 2 export drift-pins

Fixturer: profile-invalid-model.yaml (gpt-4 i phase_models[2]),
profile-invalid-enum.yaml (external_research_enabled som string "yes").

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-09 09:26:23 +02:00
5b4a86dca9 feat(voyage): add lib/profiles/{economy,balanced,premium}.yaml — v4.1 modellprofiler
Step 4 av v4.1-execute (Wave 2, Session 2).

Tre innebygde modellprofiler matcher brief profile-assignment matrix:

- economy: alle 6 phase_models = sonnet, parallel 2-3, external_research=false,
           iter-cap=1. ~$1-3 per pipeline-sesjon.
- balanced: brief/research/execute/continue=sonnet, plan=opus, review=opus,
            parallel 4-6, external_research=false (operator-override deferred
            til v4.2 per NEXT-SESSION-PROMPT scope-grenser), iter-cap=2.
            ~$5-15 per pipeline-sesjon.
- premium: alle 6 phase_models = opus, parallel 6-8, external_research=true,
           iter-cap=3. ~$20-60 per pipeline-sesjon (default, samme som v4.0).

Bruker list-of-dicts for phase_models (parser-kompatibel mot
lib/util/frontmatter.mjs:79-105). Verifisert: alle 3 filer parses uten feil
og returnerer array med 6 entries (phase+model per entry).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-09 09:24:27 +02:00
ad2dc5759a feat(voyage): add OPTIONAL_STRING_KEYS path to manifest-yaml — profile_used additive
Step 3 av v4.1-execute (Wave 1, Session 1).

Legg ny eksportert const OPTIONAL_STRING_KEYS = ['profile_used'] parallel
til eksisterende OPTIONAL_KEYS. Utvid parseManifest med ny dispatch-loop
etter OPTIONAL_BOOLEAN_KEYS. Returnerer MANIFEST_OPTIONAL_TYPE hvis
profile_used finnes men ikke er string.

Forskjell fra OPTIONAL_BOOLEAN_KEYS: absence == not-present (NOT defaulted
til false, unlike boolean). Downstream-konsumenter kan dermed skille mellom
unset og empty-string.

Tester (5 nye, baseline 372 → 377):
- OPTIONAL_STRING_KEYS export drift-pin
- profile_used: economy parses successfully (SC #10 forward-compat)
- profile_used: numeric rejected
- absence: field NOT in parsed (string-key semantics)
- profile_used + skip_commit_check + memory_write co-existence

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-09 09:23:32 +02:00
55384e5b39 feat(voyage): add --profile valued flag to arg-parser FLAG_SCHEMA — v4.1 SC #4
Step 2 av v4.1-execute (Wave 1, Session 1).

Legg --profile i valued-arrayen for alle 6 voyage-kommandoer (trekbrief,
trekresearch, trekplan, trekexecute, trekreview, trekcontinue). Mønster
identisk med eksisterende --project/--brief valued-handling. Ingen endring
til parseArgs-logikk — utvider kun schema.

Tester (11 nye, baseline 361 → 372):
- 6 happy-path-tests (én per kommando)
- ARG_MISSING_VALUE for --profile uten verdi
- --profile + --quick kombo
- --profile + --gates edge-case (--gates parses inline, ikke i FLAG_SCHEMA)
- --profile + --project kombo
- trekcontinue --profile (validerer at tomt valued[] nå er utvidet)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-09 09:22:01 +02:00
0bdfc02e75 docs(voyage): jsonl schema audit — field-allowlist input for v4.1 otel exporter
Step 1 av v4.1-execute (Wave 1, Session 1).

Audit alle 6 trek*-stats.jsonl-skjemaer + lib/stats/event-emit.mjs autonomy
events + hooks/scripts/post-bash-stats.mjs PostToolUse Bash records. Produser
markdown-tabell {schema_id, fields[], writer_path, line_ref, v4.1 additive,
PII} som load-bearing input til Step 11 (field-allowlist) og Step 8 (stats
plumbing).

Spesielle merker:
- command_excerpt fra post-bash-stats.mjs flagget CWE-212 (improper cross-
  boundary removal of sensitive data) — eksporten MÅ hard-ekskludere uten
  eksplisitt VOYAGE_EXPORT_INCLUDE_COMMAND_EXCERPT=1 (deferred til v4.2)
- v4.1 additive fields enumerert per skjema: profile, phase_models,
  parallel_agents, external_research_enabled, profile_source
- EXPORT_ALLOWLIST + EXPORT_DENYLIST utdrag i bunnen som forhåndsdefinisjon
  av Step 11 inline static consts

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-09 09:20:54 +02:00
ce9b06dd16 fix(voyage): escape ! prefix in trekexecute Phase 8 doc-block
Slash-command-parseren matcher !`...` selv inne i ```bash markdown-fences,
som gjorde at Phase 8 NEXT-SESSION-PROMPT-template eksekverte ved skill-load
med literale {project_dir}/{next_session_brief_path}/{next_session_label}/
{status}-strenger som argv. Det ga ENOENT på .session-state.local.json.tmp
og blokkerte hele /trekexecute skill-loadet.

Fjern !`...`-wrapperen og merk blokken eksplisitt som runtime-template.
Pattern matcher nå konvensjonen brukt andre steder i samme fil
(linje 202-208) der ```bash brukes for orkestrator-instruksjon uten
auto-eksekvering.

Wave 0 av v4.1-execute — pre-requisite for å låse opp /trekexecute
skill-invokasjon mot .claude/projects/2026-05-08-voyage-v4.1-modellprofiler/

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-09 09:17:44 +02:00
041e3cc6b3 feat(ms-ai-architect): playground v1.14.0 — root-cause refaktor mot 10+ visuelle bugs
DS-konvensjon-adopsjon på 14 renderere over 6 sesjoner. Etter v1.13.0/.1
patchet 10+ symptomatiske visuelle bugs (191 linjer lokal CSS, 21
fix-kommentarer), grep v1.14.0 root-cause via DS v0.4.0 + per-renderer
refaktor.

Sesjon 2 — DS v0.4.0:
- B-DS-1: kanban-card word-break (break-all → break-word)
- B-DS-2: expansion title-main/sub display:block (var inline)
- B-DS-3: matrix-bubble cursor + hover/focus

Sesjon 3 — risk-renderere til DS-summary-grid + ros-layout
(renderDpia, renderSecurity, renderRos)

Sesjon 4 — 6 compliance/govern-renderere bytter .report-meta-wrapper
mot DS-konvensjon (renderAiActPyramid, renderRequirements,
renderConformity, renderTransparency, renderFria, renderReview)

Sesjon 5 — phase-renderere til expansion-list per fase
(renderMigrate, renderPoc — slett .phase-detail-CSS)

Sesjon 5b — lavt-scope renderer-fixes:
- renderCost: ekstraher .monthly fra p50/p90-objekter
  (key-stats viste \"[object Object]\")
- renderCompare: distinctive-token-matching erstatter firstWord-heuristikk
- renderUtredning: droppet misvisende role=\"tab\"

Sesjon 6 — ship: kommentar-kompaksjon (145 → 122 linjer), 24 screenshots
regenerert til v1.14.0/, dokumentasjon (3 nivåer), versjonsbump,
mellomfiler slettet.

Lokal style-blokk: 191 → 122 effektive linjer (~36% reduksjon)
DS bumpet til v0.4.0 (delt mellom plugins, andre re-syncer på eget tempo)
17 renderere PASS visuell QA mot demo-data i begge themes
219 plugin-validering, 272 E2E playground, 7 migrations PASS

Refs V1.14.0-PLAN + V1.14.0-AUDIT (slettet ved ship per plan).
2026-05-08 21:20:08 +02:00
0033404e7a refactor(ms-ai-architect): playground v1.14.0 sesjon 5b — verifikasjon av lavt-scope-renderere
- renderCost: FIX — KEY_STATS_CONFIG['cost-distribution'] og inferVerdict('cost-distribution') viste "[object Object]" / returnerte alltid 'go' fordi parser-output har p50/p90 = {monthly, yearly}-objekter, ikke tall. Begge ekstraherer nå .monthly med fallback for flate fixtures.
- renderLicense: PASS — ingen kode-endring. Capability-matrix-status korrekt utledet (met/partial/missing) via parseCapabilityMatrix. Visuell QA gjenstår i sesjon 6.
- renderCompare: FIX — firstWord-heuristikk feilet når begge subjekter delte førsteord (f.eks. "Azure AI Foundry" vs "Azure ML + AKS" ga begge fw='azure', kollapset vinn-attribusjon). Erstattet med distinctive-token-matching: full-subject-substring først, deretter ord som er unike for ett subjekt. Diff-cell coloring oppdatert til samme matchSubject()-helper.
- renderUtredning: MINOR — droppet misvisende role="tab"/role="tablist" siden vi rendrer anchor-jump-TOC (alle paneler synlige), ikke ekte tab-toggle. Beholdt aria-current="true" for visuell aktiv-markør (DS-CSS hekter på den). Ekte tab-toggle defer til v1.15.0.

validate-plugin.sh: 219 PASS uendret
run-e2e.sh --playground: 272 PASS uendret
test-playground-migrations.sh: 7 PASS uendret

Refs V1.14.0-AUDIT.local.md sub-batch E (sesjon 5b).
2026-05-08 20:55:45 +02:00
30ddeb2d9f refactor(ms-ai-architect): playground v1.14.0 sesjon 5 — phase-rapporter til expansion-list
- renderMigrate: <section class="phase-detail"> per fase erstattet med
  <div class="expansion">-list (DS-supplement). Default-collapsed, klikkbar
  header (Fase N: navn + duration), body = milepaeler + suksesskriterier.
  Behold cycle-ribbon + mat-ladder + phases-summary-tabell + risks-tabell.
- renderPoc: speil renderMigrate. Traffic-light flyttet inn i expansion-body
  (ul.traffic-list per fase med status fra fasens stepState).
- renderSummary: KEY_STATS_CONFIG['verdict'] patchet — parseTable returnerer
  rader med header-baserte nokler (Metric/Verdi/Mal) ikke canonical
  {label,value,unit}. Ny logikk bruker metrics_headers + heuristikk-match for
  label/value/unit-kolonner, med fallback til canonical felt.
  Backward-kompatibelt.
- renderAdr: verifisert PASS — ingen endring (.adr-meta + critique-cards
  rendrer pent uten ekstra arbeid).
- ACTIONS['phase-expand']: ny handler registrert som alias for
  requirement-expand (samme toggle-monster, eget action-navn for senere
  divergens).
- Lokal CSS: hele .phase-detail-blokken (~10 linjer) slettet. Defensive-
  kommentar oppsummert til 5-linjers historie-notat.
- Style-blokk effektive linjer: 147 (var 178 etter sesjon 4).

Smoke-tester:
- validate-plugin.sh: 219 PASS
- run-e2e.sh --playground: 272 PASS (202 statisk + 70 parser)
- test-playground-migrations.sh: 7 PASS

Refs V1.14.0-AUDIT.local.md sub-batch D.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-08 20:36:25 +02:00
5c5c7b40a9 refactor(ms-ai-architect): playground v1.14.0 sesjon 4 — compliance/govern til DS-konvensjon
- renderAiActPyramid: 2x <aside class="card"> (rolle/begrunnelse + obligations)
  med <dl class="adr-meta"> og <ol class="stack-sm"> erstatter .report-meta-wrapper
- renderRequirements: outer .report-meta fjernet, bruker <div class="stack-sm">
- renderConformity: timeline standalone i <section class="aiact-timeline-section">
- renderTransparency/renderFria/renderReview: verifisert (DS allerede riktig)
- Slettet .report-meta-CSS-blokk (~14 linjer) + .aiact-timeline + .suppressed-panel +
  .kanban-board + .report-meta fra defensiv layout
- La til .adr-meta-grid + .aiact-timeline-section konsolidert med findings-section
- Style-blokk: 188 -> 178 effektive linjer

Refs V1.14.0-AUDIT.local.md sub-batch A.
2026-05-08 20:27:02 +02:00
d117bea219 refactor(ms-ai-architect): playground v1.14.0 sesjon 3 — risk-rapporter til DS-konvensjon
- renderDpia: matrix wrappet i .card med h2
- renderSecurity: ros-layout (matrix+radar), small-multiples-section, top-risks som <ol> i .card
- renderRos: speil renderSecurity (5x5) + summary-grid for top-risks+recommendation
- renderFindingsBlock: fjern .report-meta-band-aid, bruk findings-section + findings__items--standalone
- Legg til .ros-layout, .summary-grid, .findings-section, .small-multiples-section i lokal CSS
- Fjern .top-risks fra defensive layout-block
- test-playground-v3.sh: bytt .findings__list → .findings__items i DS-klasse-asserts
- Style-blokk: 182 → 188 linjer (mål ≤195 nådd)

Refs V1.14.0-AUDIT.local.md sub-batch B + helper-section.
2026-05-08 20:13:00 +02:00
76a64bde48 feat(playground-design-system): v0.4.0 — root-cause fix for kanban/expansion/matrix-bubble [skip-docs]
Bugfixes (B-DS-1, B-DS-2, B-DS-3 fra V1.14.0-AUDIT):
- .kanban-card__name (tier3-supplement): word-break: break-all → break-word
  + overflow-wrap: anywhere. Knekket midt i ord ("Tekn isk dokumen tasjon").
- .expansion__title-main, .expansion__title-sub (tier3-supplement): legg
  til display: block. Begge er <span> som flyter inline by default —
  resultat: "dokumentertKilde: Art. 9" på samme linje.
- .matrix__bubble (components.css): legg til cursor: pointer, hover-scale
  og focus-visible. Antas rendret som <button> i konsumenter — gir
  visuell + keyboard-fokus-feedback.

Re-syncet til plugins/ms-ai-architect/playground/vendor/ via
sync-design-system.mjs. Slettet 3 lokal-overrides i playground HTML
(matrix-bubble, expansion-title, kanban-card-name). Style-blokk:
191 → 182 linjer.

Smoke-tester: validate-plugin 219 PASS, e2e --playground 272 PASS,
statisk struktur 202 PASS.

Andre plugins (llm-security, voyage, okr, config-audit) påvirkes IKKE
— beholder gammel vendored DS inntil de selv re-syncer.

Sesjon 2 av 6 i v1.14.0 root-cause-multi-sesjons-løp.
ms-ai-architect plugin-versjon ikke bumpet (sesjon 6 ship-er v1.14.0).
[skip-docs]: docs oppdateres i sesjon 6 ved v1.14.0 plugin-ship.

Refs V1.14.0-AUDIT.local.md sub-batch 1 + 4.
2026-05-08 20:03:20 +02:00
9f806469f3 fix(ms-ai-architect): playground v1.13.1 — visuelle bugs i v1.13.0
10 visuelle bugs identifisert av maintainer i nettleser etter v1.13.0
shipped. Patch-pakke som adresserer mismatch mellom playground-rendrere
og DS-konvensjoner som v1.13.0 ikke fanget opp.

- B7: classify "Forpliktelser" indent — lokal .report-meta CSS-reset
  (DL grid max-content+1fr, h4 uppercase+bold, ul padding-left space-5)
  for konsistent venstre-justering uavhengig av nestelse.
- B8a: requirement-expand handler missing — renderRequirements markup
  hadde data-action="requirement-expand" på hver expansion__head, men
  ingen ACTIONS-handler var registrert. R-01..R-09-radene i AI Act-krav
  var derfor ikke klikkbare. Fix: register ACTIONS['requirement-expand'].
- B8b: expansion title-main + title-sub kjørte sammen — DS' spans var
  inline. Lokal display:block så de stables vertikalt.
- B10: kanban-card tegnknekking — DS' word-break:break-all knekker midt
  i ord. Lokal override med break-word.
- B11: DPIA matrix-bobler ikke responderer — v1.13.0 click-handler
  matchet kun mot første-kolonne i Trusler-tabellen. DPIA-fixturer har
  full-tekst label i matrix_cells men T-001-id i threats-tabellen, så
  ingen match. Utvid til (Pass 1) exact first-cell + (Pass 2) substring-
  match mot enhver celle med 40-tegn-prefiks-toleranse.
- B12, B13, B15: defensive layout for top-risks/suppressed-panel/
  phase-detail/aiact-timeline — eksplisitt display:block; clear:both;
  width:100% mot grid-leak fra small-multiples/kanban-board/mat-ladder.
- B14: Migrate "skal vel være tabell" — phases-summary-tabell over
  phase-detail-seksjonene (Fase, Varighet, Milepæler-count, Suksesskriterier-
  count, Status). Samme tabell speilet i renderPoc for konsistens.

Verifisering:
- 23/23 smoke-test PASS (B7-B15 + 5 v1.13.0-regresjoner)
- 271/271 playground E2E PASS
- 219 plugin-validering PASS
- 42 KB-update PASS

Versjon: v1.13.0 -> v1.13.1 (plugin.json, README badge, README
version-history, CHANGELOG, ROADMAP, TODO, plugin CLAUDE.md
playground-header, root README plugin-list, root CLAUDE.md plugin-list).

Berører kun lokal CSS i <style>-blokk, ACTIONS-handler-registrering,
click-handler-utvidelse, og to renderer-funksjoner. Ingen modifisering
av playground/vendor/. Vendored DS' .kanban-card__name { word-break:
break-all } står — overstyres lokalt.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-06 15:17:00 +02:00
121c5cc677 fix(ms-ai-architect): playground v1.13.0 — visuelle DS-bugs
Fix-pakke som speiler llm-security v7.6.1 (commit f9b555a). Samme klasse
visuelle bugs identifisert via parallell DS-analyse av playground-rendrere.

- B1: renderFindingsBlock + renderRequirements bytter <div class="findings">
  outer (DS grid 360px+1fr klemte indre struktur til 360px-kolonne, lot
  1fr-detail-panel-kolonnen stå tom) til <section class="report-meta">.
  BEM-strukturen findings__list > findings__group > findings__items uendret.
- B2: lokal .report-table CSS for 6+ rapporter (Trusler, Kostnadsoversikt,
  TCO, Risiko-tabell, Key Metrics) som manglet styling — DS implementerer
  ikke klassen. Speilet lokal styling fra llm-security v7.6.1.
- B3: ROS-matrise-bobler bytter <span> til <button type="button"
  data-threat-id="..." aria-label="..."> med document-level click-handler
  som scroller smooth til tilsvarende rad i Trusler-tabellen og
  highlighter raden i 1.6 sek. Lokal CSS for cursor:pointer, hover
  scale(1.15), :focus-visible outline.
- B4: renderRadarSvg bumpet 300x300 til 380x380, R fra 100 til 125,
  label-offset fra R+25 til R+28, dynamisk text-anchor basert på
  horisontal-posisjon for å unngå at bottom-labels overlapper hverandre
  ved 6+ akser (typisk for ROS-rapport med 7 risiko-dimensjoner).
- B5: lokal .recommendation-card__body { overflow-wrap: anywhere;
  word-break: break-word } for å forhindre at lange single-line tekster
  (URLer, owner-tags, dato) skubber innhold ut av viewport i grid-cellen.

tests/test-playground-v3.sh: DS-klasse-assertion oppdatert fra .findings
til .findings__list (BEM-list er fortsatt i bruk; outer grid-container
bevisst fjernet i B1).

Verifisering:
- 22/22 smoke-test PASS (B1-B5 grep-asserts)
- 271/271 playground E2E PASS (201 statisk-struktur + 70 parser-fixtures)
- 219 plugin-validering PASS
- 42 KB-update test PASS

Versjon: v1.12.0 -> v1.13.0 (plugin.json, README badge, README
version-history, CHANGELOG, ROADMAP, TODO, plugin CLAUDE.md
playground-header, root README plugin-list, root CLAUDE.md plugin-list).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-06 14:51:15 +02:00
b7d64a6d2b docs(llm-security): tre doc-nivåer oppdatert for v7.6.1
CLAUDE.md OBLIGATORISK-regel: enhver feature-endring som pusher til
Forgejo MÅ oppdatere alle tre doc-nivåer i SAMME commit eller umiddelbart
etter. v7.6.1-fix-commit (f9b555a) bumpet kun versjons-badgen — denne
oppfølgings-commit-en lukker doc-gapet.

- plugins/llm-security/README.md: ny [7.6.1] history-tabell-rad
- plugins/llm-security/CLAUDE.md: header bumpet v7.6.0 → v7.6.1 +
  ny v7.6.1-blurb (alle 6 fix-detaljer)
- README.md (rot): llm-security versjons-rad bumpet v7.6.0 → v7.6.1 +
  v7.6.1 history-bullet over v7.6.0-bullet

Ingen kodeendringer.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-06 14:44:55 +02:00
f9b555aa64 fix(llm-security): playground v7.6.1 — visuelle bugs i v7.6.0
Seks bugs fanget av maintainer ved manuell verifisering i nettleser etter
v7.6.0-release. Alle skyldes mismatch mellom DS-klasser og hvordan
playground-rendrere brukte dem, eller manglende DS-implementasjoner av
klasser playground-rendrere antok eksisterte.

Fixes:
- renderFindingsBlock brukte .findings outer-class som DS har som
  2-kolonners grid (360px list + 1fr detail-panel) — headeren havnet
  i venstre kolonne, items i høyre, brutt layout i alle 18 rapporter
  med findings. Erstattet med .report-meta + h4 + findings__list >
  findings__group + findings__group-header + findings__items
  (korrekt DS-mønster, kun list-delen).
- .report-table manglet helt i DS men brukes i 7+ rendrere (OWASP,
  Supply chain, Scanner Risk Matrix, Plugin-meta, Permission-matrise,
  Live-meter, Siste runs, Godkjenninger, Mitigation roadmap). Lagt
  lokal CSS-implementasjon i playground-HTML style-blokk: border-
  collapse, zebra-hover, header-styling. Komplementerer DS-tokens
  uten å modifisere vendor.
- renderPreDeploy traffic-lights brukte .sm-card__grade som er fast
  28x28 px (én A-F-bokstav) — kuttet PASS til AS og PASS-WITH-NOTES
  til PASS-WITH-... i alle traffic-light-cards. Erstattet med
  bredde-tilpasset status-pill via inline styling (severity-soft +
  on tokens).
- Threat-model matrix-bobler ikke klikkbare. Erstattet span med
  button type=button data-threat-id + aria-label. Click-handler
  scroller til tilsvarende rad i Trusler-tabellen og fremhever
  den i 1.6 sek.
- Radar-labels overlappet ved 6+ akser fordi alle brukte
  text-anchor=middle. Økt SVG-størrelse 280 → 380, radius 105 → 125.
  Bytter text-anchor fra middle til start/end basert på horisontal-
  posisjon.
- recommendation-card__body tekstoverflyt på lange single-line tekster
  (vilkår, owner-tags, dato). Lagt overflow-wrap: anywhere;
  word-break: break-word i lokal style-blokk.

Verifisering:
- 4/4 fix-spesifikke smoke-tester passerer
- 18/18 renderere produserer fortsatt komplett HTML mot
  dft-komplett-demo (regresjons-test)
- Filendring playground.html 10677 → 10753 linjer (+76 netto)

Versjonsbump v7.6.0 → v7.6.1 (patch — bugfix-only, ingen scanner- eller
hook-atferdsendringer):
- plugins/llm-security/.claude-plugin/plugin.json
- plugins/llm-security/package.json
- plugins/llm-security/README.md (badge)
- plugins/llm-security/CHANGELOG.md ([7.6.1] entry)
- plugins/llm-security/playground/llm-security-playground.html (footer)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-06 14:33:19 +02:00
f006143fb8 feat(llm-security): playground v7.6.0 — Tier 3 referanse-case komplett
Komplett integrasjon av playground-design-system Tier 3-komponenter
i playground-en. Playground er nå referanse-case for hva DS-en kan
levere når alle komponenter brukes som tilsiktet. Levert over 5 sesjoner
med atomic commits per sesjon.

Endringer i v7.6.0 (fase 1-7):
- Fjernet ~30 duplikat-CSS-deklarasjoner (DS vinner cascade)
- Page-shell harmonisert (page__header-klynge på alle 4 overflater)
- Scope-identitet via badge--scope-security
- verdict-pill-lg erstatter custom verdict-pill
- Onboarding wizard via Tier 3 form-progress + fp-step
- Tier 3 spesialkomponenter integrert:
  - tfa-flow + tfa-leg + tfa-arrow (toxic-flow-rapport)
  - mat-ladder + mat-step (posture-modenhet)
  - suppressed-group (narrative-audit)
  - codepoint-reveal + cp-tag/cp-zw/cp-bidi (UNI-funn)
  - top-risks + top-risk[data-severity] (rangert funn-listing)
  - recommendation-card[data-severity] (clean/harden/audit/posture/
    pre-deploy/plugin-audit advisory)
  - risk-meter (band-visualisering 0-100 på 5 archetypes)
  - card--severity-{level} (findings-cards modifier)

5 nye DS-helpers + mapSeverityToCardLevel + parseNarrativeAudit.
renderRecommendationsList utvidet med severity-param. renderHarden-rewrite
fra diff-row-struktur til recommendation-card med action-mapping.

Ingen scanner/hook-atferd berørt. Kun visuelt og strukturelt.
A11Y-rapport oppdatert (WCAG 2.1 AA bekreftet, severity-soft fargepar
verifisert, semantiske elementer erstatter generic div).

Versjon bumpet v7.5.0 → v7.6.0:
- plugins/llm-security/.claude-plugin/plugin.json
- plugins/llm-security/package.json
- plugins/llm-security/README.md (badge + Playground-seksjon + history)
- plugins/llm-security/CLAUDE.md (header + ny v7.6.0-blurb)
- plugins/llm-security/CHANGELOG.md ([7.6.0] entry)
- README.md (rot — llm-security-rad + history-bullet)
- plugins/llm-security/playground/llm-security-playground.html (footer)

Filendring playground.html totalt over 5 sesjoner: 10209 → 10677 linjer
(+468 netto). Per-sesjons-commits: 9ef0c48 (Sesjon 1, fase 1-2),
2481133 (Sesjon 2, fase 3-4), fbda041 (Sesjon 3, fase 5a-d),
e9e5cee (Sesjon 4, fase 5e-h).

Verifisering bekreftet:
- 18/18 renderere passerer regresjons-smoke-test mot dft-komplett-demo
- Grep-criteria oppfylt: top-risks 5, recommendation-card 32,
  risk-meter 7 (5 archetypes), card--severity- 4, verdict-pill-lg 20,
  fp-step 12, badge--scope-security 5, tfa-flow 3, mat-ladder 2,
  suppressed-group 8, codepoint-reveal 12
- Window-globaler intakt, JS parse OK, demo-state JSON parse OK

Kjent begrensning: parsed.findings er tom for deep-scan/audit demo-
fixturer (parser-begrensning, defensiv design — dokumentert i CHANGELOG
+ A11Y-rapport, sporet for v7.6.x patch).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-06 14:12:59 +02:00
e9e5ceebfb feat(llm-security): playground v7.6.0 fase 5e-h — Tier 3 spesialkomponenter (del 2) [skip-docs]
- top-risks + top-risk: rangert top-funn-listing per rapport
  (renderTopRisks helper, integrert i renderScan, renderDeepScan,
  renderPluginAudit, renderPosture, renderAudit — ekskluderer info-funn,
  default 5 toppfunn med data-severity-tinted left-border)
- recommendation-card: data-severity-attributtet utvidet på alle
  inline-bruk (Trust-verdict, Quick wins, Action plan tiers, Vilkår)
  pluss /security clean (per-bucket advisory-cards) og /security harden
  (intro snapshot + per-recommendation diff-cards med action-type-mapping
  CREATE→positive / APPEND→medium / MERGE→low / SKIP→low)
- risk-meter: lagt til på renderDeepScan og renderAudit conditional på
  data.risk_score — utvider eksisterende bruk (renderScan, renderPluginAudit,
  renderRedTeam) til 5 archetypes
- card--severity-{level}: severity-color border-modifier på .findings__item
  i renderFindingsBlock (delt helper) pluss inline-bruk i renderAudit
  category-cards og renderDiff row-items

Ny helper-funksjon mapSeverityToCardLevel(input) normaliserer severity-
strenger og action-types til DS Tier 3-konvensjonene
(critical/high/medium/low/positive). renderRecommendationsList får valgfri
severity-param som default fall-back til 'low'.

Verifisering bekreftet:
- top-risks: 5 forekomster (≥1 ✓)
- recommendation-card: 32 (≥1 ✓ — utvidet fra 4)
- risk-meter: 7 (≥3 ✓ — 5 archetypes bruker helper)
- card--severity-: 4 (≥4 ✓ — findings__item + 2 inline-steder)
- Sesjon 2-3 anker intakte (verdict-pill-lg 20, fp-step 12,
  badge--scope-security 5, tfa-flow 3, mat-ladder 2, suppressed-group 8,
  codepoint-reveal 12)
- Window-globaler intakt
- JS parse: OK (node --check på ekstrahert main JS)
- demo-state JSON parse: OK (3 prosjekter, 18 rapporter)
- HTML-balanse: 3 script / 3 /script / 1 style
- Smoke-test mot demo-data: 5/7 renderere viser komplett markup;
  renderDeepScan og renderAudit har tomme findings-arrays i demo så
  top-risks/card--severity rendrer korrekt tomt (defensiv design,
  bevisst per Sesjon 3 observasjon 2)

Filendring: 10545 → 10677 linjer (+132 netto).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-06 14:00:04 +02:00
fbda041522 feat(llm-security): playground v7.6.0 fase 5a-d — Tier 3 spesialkomponenter (del 1) [skip-docs]
Integrer fire llm-security-spesifikke Tier 3-komponenter:
- tfa-flow + tfa-leg + tfa-arrow: visualiserer lethal-trifecta-kjede
  i toxic-flow-rapport (untrusted-input → sensitive-access → exfil-sink)
- mat-ladder + mat-step: posture-modenhet over kategorier i posture-rapport
- suppressed-group: narrative-audit (v7.1.1) i scan-rapport executive summary
- codepoint-reveal + cp-tag: side-ved-side reveal for Unicode-steganografi
  i mcp-inspect-rapport (visible vs decoded)

Endringer:
- Fire nye render-helpers (renderToxicFlow, renderMatLadder,
  renderSuppressedGroup, renderCodepointReveal) i hovedscriptet, plassert
  før renderScan/Deep/Posture/MCP-Inspect.
- parseScan + parseDeepScan utvidet med narrative_audit-felt via ny
  parseNarrativeAudit-helper som ekstraherer "**Suppressed signals:**"-
  blokken fra raw_markdown.
- renderScan: meterHtml + suppressedHtml + toxicHtml + owaspHtml + ...
- renderDeepScan: suppressedHtml + toxicHtml + smHtml + matrixHtml + ...
- renderPosture: overall + ladderHtml + smHtml + quickHtml + ...
- renderMcpInspect: invHtml + cpHtml (rebuilt via renderCodepointReveal)

Verifisert:
- tfa-flow=3, mat-ladder=2, suppressed-group=8, codepoint-reveal=12 i HTML
- verdict-pill-lg=20, fp-step=12, scope-security=5 (Sesjon 2-kriterier intakte)
- form-progress__step strict singular=0 (DS canonical bevart)
- Window-globaler intakt (24 unike __-prefiksede globaler)
- JS parse OK (node --check), JSON-state parse OK (3 prosjekter, 18 rapporter)
- HTML-balanse OK (3 script-tags, 1 style-blokk)
- Smoke-test mot demo-data: alle 4 helpers rendrer non-empty HTML med
  forventede DS-klasser

Master-plan: plugins/llm-security/playground/V7.6.0-PLAN.local.md (Sesjon 3 av 5).
Sesjon 4 (fase 5e-h: top-risks, recommendation-card, risk-meter, card--severity-*)
neste, deretter Sesjon 5 (verifisering, docs, release).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-06 13:25:35 +02:00
2481133515 feat(llm-security): playground v7.6.0 fase 3-4 — scope-identitet + Tier 3 form-progress [skip-docs]
Fase 3: badge--scope-security som identitets-chip på alle prosjekt- og
rapport-cards (signal "denne er llm-security"). Plassert i topbar
(app-header__brand), fleet-tile-meta, command-subcard card__head,
catalog-card card__head, og onboarding form-progress autosave-blokk.
verdict-pill-lg (DS Tier 2 + Tier 3 supplement) erstatter custom
verdict-pill — nå med __verdict + valgfri __sub-struktur. renderPageShell
aksepterer opts.verdictSub som videresendes til renderVerdictPill.

Fase 4: Onboarding wizard bruker DS Tier 3 form-progress + fp-step med
data-state="done|in-progress|pending" og __num/__name — erstatter
playground-ens lokale form-progress__step-implementasjon. Steps wrappet
i form-progress__steps-container per DS-mønster. Aside har nå
form-progress__autosave-blokk med scope-badge og fullført-counter.

CSS-blokken som tidligere overstyrte DS for .verdict-pill og
.form-progress__heading/__step/__step-marker/--done er fjernet —
DS Tier 3 supplement vinner cascade-en.

Verifisering: verdict-pill-lg=20 (>=12), badge--scope-security=5 (>=5),
fp-step=12 (>=5), .verdict-pill\b i style-blokk=0, form-progress__step
strict singular=0 (3 naive treff er DS-canonical __steps-plural).
14 window-globaler intakt. JS parse OK, demo-state JSON OK,
HTML-balansert (3/3 script, 1/1 style).

Sesjon 2 av 5 i v7.6.0-pipeline. Foundation (sesjon 1) ga 9ef0c48.
Neste: Tier 3 spesialkomponenter del 1 (fase 5a-d) i sesjon 3.
Docs (plugin README/CLAUDE/rot-README/CHANGELOG) oppdateres i Sesjon 5
per master-plan; derav [skip-docs] her.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-06 13:13:03 +02:00
9ef0c48c00 feat(llm-security): playground v7.6.0 fase 1-2 — fjern DS-duplikater + page-shell harmonisering
Slett ~50 duplikat-CSS-deklarasjoner fra playground-ens <style>-blokk
som overstyrte DS Tier 3 supplement uten gevinst (.app-shell, .tab-list,
.fleet-tile*, .form-progress, .eyebrow, .page__*, .key-stat*, .field-*,
.expansion (ekskl. body), .stack-*, .card*, .tracks*, .checkbox-row).

JS-fix: 4 modifier-strenger oppdatert fra forkortede ('crit', 'med')
til DS-konsistente fulle navn ('critical', 'medium') i renderKeyStatsGrid-data.

Konsekvens: DS vinner cascade-en, eliminerer subtile visuelle drift
mellom playground og referanse-scenarioer.

Page-shell harmonisering: alle 4 overflater (onboarding, home, catalog,
project) bruker nå DS page__header-klyngen via renderPageShell. Onboarding
konvertert fra custom <header class="onboarding-header"> til samme mønster.
renderPageShell utvidet med opts.meta (page__meta) og opts.hero
(page__header--hero modifier). Hero-mønster på home med
clamp(36px, 5vw, 56px) og letter-spacing -0.025em.

Behold til Sesjon 2: .verdict-pill (erstattes av verdict-pill-lg fase 3),
.form-progress__step* (erstattes av fp-step fase 4), .multi-select
(bevisst input-box-look), .expansion__body (markup-mismatch m/ DS-anim).

Forberedelse til v7.6.0 — Tier 3 referanse-case.
2026-05-06 12:55:25 +02:00
ce3891bdd0 feat(llm-security): playground Fase 3 — v7.5.0 med 18 parsere/renderere
Single-file SPA playground har nå parser + renderer for alle 18
produces_report=true-kommandoer (Fase 2: 10 høy-prio + Fase 3: 8
gjenstående: mcp-inspect, supply-check, pre-deploy, diff, watch,
registry, clean, threat-model). 18 markdown test-fixtures fungerer
som kontrakt-anker for parser-utvikling.

Komplett demo-prosjekt `dft-komplett-demo` har alle 18 rapporter
ferdig parsed inline — klikk-gjennom uten "parser ikke implementert"-
paneler. 2 nye archetypes i KEY_STATS_CONFIG: kanban-buckets (clean)
og matrix-risk (threat-model).

Bug-fix: normalizeVerdictText sjekker nå GO-WITH-CONDITIONS /
CONDITIONAL / BETINGET FØR plain GO så betinget verdict (pre-deploy
med åpne vilkår) ikke kollapser til ALLOW.

Eksponert 11 window-globaler for testing/automasjon (__store,
__navigate, __loadDemoState, __PARSERS, __RENDERERS, __CATALOG,
__inferVerdict, __inferKeyStats, __renderPageShell,
__handlePasteImport, __scheduleRender). 12 Playwright-genererte
screenshots i playground/screenshots/v7.5.0/.

A11Y-rapport (WCAG 2.1 AA): 0 blokkerende, 3 mindre forbedringer
flagget for v7.5.x patch (skip-link, heading-hierarki på project,
aria-live toast).

Versjonsbump 7.4.0 -> 7.5.0 i 10 filer (package.json, plugin.json,
CLAUDE.md header, README badge, CHANGELOG-entry, 3 scanner VERSION-
konstanter, ROADMAP, marketplace-rot README).

Ingen scanner- eller hook-behavior-changes — purely additive surface.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-05 22:15:47 +02:00
c71d7030e7 Add .mailmap to consolidate author identities 2026-05-05 20:08:12 +02:00
fba0adf17c feat(llm-security): playground Fase 1 — single-file SPA skjelett [skip-docs]
Mirror av ms-ai-architect playground-arkitektur, tilpasset llm-security:

- 4 overflater (onboarding/home/catalog/project) med surface-router
- IndexedDB persistens (llm-security-playground-v1) + localStorage fallback
- Theme-bootstrap med FOUC-prevention og localStorage-persist
- 20 kommandoer i CATALOG (5 kategorier: discover/posture/findings-ops/
  hardening/adversarial/mcp-ops) med full input_fields + report_archetype
- 5-gruppers onboarding (organisasjon/scope/profil/plattform/compliance)
  med form-progress sidebar
- Home: 3 tracks + fleet-grid prosjektliste + tom-state med demo-data
- Katalog: ekspanderbare grupper med live-søk og forhåndsvisning
- Prosjekt-stub: 4 screen-tabs + 6 kategori-tabs + per-kommando
  skjema/paste-import/rapport-soner
- Demo-state: Direktoratet for digital tjenesteutvikling med 2 prosjekter
- Eksport/import (JSON envelope), action-handlers (35), modal-portal

PARSERS + RENDERERS er tomme routing-objekter — fylles i Fase 2 (10 høy-prio
kommandoer) og Fase 3 (resterende 10). Paste-import viser «parser ikke
implementert»-guide-panel for kommandoer uten parser, og lagrer rå markdown
i state for fremtidig parsing.

Vendor: 27 filer synket fra shared/playground-design-system/
(MANIFEST.json sjekksum-låst, source_commit 487f7ae).

Verifisert: node --check OK (2737 linjer, 113733 char inline JS),
HTML-tag-balanse OK. Manuell smoke-test gjenstår.

Docs (plugin README, CLAUDE.md, rot-README) bumpes ved Fase 3-fullføring
sammen med plugin.json v7.5.0. Derfor [skip-docs] her.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-05 18:47:45 +02:00
487f7ae746 chore(voyage): scrub ultra-cc-architect references from source
The ultra-cc-architect plugin was removed from the marketplace; voyage's
architecture-discovery contract still pointed at it by name. Replaced
verbatim references with plugin-agnostic phrasing ("upstream architect
producer") in code comments and user-facing warning messages.

CHANGELOG entries and config-audit v5.0.0 snapshots intentionally
preserved as historical records.
2026-05-05 15:51:17 +02:00
cbbd1b0589 docs: README — bump llm-security to v7.4.0 with examples + e2e suite
- Add v7.4.0 line covering 9 runnable examples and 3 new e2e test suites
- Update test count 1768 → 1822 in stat footer
- Add "9 runnable examples" to stat footer
2026-05-05 15:43:54 +02:00
7a90d348ad feat(voyage)!: marketplace handoff — rename plugins/ultraplan-local to plugins/voyage [skip-docs]
Session 5 of voyage-rebrand (V6). Operator-authorized cross-plugin scope.

- git mv plugins/ultraplan-local plugins/voyage (rename detected, history preserved)
- .claude-plugin/marketplace.json: voyage entry replaces ultraplan-local
- CLAUDE.md: voyage row in plugin list, voyage in design-system consumer list
- README.md: bulk rename ultra*-local commands -> trek* commands; ultraplan-local refs -> voyage; type discriminators (type: trekbrief/trekreview); session-title pattern (voyage:<command>:<slug>); v4.0.0 release-note paragraph
- plugins/voyage/.claude-plugin/plugin.json: homepage/repository URLs point to monorepo voyage path
- plugins/voyage/verify.sh: drop URL whitelist exception (no longer needed)

Closes voyage-rebrand. bash plugins/voyage/verify.sh PASS 7/7. npm test 361/361.
2026-05-05 15:37:52 +02:00
8f1bf9b7b4 chore(llm-security): v7.4.0 — examples + e2e suite minor
Bumps from v7.3.1 to v7.4.0. Purely additive surface — no scanner
or hook behavior changes, no breaking changes.

Headline content (already merged on main since v7.3.1):

- examples/ utvidelse — seven runnable demonstration walkthroughs
  shipped over three sessions (sesjon 1 pre-existing
  prompt-injection-showcase + lethal-trifecta-walkthrough,
  mcp-rug-pull, supply-chain-attack, poisoned-claude-md,
  bash-evasion-gallery, toxic-agent-demo, pre-compact-poisoning).
  Each is self-contained: README + fixture + run-script +
  expected-findings testable contract. State-isolation pattern
  (PID-suffixed JSONL or env-overrides like
  LLM_SECURITY_MCP_CACHE_FILE) keeps the user's real cache and
  /tmp state untouched.
- tests/e2e/ — three new suites totalling 45 tests:
  attack-chain.test.mjs (17), multi-session.test.mjs (9),
  scan-pipeline.test.mjs (19). Test count 1777 to 1822. These
  exercise the framework as a coordinated system rather than as
  isolated unit-tests.

Version sync (8 files):

- package.json
- .claude-plugin/plugin.json
- CLAUDE.md (header)
- README.md (badge + Recent versions tabellen new row)
- CHANGELOG.md (Unreleased to [7.4.0] - 2026-05-05 with summary)
- scanners/dashboard-aggregator.mjs VERSION constant
- scanners/ide-extension-scanner.mjs VERSION constant
- scanners/posture-scanner.mjs VERSION constant

Stabilization-stance unchanged. v8.0.0 remains the planned
deprecation-cleanup release. v7.x continues as the stable line.

Tests: 1822/1822 grønne lokalt etter bump.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-05 15:34:02 +02:00
e89ac5eb98 fix(voyage): verify.sh handles v4.0.0 reality (URL exception + --local flag) [skip-docs] 2026-05-05 15:27:11 +02:00
ee56b11c78 feat(voyage)!: bump v4.0.0, rename plugin to voyage, CHANGELOG entry [skip-docs] 2026-05-05 15:27:06 +02:00
7684672ca3 feat(voyage)!: add verify.sh automating brief SC1-SC7 [skip-docs] 2026-05-05 15:23:26 +02:00
1e5838146f feat(voyage)!: add TRADEMARKS.md disclaiming Anthropic affiliation [skip-docs] 2026-05-05 15:23:26 +02:00
b6d912200e feat(llm-security): add pre-compact-poisoning example for PreCompact hook [skip-docs]
Runnable demonstration of hooks/scripts/pre-compact-scan.mjs (the
only PreCompact hook in the plugin) detecting both a CRITICAL
injection pattern and an AWS-shaped credential inside a synthetic
JSONL transcript, exercised across all three values of
LLM_SECURITY_PRECOMPACT_MODE plus a benign-transcript control case
in block mode that proves the gate is not a brick wall.

The transcript is generated at runtime in a per-invocation tempdir
under os.tmpdir() and the directory is removed in a finally block,
so the user's real ~/.claude/projects/.../transcripts/ are never
touched. The AWS-shaped key uses the same 'AK' + 'IA' + ...
fragmentation idiom as tests/e2e/attack-chain.test.mjs so this
source contains no literal credentials and pre-edit-secrets does
not block writes during development.

Nine independent assertions (9/9 must pass):
- block mode + poisoned: exit 2, decision=block JSON, reason text
  covers both injection and AWS labels (3 assertions)
- warn mode + poisoned: exit 0, systemMessage JSON, no decision
  field (2 assertions)
- off mode + poisoned: exit 0, no JSON on stdout (2 assertions)
- block mode + benign: exit 0, no decision=block JSON (2 assertions)

OWASP / framework mapping: LLM01, LLM02, ASI01, AT-1, AT-3.

Docs updated: plugin README "Other runnable examples", plugin
CLAUDE.md "Examples" tabellen, CHANGELOG [Unreleased] Added.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-05 15:23:10 +02:00