Step 2 av v4.1-execute (Wave 1, Session 1).
Legg --profile i valued-arrayen for alle 6 voyage-kommandoer (trekbrief,
trekresearch, trekplan, trekexecute, trekreview, trekcontinue). Mønster
identisk med eksisterende --project/--brief valued-handling. Ingen endring
til parseArgs-logikk — utvider kun schema.
Tester (11 nye, baseline 361 → 372):
- 6 happy-path-tests (én per kommando)
- ARG_MISSING_VALUE for --profile uten verdi
- --profile + --quick kombo
- --profile + --gates edge-case (--gates parses inline, ikke i FLAG_SCHEMA)
- --profile + --project kombo
- trekcontinue --profile (validerer at tomt valued[] nå er utvidet)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Slash-command-parseren matcher !`...` selv inne i ```bash markdown-fences,
som gjorde at Phase 8 NEXT-SESSION-PROMPT-template eksekverte ved skill-load
med literale {project_dir}/{next_session_brief_path}/{next_session_label}/
{status}-strenger som argv. Det ga ENOENT på .session-state.local.json.tmp
og blokkerte hele /trekexecute skill-loadet.
Fjern !`...`-wrapperen og merk blokken eksplisitt som runtime-template.
Pattern matcher nå konvensjonen brukt andre steder i samme fil
(linje 202-208) der ```bash brukes for orkestrator-instruksjon uten
auto-eksekvering.
Wave 0 av v4.1-execute — pre-requisite for å låse opp /trekexecute
skill-invokasjon mot .claude/projects/2026-05-08-voyage-v4.1-modellprofiler/
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- renderCost: FIX — KEY_STATS_CONFIG['cost-distribution'] og inferVerdict('cost-distribution') viste "[object Object]" / returnerte alltid 'go' fordi parser-output har p50/p90 = {monthly, yearly}-objekter, ikke tall. Begge ekstraherer nå .monthly med fallback for flate fixtures.
- renderLicense: PASS — ingen kode-endring. Capability-matrix-status korrekt utledet (met/partial/missing) via parseCapabilityMatrix. Visuell QA gjenstår i sesjon 6.
- renderCompare: FIX — firstWord-heuristikk feilet når begge subjekter delte førsteord (f.eks. "Azure AI Foundry" vs "Azure ML + AKS" ga begge fw='azure', kollapset vinn-attribusjon). Erstattet med distinctive-token-matching: full-subject-substring først, deretter ord som er unike for ett subjekt. Diff-cell coloring oppdatert til samme matchSubject()-helper.
- renderUtredning: MINOR — droppet misvisende role="tab"/role="tablist" siden vi rendrer anchor-jump-TOC (alle paneler synlige), ikke ekte tab-toggle. Beholdt aria-current="true" for visuell aktiv-markør (DS-CSS hekter på den). Ekte tab-toggle defer til v1.15.0.
validate-plugin.sh: 219 PASS uendret
run-e2e.sh --playground: 272 PASS uendret
test-playground-migrations.sh: 7 PASS uendret
Refs V1.14.0-AUDIT.local.md sub-batch E (sesjon 5b).
- renderMigrate: <section class="phase-detail"> per fase erstattet med
<div class="expansion">-list (DS-supplement). Default-collapsed, klikkbar
header (Fase N: navn + duration), body = milepaeler + suksesskriterier.
Behold cycle-ribbon + mat-ladder + phases-summary-tabell + risks-tabell.
- renderPoc: speil renderMigrate. Traffic-light flyttet inn i expansion-body
(ul.traffic-list per fase med status fra fasens stepState).
- renderSummary: KEY_STATS_CONFIG['verdict'] patchet — parseTable returnerer
rader med header-baserte nokler (Metric/Verdi/Mal) ikke canonical
{label,value,unit}. Ny logikk bruker metrics_headers + heuristikk-match for
label/value/unit-kolonner, med fallback til canonical felt.
Backward-kompatibelt.
- renderAdr: verifisert PASS — ingen endring (.adr-meta + critique-cards
rendrer pent uten ekstra arbeid).
- ACTIONS['phase-expand']: ny handler registrert som alias for
requirement-expand (samme toggle-monster, eget action-navn for senere
divergens).
- Lokal CSS: hele .phase-detail-blokken (~10 linjer) slettet. Defensive-
kommentar oppsummert til 5-linjers historie-notat.
- Style-blokk effektive linjer: 147 (var 178 etter sesjon 4).
Smoke-tester:
- validate-plugin.sh: 219 PASS
- run-e2e.sh --playground: 272 PASS (202 statisk + 70 parser)
- test-playground-migrations.sh: 7 PASS
Refs V1.14.0-AUDIT.local.md sub-batch D.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Bugfixes (B-DS-1, B-DS-2, B-DS-3 fra V1.14.0-AUDIT):
- .kanban-card__name (tier3-supplement): word-break: break-all → break-word
+ overflow-wrap: anywhere. Knekket midt i ord ("Tekn isk dokumen tasjon").
- .expansion__title-main, .expansion__title-sub (tier3-supplement): legg
til display: block. Begge er <span> som flyter inline by default —
resultat: "dokumentertKilde: Art. 9" på samme linje.
- .matrix__bubble (components.css): legg til cursor: pointer, hover-scale
og focus-visible. Antas rendret som <button> i konsumenter — gir
visuell + keyboard-fokus-feedback.
Re-syncet til plugins/ms-ai-architect/playground/vendor/ via
sync-design-system.mjs. Slettet 3 lokal-overrides i playground HTML
(matrix-bubble, expansion-title, kanban-card-name). Style-blokk:
191 → 182 linjer.
Smoke-tester: validate-plugin 219 PASS, e2e --playground 272 PASS,
statisk struktur 202 PASS.
Andre plugins (llm-security, voyage, okr, config-audit) påvirkes IKKE
— beholder gammel vendored DS inntil de selv re-syncer.
Sesjon 2 av 6 i v1.14.0 root-cause-multi-sesjons-løp.
ms-ai-architect plugin-versjon ikke bumpet (sesjon 6 ship-er v1.14.0).
[skip-docs]: docs oppdateres i sesjon 6 ved v1.14.0 plugin-ship.
Refs V1.14.0-AUDIT.local.md sub-batch 1 + 4.
10 visuelle bugs identifisert av maintainer i nettleser etter v1.13.0
shipped. Patch-pakke som adresserer mismatch mellom playground-rendrere
og DS-konvensjoner som v1.13.0 ikke fanget opp.
- B7: classify "Forpliktelser" indent — lokal .report-meta CSS-reset
(DL grid max-content+1fr, h4 uppercase+bold, ul padding-left space-5)
for konsistent venstre-justering uavhengig av nestelse.
- B8a: requirement-expand handler missing — renderRequirements markup
hadde data-action="requirement-expand" på hver expansion__head, men
ingen ACTIONS-handler var registrert. R-01..R-09-radene i AI Act-krav
var derfor ikke klikkbare. Fix: register ACTIONS['requirement-expand'].
- B8b: expansion title-main + title-sub kjørte sammen — DS' spans var
inline. Lokal display:block så de stables vertikalt.
- B10: kanban-card tegnknekking — DS' word-break:break-all knekker midt
i ord. Lokal override med break-word.
- B11: DPIA matrix-bobler ikke responderer — v1.13.0 click-handler
matchet kun mot første-kolonne i Trusler-tabellen. DPIA-fixturer har
full-tekst label i matrix_cells men T-001-id i threats-tabellen, så
ingen match. Utvid til (Pass 1) exact first-cell + (Pass 2) substring-
match mot enhver celle med 40-tegn-prefiks-toleranse.
- B12, B13, B15: defensive layout for top-risks/suppressed-panel/
phase-detail/aiact-timeline — eksplisitt display:block; clear:both;
width:100% mot grid-leak fra small-multiples/kanban-board/mat-ladder.
- B14: Migrate "skal vel være tabell" — phases-summary-tabell over
phase-detail-seksjonene (Fase, Varighet, Milepæler-count, Suksesskriterier-
count, Status). Samme tabell speilet i renderPoc for konsistens.
Verifisering:
- 23/23 smoke-test PASS (B7-B15 + 5 v1.13.0-regresjoner)
- 271/271 playground E2E PASS
- 219 plugin-validering PASS
- 42 KB-update PASS
Versjon: v1.13.0 -> v1.13.1 (plugin.json, README badge, README
version-history, CHANGELOG, ROADMAP, TODO, plugin CLAUDE.md
playground-header, root README plugin-list, root CLAUDE.md plugin-list).
Berører kun lokal CSS i <style>-blokk, ACTIONS-handler-registrering,
click-handler-utvidelse, og to renderer-funksjoner. Ingen modifisering
av playground/vendor/. Vendored DS' .kanban-card__name { word-break:
break-all } står — overstyres lokalt.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Fix-pakke som speiler llm-security v7.6.1 (commit f9b555a). Samme klasse
visuelle bugs identifisert via parallell DS-analyse av playground-rendrere.
- B1: renderFindingsBlock + renderRequirements bytter <div class="findings">
outer (DS grid 360px+1fr klemte indre struktur til 360px-kolonne, lot
1fr-detail-panel-kolonnen stå tom) til <section class="report-meta">.
BEM-strukturen findings__list > findings__group > findings__items uendret.
- B2: lokal .report-table CSS for 6+ rapporter (Trusler, Kostnadsoversikt,
TCO, Risiko-tabell, Key Metrics) som manglet styling — DS implementerer
ikke klassen. Speilet lokal styling fra llm-security v7.6.1.
- B3: ROS-matrise-bobler bytter <span> til <button type="button"
data-threat-id="..." aria-label="..."> med document-level click-handler
som scroller smooth til tilsvarende rad i Trusler-tabellen og
highlighter raden i 1.6 sek. Lokal CSS for cursor:pointer, hover
scale(1.15), :focus-visible outline.
- B4: renderRadarSvg bumpet 300x300 til 380x380, R fra 100 til 125,
label-offset fra R+25 til R+28, dynamisk text-anchor basert på
horisontal-posisjon for å unngå at bottom-labels overlapper hverandre
ved 6+ akser (typisk for ROS-rapport med 7 risiko-dimensjoner).
- B5: lokal .recommendation-card__body { overflow-wrap: anywhere;
word-break: break-word } for å forhindre at lange single-line tekster
(URLer, owner-tags, dato) skubber innhold ut av viewport i grid-cellen.
tests/test-playground-v3.sh: DS-klasse-assertion oppdatert fra .findings
til .findings__list (BEM-list er fortsatt i bruk; outer grid-container
bevisst fjernet i B1).
Verifisering:
- 22/22 smoke-test PASS (B1-B5 grep-asserts)
- 271/271 playground E2E PASS (201 statisk-struktur + 70 parser-fixtures)
- 219 plugin-validering PASS
- 42 KB-update test PASS
Versjon: v1.12.0 -> v1.13.0 (plugin.json, README badge, README
version-history, CHANGELOG, ROADMAP, TODO, plugin CLAUDE.md
playground-header, root README plugin-list, root CLAUDE.md plugin-list).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
CLAUDE.md OBLIGATORISK-regel: enhver feature-endring som pusher til
Forgejo MÅ oppdatere alle tre doc-nivåer i SAMME commit eller umiddelbart
etter. v7.6.1-fix-commit (f9b555a) bumpet kun versjons-badgen — denne
oppfølgings-commit-en lukker doc-gapet.
- plugins/llm-security/README.md: ny [7.6.1] history-tabell-rad
- plugins/llm-security/CLAUDE.md: header bumpet v7.6.0 → v7.6.1 +
ny v7.6.1-blurb (alle 6 fix-detaljer)
- README.md (rot): llm-security versjons-rad bumpet v7.6.0 → v7.6.1 +
v7.6.1 history-bullet over v7.6.0-bullet
Ingen kodeendringer.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Seks bugs fanget av maintainer ved manuell verifisering i nettleser etter
v7.6.0-release. Alle skyldes mismatch mellom DS-klasser og hvordan
playground-rendrere brukte dem, eller manglende DS-implementasjoner av
klasser playground-rendrere antok eksisterte.
Fixes:
- renderFindingsBlock brukte .findings outer-class som DS har som
2-kolonners grid (360px list + 1fr detail-panel) — headeren havnet
i venstre kolonne, items i høyre, brutt layout i alle 18 rapporter
med findings. Erstattet med .report-meta + h4 + findings__list >
findings__group + findings__group-header + findings__items
(korrekt DS-mønster, kun list-delen).
- .report-table manglet helt i DS men brukes i 7+ rendrere (OWASP,
Supply chain, Scanner Risk Matrix, Plugin-meta, Permission-matrise,
Live-meter, Siste runs, Godkjenninger, Mitigation roadmap). Lagt
lokal CSS-implementasjon i playground-HTML style-blokk: border-
collapse, zebra-hover, header-styling. Komplementerer DS-tokens
uten å modifisere vendor.
- renderPreDeploy traffic-lights brukte .sm-card__grade som er fast
28x28 px (én A-F-bokstav) — kuttet PASS til AS og PASS-WITH-NOTES
til PASS-WITH-... i alle traffic-light-cards. Erstattet med
bredde-tilpasset status-pill via inline styling (severity-soft +
on tokens).
- Threat-model matrix-bobler ikke klikkbare. Erstattet span med
button type=button data-threat-id + aria-label. Click-handler
scroller til tilsvarende rad i Trusler-tabellen og fremhever
den i 1.6 sek.
- Radar-labels overlappet ved 6+ akser fordi alle brukte
text-anchor=middle. Økt SVG-størrelse 280 → 380, radius 105 → 125.
Bytter text-anchor fra middle til start/end basert på horisontal-
posisjon.
- recommendation-card__body tekstoverflyt på lange single-line tekster
(vilkår, owner-tags, dato). Lagt overflow-wrap: anywhere;
word-break: break-word i lokal style-blokk.
Verifisering:
- 4/4 fix-spesifikke smoke-tester passerer
- 18/18 renderere produserer fortsatt komplett HTML mot
dft-komplett-demo (regresjons-test)
- Filendring playground.html 10677 → 10753 linjer (+76 netto)
Versjonsbump v7.6.0 → v7.6.1 (patch — bugfix-only, ingen scanner- eller
hook-atferdsendringer):
- plugins/llm-security/.claude-plugin/plugin.json
- plugins/llm-security/package.json
- plugins/llm-security/README.md (badge)
- plugins/llm-security/CHANGELOG.md ([7.6.1] entry)
- plugins/llm-security/playground/llm-security-playground.html (footer)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Fase 3: badge--scope-security som identitets-chip på alle prosjekt- og
rapport-cards (signal "denne er llm-security"). Plassert i topbar
(app-header__brand), fleet-tile-meta, command-subcard card__head,
catalog-card card__head, og onboarding form-progress autosave-blokk.
verdict-pill-lg (DS Tier 2 + Tier 3 supplement) erstatter custom
verdict-pill — nå med __verdict + valgfri __sub-struktur. renderPageShell
aksepterer opts.verdictSub som videresendes til renderVerdictPill.
Fase 4: Onboarding wizard bruker DS Tier 3 form-progress + fp-step med
data-state="done|in-progress|pending" og __num/__name — erstatter
playground-ens lokale form-progress__step-implementasjon. Steps wrappet
i form-progress__steps-container per DS-mønster. Aside har nå
form-progress__autosave-blokk med scope-badge og fullført-counter.
CSS-blokken som tidligere overstyrte DS for .verdict-pill og
.form-progress__heading/__step/__step-marker/--done er fjernet —
DS Tier 3 supplement vinner cascade-en.
Verifisering: verdict-pill-lg=20 (>=12), badge--scope-security=5 (>=5),
fp-step=12 (>=5), .verdict-pill\b i style-blokk=0, form-progress__step
strict singular=0 (3 naive treff er DS-canonical __steps-plural).
14 window-globaler intakt. JS parse OK, demo-state JSON OK,
HTML-balansert (3/3 script, 1/1 style).
Sesjon 2 av 5 i v7.6.0-pipeline. Foundation (sesjon 1) ga 9ef0c48.
Neste: Tier 3 spesialkomponenter del 1 (fase 5a-d) i sesjon 3.
Docs (plugin README/CLAUDE/rot-README/CHANGELOG) oppdateres i Sesjon 5
per master-plan; derav [skip-docs] her.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Slett ~50 duplikat-CSS-deklarasjoner fra playground-ens <style>-blokk
som overstyrte DS Tier 3 supplement uten gevinst (.app-shell, .tab-list,
.fleet-tile*, .form-progress, .eyebrow, .page__*, .key-stat*, .field-*,
.expansion (ekskl. body), .stack-*, .card*, .tracks*, .checkbox-row).
JS-fix: 4 modifier-strenger oppdatert fra forkortede ('crit', 'med')
til DS-konsistente fulle navn ('critical', 'medium') i renderKeyStatsGrid-data.
Konsekvens: DS vinner cascade-en, eliminerer subtile visuelle drift
mellom playground og referanse-scenarioer.
Page-shell harmonisering: alle 4 overflater (onboarding, home, catalog,
project) bruker nå DS page__header-klyngen via renderPageShell. Onboarding
konvertert fra custom <header class="onboarding-header"> til samme mønster.
renderPageShell utvidet med opts.meta (page__meta) og opts.hero
(page__header--hero modifier). Hero-mønster på home med
clamp(36px, 5vw, 56px) og letter-spacing -0.025em.
Behold til Sesjon 2: .verdict-pill (erstattes av verdict-pill-lg fase 3),
.form-progress__step* (erstattes av fp-step fase 4), .multi-select
(bevisst input-box-look), .expansion__body (markup-mismatch m/ DS-anim).
Forberedelse til v7.6.0 — Tier 3 referanse-case.
Mirror av ms-ai-architect playground-arkitektur, tilpasset llm-security:
- 4 overflater (onboarding/home/catalog/project) med surface-router
- IndexedDB persistens (llm-security-playground-v1) + localStorage fallback
- Theme-bootstrap med FOUC-prevention og localStorage-persist
- 20 kommandoer i CATALOG (5 kategorier: discover/posture/findings-ops/
hardening/adversarial/mcp-ops) med full input_fields + report_archetype
- 5-gruppers onboarding (organisasjon/scope/profil/plattform/compliance)
med form-progress sidebar
- Home: 3 tracks + fleet-grid prosjektliste + tom-state med demo-data
- Katalog: ekspanderbare grupper med live-søk og forhåndsvisning
- Prosjekt-stub: 4 screen-tabs + 6 kategori-tabs + per-kommando
skjema/paste-import/rapport-soner
- Demo-state: Direktoratet for digital tjenesteutvikling med 2 prosjekter
- Eksport/import (JSON envelope), action-handlers (35), modal-portal
PARSERS + RENDERERS er tomme routing-objekter — fylles i Fase 2 (10 høy-prio
kommandoer) og Fase 3 (resterende 10). Paste-import viser «parser ikke
implementert»-guide-panel for kommandoer uten parser, og lagrer rå markdown
i state for fremtidig parsing.
Vendor: 27 filer synket fra shared/playground-design-system/
(MANIFEST.json sjekksum-låst, source_commit 487f7ae).
Verifisert: node --check OK (2737 linjer, 113733 char inline JS),
HTML-tag-balanse OK. Manuell smoke-test gjenstår.
Docs (plugin README, CLAUDE.md, rot-README) bumpes ved Fase 3-fullføring
sammen med plugin.json v7.5.0. Derfor [skip-docs] her.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The ultra-cc-architect plugin was removed from the marketplace; voyage's
architecture-discovery contract still pointed at it by name. Replaced
verbatim references with plugin-agnostic phrasing ("upstream architect
producer") in code comments and user-facing warning messages.
CHANGELOG entries and config-audit v5.0.0 snapshots intentionally
preserved as historical records.
Bumps from v7.3.1 to v7.4.0. Purely additive surface — no scanner
or hook behavior changes, no breaking changes.
Headline content (already merged on main since v7.3.1):
- examples/ utvidelse — seven runnable demonstration walkthroughs
shipped over three sessions (sesjon 1 pre-existing
prompt-injection-showcase + lethal-trifecta-walkthrough,
mcp-rug-pull, supply-chain-attack, poisoned-claude-md,
bash-evasion-gallery, toxic-agent-demo, pre-compact-poisoning).
Each is self-contained: README + fixture + run-script +
expected-findings testable contract. State-isolation pattern
(PID-suffixed JSONL or env-overrides like
LLM_SECURITY_MCP_CACHE_FILE) keeps the user's real cache and
/tmp state untouched.
- tests/e2e/ — three new suites totalling 45 tests:
attack-chain.test.mjs (17), multi-session.test.mjs (9),
scan-pipeline.test.mjs (19). Test count 1777 to 1822. These
exercise the framework as a coordinated system rather than as
isolated unit-tests.
Version sync (8 files):
- package.json
- .claude-plugin/plugin.json
- CLAUDE.md (header)
- README.md (badge + Recent versions tabellen new row)
- CHANGELOG.md (Unreleased to [7.4.0] - 2026-05-05 with summary)
- scanners/dashboard-aggregator.mjs VERSION constant
- scanners/ide-extension-scanner.mjs VERSION constant
- scanners/posture-scanner.mjs VERSION constant
Stabilization-stance unchanged. v8.0.0 remains the planned
deprecation-cleanup release. v7.x continues as the stable line.
Tests: 1822/1822 grønne lokalt etter bump.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Runnable demonstration of hooks/scripts/pre-compact-scan.mjs (the
only PreCompact hook in the plugin) detecting both a CRITICAL
injection pattern and an AWS-shaped credential inside a synthetic
JSONL transcript, exercised across all three values of
LLM_SECURITY_PRECOMPACT_MODE plus a benign-transcript control case
in block mode that proves the gate is not a brick wall.
The transcript is generated at runtime in a per-invocation tempdir
under os.tmpdir() and the directory is removed in a finally block,
so the user's real ~/.claude/projects/.../transcripts/ are never
touched. The AWS-shaped key uses the same 'AK' + 'IA' + ...
fragmentation idiom as tests/e2e/attack-chain.test.mjs so this
source contains no literal credentials and pre-edit-secrets does
not block writes during development.
Nine independent assertions (9/9 must pass):
- block mode + poisoned: exit 2, decision=block JSON, reason text
covers both injection and AWS labels (3 assertions)
- warn mode + poisoned: exit 0, systemMessage JSON, no decision
field (2 assertions)
- off mode + poisoned: exit 0, no JSON on stdout (2 assertions)
- block mode + benign: exit 0, no decision=block JSON (2 assertions)
OWASP / framework mapping: LLM01, LLM02, ASI01, AT-1, AT-3.
Docs updated: plugin README "Other runnable examples", plugin
CLAUDE.md "Examples" tabellen, CHANGELOG [Unreleased] Added.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Single-component lethal-trifecta walkthrough that drives
scanners/toxic-flow-analyzer.mjs against a deliberately
misconfigured fixture plugin. The fixture agent declares
tools: [Bash, Read, WebFetch], which alone covers all three
trifecta legs (input surface + data access + exfil sink). No
hooks/hooks.json is shipped, so TFA's mitigation logic finds
no active guards and emits a CRITICAL "Lethal trifecta:"
finding without downgrade.
Plugin marker is plugin.fixture.json (recognised by isPlugin())
rather than .claude-plugin/plugin.json — the latter is blocked
by the plugin's own pre-write-pathguard hook, and
plugin.fixture.json exists in isPlugin() specifically so
example fixtures can self-mark without touching guarded paths.
Three independent assertions (3/3 must pass): direct trifecta
present and CRITICAL; finding mentions the exfil-helper
component; description confirms "no hook guards detected"
(proves the mitigation path stayed inactive). expected-findings.md
documents the contract.
OWASP / framework mapping: ASI01, ASI02, ASI05, LLM01, LLM02, LLM06.
Docs updated: plugin README "Other runnable examples", plugin
CLAUDE.md "Examples" tabellen, CHANGELOG [Unreleased] Added.
[skip-docs] is appropriate because examples don't change what
the plugin "synes å dekke utad" — marketplace root README is
unaffected.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Three new self-contained, runnable threat demonstrations under
examples/, continuing the batch started in 583a78c. Each example
has README.md + run-*.mjs + expected-findings.md and uses
state-isolation discipline so the user's real cache/state files
are never polluted.
- examples/supply-chain-attack/ — two-layer demonstration:
pre-install-supply-chain (PreToolUse) blocks compromised
event-stream version 3.3.6 and emits a scope-hop advisory for
the @evilcorp scope; dep-auditor (DEP scanner, offline) flags
5 typosquat dependencies plus a curl-piped install-script
vector in the fixture package.json. Maps to LLM03/LLM05/ASI04.
- examples/poisoned-claude-md/ — all 6 memory-poisoning detectors
fire on a deliberately poisoned CLAUDE.md plus a fixture
agent file under .claude/agents (E15/v7.2.0 surface):
detectInjection, detectShellCommands, detectSuspiciousUrls,
detectCredentialPaths, detectPermissionExpansion,
detectEncodedPayloads. No agent runtime needed — scanner
imported directly. Maps to LLM01/LLM06/ASI04.
- examples/bash-evasion-gallery/ — one disguised variant per
T1 through T9 evasion technique fed through pre-bash-destructive,
verified BLOCK after bash-normalize strips the evasion. T8
base64-pipe-shell uses its own BLOCK_RULE. The canonical
destructive form uses a path token rather than the bare slash
(regex word-boundary requires it). Source-string fragmentation
pattern reused from the e2e attack-chain test. Maps to
LLM06/ASI01/LLM01.
Plugin README "Other runnable examples" section + plugin
CLAUDE.md "Examples" table + CHANGELOG Unreleased/Added
all updated. Marketplace root README unchanged
([skip-docs] for marketplace-level gate — plugin's outward
coverage is unchanged, only demonstrations were added).
Companion to 8df5d5c (which only carried the doc updates — the example
directories themselves were left out of staging by mistake). This
commit adds the actual example mappes:
- examples/lethal-trifecta-walkthrough/{README.md, run-trifecta.mjs,
expected-findings.md}
- examples/mcp-rug-pull/{README.md, run-rug-pull.mjs,
expected-findings.md}
Plus plugin CLAUDE.md "Examples (runnable demonstrations)" section
with a 4-row table covering malicious-skill-demo, prompt-injection-
showcase, lethal-trifecta-walkthrough, and mcp-rug-pull plus the
state-isolation discipline notes.
Marketplace root README unchanged since plugin's outward coverage
is unchanged ([skip-docs] covers the marketplace-level gate).
Two new self-contained, runnable threat demonstrations under examples/:
- lethal-trifecta-walkthrough/ — feeds 5 hook calls (WebFetch, Read .env,
Bash curl POST + suppression follow-ups) into post-session-guard and
verifies the Rule-of-Two advisory fires exactly on leg 3. State
isolated via run-script PID so /tmp/llm-security-session-*.jsonl is
not polluted. Treffer post-session-guard, ASI01/ASI02, LLM01/LLM02.
- mcp-rug-pull/ — mutates an MCP tool description across 8 stages.
Each per-update <10% Levenshtein, cumulative reaches 32.2% by stage
7 — proves the v7.3.0 (E14) mcp-cumulative-drift MEDIUM advisory
catches slow-burn rug-pulls that the per-update detection would
miss. Uses LLM_SECURITY_MCP_CACHE_FILE to isolate cache. Treffer
post-mcp-verify, mcp-description-cache.mjs, OWASP MCP05/LLM03/ASI04.
Each example: README.md + run-*.mjs + expected-findings.md.
Plugin README "Other runnable examples" section + CHANGELOG
[Unreleased] Added bullets + plugin CLAUDE.md "Examples" section
all updated in this commit. Marketplace root README unchanged
since plugin's outward coverage is unchanged ([skip-docs]
covers the marketplace-level gate).
Synthetic ROS-analyse output for "Acme Kunde-chatbot" (Acme Kommune)
following the same pattern as security-assessment, cost-estimation,
ai-act and summary fixtures. Satisfies all 29 assertions in
tests/test-ros-output.sh:
- 8 phases (Fase 1-8) plus Ledelsessammendrag
- 12 trusler i T-XXX-NN format (MAESTRO + OWASP-mapping)
- 9 risikoer i R-N format
- 10 tiltak i M-N format
- 7 ROS-dimensjoner med X/5-scoring
- 5x5 risikomatrise + restrisiko-tabell
- NS 5814 + ISO 31000 metodikk-referanser
- AI Act, GDPR, OWASP regulatoriske referanser
- MAESTRO + supply-chain referanser (Vedlegg O coverage)
Tar bort den siste pre-eksisterende run-e2e-feilen
(`bash tests/run-e2e.sh` exits 0).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Three new files in tests/e2e/ (45 tests, 1777 -> 1822):
- attack-chain.test.mjs (17): full hook stack against attack payloads in
sequence -- prompt injection at the gate; T1/T5/T8 bash evasions;
pathguard on .env / .ssh; secrets hook on AWS-shaped keys and PEM
headers; markdown link-title and HTML-comment poisoning in tool
output; trifecta accumulation over a single session with dedup on
the next benign call.
- multi-session.test.mjs (9): state persistence across simulated
session boundaries. Uses the fact that a hook child's process.ppid
equals the test runner's process.pid, so writing the session state
file directly simulates "previous session" history. Covers slow-burn
trifecta (legs spread >50 calls), MCP cumulative description drift
via LLM_SECURITY_MCP_CACHE_FILE override, and pre-compact transcript
poisoning in warn / block / clean / missing-file modes.
- scan-pipeline.test.mjs (19): scan-orchestrator + all 10 scanners +
toxic-flow correlator against poisoned-project (BLOCK / 95 / Extreme)
and grade-a-project (WARNING / 48 / High). Asserts envelope shape,
verdict, risk_score, severity counts, OWASP coverage, scanner
enumeration, and a narrative-coherence cross-check that the BLOCK
scan strictly outranks the WARNING scan along every axis.
Test files build credential-shaped payloads at runtime via concatenation
so they contain no literal matches for the pre-edit-secrets regexes
(memory rule feedback_secrets_hook_test_fixtures.md).
Doc updates in same commit per marketplace policy:
- CLAUDE.md header: 1777+ -> 1822+ tests, mentions tests/e2e/
- README.md badge tests-1777 -> tests-1822, body text updated
- CHANGELOG.md: new [Unreleased] Added section describing scope
No version bump. No behavior changes outside tests/.
ToS-vurdering konkluderte med at autonom cron-kjøring er unødvendig kompleks
for en solo-fork-and-own-plugin. Apply-fasen krever LLM-resonnering uansett,
så manuell trigger fra en aktiv Claude Code-sesjon er enklere og holder
pluginen klart innenfor Anthropic Consumer Terms paragraf 3 (automated access
only via API key or where explicitly permitted — Claude Code CLI er
eksemptert som offisielt verktøy).
Lagt til:
- commands/kb-update.md — ny /architect:kb-update slash-kommando som driver
poll, endringsrapport, microsoft_docs_fetch-update og commit fra sesjonen.
Argumenter: --skip-discover, --priorities, --dry-run, --single-commit
- Catalog-entry i playground HTML for kb-update (categori: tool, 4 input-felt)
Slettet (Wave 3-5 reversert, ~1500 linjer + 7 testmoduler):
- scripts/install-kb-cron.mjs (cross-OS scheduler-installer)
- scripts/kb-update/weekly-kb-cron.mjs (cron-orkestrator med pre-flight, lock,
backup, claude -p subprocess, post-run verify, rollback)
- scripts/kb-update/templates/ (4 scheduler-templates: launchd plist, systemd
service+timer, Windows ps1 + README)
- scripts/kb-update/lib/auth-mode.mjs (cron-spesifikk auth validation)
- scripts/kb-update/lib/lock-file.mjs (PID+mtime stale-detection)
- scripts/kb-update/lib/cost-estimat.mjs (pre-flight budget-cap)
- 7 testmoduler under tests/kb-update/ for slettet kode
- tests/test-kb-update.sh (Bash-3.2-shim, erstattet av direkte node --test)
Beholdt (utility-laget fortsatt brukbart):
- run-weekly-update.mjs, report-changes.mjs, build-registry.mjs,
discover-new-urls.mjs (KB change-detection-pipelinen)
- lib/atomic-write, lib/backup, lib/cross-platform-paths, lib/log-rotate
- 4 testmoduler (42/42 tester PASS)
Endret:
- hooks/scripts/session-start-context.mjs: fjern kb-update-status.json-overvaaking
- tests/run-e2e.sh --kb-update kaller node --test direkte i stedet for shim
- README.md, CLAUDE.md: KB-vedlikehold-seksjon rewriter for manuell modell
- plugin.json: 1.11.0 -> 1.12.0
- Rot README + CLAUDE.md: ms-ai-architect-versjon bumpet
Schedulering er bevisst utenfor scope og overlatt til brukeren — eventuelle
forks som vil ha periodisk varsling kan sette opp egen cron / launchd /
GitHub Actions som kjører rapport-fasen og varsler om aa kjore
/architect:kb-update i CC-sesjon.
Verifisering:
- bash tests/validate-plugin.sh: 219 PASS, 0 FAIL
- bash tests/run-e2e.sh --kb-update: 42/42 inner + suite PASS
- bash tests/run-e2e.sh --playground: 271/271 PASS (statisk + parsers)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Step 12 — adds --kb-update flag to tests/run-e2e.sh and a Bash 3.2-compatible
shim test-kb-update.sh that runs `node --test tests/kb-update/*.test.mjs`
(shell-glob form; Node 25 rejects directory-form arguments). Shim translates
node --test exit code + parsed pass/fail counts into the e2e-helpers.sh
suite counters (init_suite/print_summary).
Verification:
- Playground baseline 271 PASS unchanged before/after edit
- bash tests/run-e2e.sh --kb-update: exits 0, 110/110 inner tests pass
- bash tests/run-e2e.sh --all: kb-update suite included
- Pre-existing ROS-fixture absence (tests/fixtures/ros-analysis/) is
unrelated to this change and remains for separate handling
Wave 5 of 7 in v1.12.0 auto-KB-update plan.
Plan: .claude/projects/2026-05-04-kb-update-fork-and-own/plan.md (Step 12)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Step 11 of v1.12.0 plan (.claude/projects/2026-05-04-kb-update-fork-and-own/plan.md).
scripts/install-kb-cron.mjs lives at the scripts/ root (not inside
scripts/kb-update/) because it is a plugin-wide install tool, not part of
the KB-update pipeline itself. Reads the appropriate template from
scripts/kb-update/templates/, fills {{NODE_BIN}}, {{PLUGIN_ROOT}},
{{LOG_FILE}}, {{SCHEDULE_HOUR/MINUTE/DAY_OF_WEEK}} placeholders, writes
to the platform-specific scheduler dir, and registers the job:
macOS - launchctl bootstrap gui/<uid> <plist> (load -w fallback)
Linux - systemctl --user daemon-reload && enable --now <timer>
Windows - powershell -ExecutionPolicy Bypass -File <ps1> (beta)
Flags: --print-only, --target macos|linux|windows, --uninstall, --purge,
--node-bin, --claude-bin, --schedule "M H * * D" (default: Wed 04:23).
UID resolution for launchctl is guarded by process.getuid() POSIX-only
(undefined on Windows). MCP server presence in ~/.claude.json is
warning-only per brief Spørsmål 7. WSL detected via /proc/version.
Cross-OS rendering supported via --print-only --target <other>; install
on a non-host target rejects with explicit error.
11 subprocess + filesystem-snapshot tests in
tests/kb-update/test-install-cron.test.mjs verify --print-only produces
filled templates with no unsubstituted {{...}} placeholders, --print-only
writes nothing under HOME, --uninstall is idempotent on an empty HOME,
--schedule substitutes correctly, and invalid flags reject with non-zero
exit. Tests never invoke launchctl/systemctl/Register-ScheduledTask
against real schedulers.
Tests: 110/110 pass (was 99 before this step).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
D5 — final session of post-v3.4.1 stabilisering. Repo prepared for the
upcoming voyage-rebrand (v4.0.0 hard cut: ultraplan-local → voyage,
/ultra*-local → /trek*).
Tracked changes:
- README.md: cut #9 jargon — '### Self-verifying plan chain' →
'### Manifest-verified steps' with body rewritten to drop the
'objective completion predicate' jargon.
- package.json: removed 'simulate' script that pointed to
tests/simulator/run-pipeline.mjs (file never existed; D3 was
dropped before that work shipped).
- .claude-plugin/marketplace.json: ultraplan-local description
updated from 'Four-command pipeline' to the current six-command
shape with Handover 6 + multi-session resumption (matches
plugin.json).
- docs/_archive-ultra-suite-brief_2.md: deleted (tracked planning-doc
unrelated to ultraplan-local; 117 lines, no inbound references).
Untracked cleanup (not in commit, gitignored):
- 4 stale plugin-root .local.md (NEXT-SESSION-PROMPT.archived,
PLAN-v2.1-phase3, V3.0-MULTI-SESSION-PLAN, etc.)
- 3 docs/ planning .local.md (ultracontinue-brief, ultracontinue-design-notes,
ultraexecute-v2-observations)
- examples/01-add-verbose-flag/perf-baseline.local.md
- .claude/plans/ultraplan-2026-04-17-logger.md
- 9 closed sub-projects under .claude/projects/ (skill-factory,
ultracontinue, ultrareview-local, ultra-pipeline-speedup,
examples-02-real-cli, post-v3.4.0-roadmap, spor-c-q3-cache,
v3.3.1-ultracontinue-fixes)
Cuts #7 (template-duplisering) + #10 (Two kinds of briefs) reviewed
and judged not needed: README has 38 code-fences vs CLAUDE.md 2 (no
overlap), and 'Two kinds of briefs' is already a direct task-vs-
research-brief explanation, not jargon.
D3 + D4 droppet 2026-05-05 — voyage-rebrand renames all ultra*
references; new test infrastructure built against the old names
would need to be renamed in the same pass. Memory pin:
feedback_cleanup_vs_new_code.md.
Tests: 361 / 0 (unchanged — no test changes).
Stabilisering close-out: complete. Repo is ready for voyage-rebrand.