Commit graph

277 commits

Author SHA1 Message Date
Kjell Tore Guttormsen
65087e624f feat(config-audit): cache-prefix stability scanner CPS (v5 N3) [skip-docs]
New CPS scanner walks CLAUDE.md cascade and flags volatile content
between lines 31 and 150 — the cache-prefix window beyond TOK Pattern
A's top-30 territory. Volatile content anywhere in the cached prefix
forces a fresh cache write from that line down on every turn.

Volatile-pattern set extends TOK Pattern A with:
- shell-exec lines (! prefix) — common in CLAUDE.md to inject git/date
- ${VAR} substitutions — vary per-shell, defeat cache reuse

Severity: medium per finding. Skips lines 1-30 to avoid duplicating
Pattern A's range; CPS' value is in the 31-150 zone.

Wired into scan-orchestrator + scoring SCANNER_AREA_MAP. CPS shares
the "Token Efficiency" area with TOK; scoreByArea now deduplicates by
area name and combines counts across scanners contributing to the
same area, so the 9-area scorecard contract holds.

Fixtures volatile-mid-section/{volatile-line-60, volatile-line-200}
verify both positive (line 60) and out-of-window (line 200) cases.

[skip-docs] reason: v5 plan fences off README/CLAUDE.md badge updates
to Session 5; Forgejo pre-commit-docs-gate hook requires this tag.

Tests: 604 → 611 (+7).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-01 07:37:54 +02:00
Kjell Tore Guttormsen
0420b8cc4a feat(config-audit): /config-audit manifest command (v5 N2) [skip-docs]
New scanners/manifest.mjs CLI + commands/manifest.md slash command.
Reads activeConfig and produces a flat, ranked list of every token
source (CLAUDE.md cascade entries, plugins, skills, MCP servers, hooks)
sorted DESC by estimated_tokens.

CLAUDE.md per-file tokens are derived by distributing
claudeMd.estimatedTokens across the cascade proportional to bytes.

Tests cover both real-config (plugin root) and fixture (rich-repo with
patched HOME containing 2 plugins + 3 skills + .mcp.json) paths, plus
error handling (nonexistent path → exit 3, --output-file).

Builds on readActiveConfig from M1 (v5 alpha.2).

[skip-docs] reason: v5 plan fences off README/CLAUDE.md badge updates
to Session 5; Forgejo pre-commit-docs-gate hook requires this tag on
feat commits without doc changes.

Tests: 593 → 604 (+11).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-01 07:32:54 +02:00
Kjell Tore Guttormsen
b2407a09b3 feat(config-audit): CA-TOK-005 MCP tool-schema budget (v5 N1) [skip-docs]
Adds detectMcpToolBudget detection block in TOK scanner. Tiered severity
per project-local .mcp.json server based on toolCount:
- < 20: no finding
- 20-49: low
- 50-99: medium
- 100+: high
- null (manifest unparseable): low + "tool count unknown" message

Scoped to source==='.mcp.json' to keep findings actionable for the
audited path; plugin/user-level MCP servers are surfaced by the
manifest scanner (Step 19 / N2).

5 fixtures (mcp-budget/{14,25,60,120,unknown}-tools) use inline `tools`
arrays in .mcp.json — no node_modules needed for these tests.

Tests assert title+severity (not exact ID) since TOK IDs are sequential
per scan, not semantic per pattern.

[skip-docs] reason: v5 plan fences off README/CLAUDE.md badge updates
to Session 5; Forgejo pre-commit-docs-gate hook requires this tag on
feat commits without doc changes.

Tests: 586 → 593 (+7).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-01 07:29:57 +02:00
Kjell Tore Guttormsen
dd0d4bf738 docs(config-audit): v5 implementation log — Session 2 alpha.2 result 2026-05-01 07:14:18 +02:00
Kjell Tore Guttormsen
55cedbea2c docs(config-audit): CHANGELOG 5.0.0-alpha.2 entry 2026-05-01 07:10:52 +02:00
Kjell Tore Guttormsen
3c79f95e9a feat(config-audit): self-audit --check-readme flag (v5 F6) [skip-docs]
Filesystem counts are the source of truth; README badges parsed via
line-anchored substring (badge/<kind>-<N>-...). Emits readmeCheck object
with counts/badges/mismatches.

CLI: node scanners/self-audit.mjs --check-readme [--json]
API: runSelfAudit({ checkReadme: true }) → result.readmeCheck
Helper: checkReadmeBadges(pluginDir) for per-fixture testing

New fixture: readme-desynced/ (commands/foo + bar, README claims 1).

Note: alpha phase does NOT require result.readmeCheck.passed === true.
Self-test of real plugin currently fails (scanners 10 vs 9, tests 31 vs 543);
will be reconciled in Session 5 Step 28 (README sync).

582 → 586 tests, all green.
2026-05-01 07:09:26 +02:00
Kjell Tore Guttormsen
910567d661 feat(config-audit): HKV flags verbose hook output (v5 M5) [skip-docs]
Static heuristic — counts console.log / process.stdout.write lines per
referenced hook script. > 50 → low CA-HKV-NNN finding.

New fixtures:
- hooks-verbose/ (61 verbose lines → triggers)
- hooks-quiet/ (5 lines → no finding)

580 → 582 tests, all green.
2026-05-01 07:05:45 +02:00
Kjell Tore Guttormsen
7181862644 chore(config-audit): allow fake node_modules in tests/fixtures (v5 M1) [skip-docs]
The mcp-tool-heavy fixture relies on node_modules/mcp-heavy/package.json
being committed so the v5 M1 tool-count detection test runs deterministically.
Add an unignore rule for tests/fixtures/**/node_modules/.
2026-05-01 07:02:54 +02:00
Kjell Tore Guttormsen
1422daf895 feat(config-audit): MCP tool-count detection with manifest fallback (v5 M1) [skip-docs]
readActiveMcpServers now resolves tool count via:
  1. In-config tools array
  2. Cached tools/list at \$HOME/.claude/config-audit/mcp-cache/<name>.json
  3. node_modules/<pkg>/package.json (resolved from npx <pkg>)
  4. Fallback: { toolCount: null, toolCountUnknown: true }

estimateTokens uses detected toolCount (heavy server > light server).

New fixture: mcp-tool-heavy/ with mocked node_modules/mcp-heavy/package.json (20 tools).

576 → 580 tests, all green.
2026-05-01 07:02:08 +02:00
Kjell Tore Guttormsen
9a44df22ac feat(config-audit): TOK flags skill description > 500 chars (v5 M2) [skip-docs]
- New Pattern F in TOK: low-severity finding when SKILL.md description > 500 chars
- Scoped to discovery.files (project-local) — activeConfig.skills walk would
  pull in user/plugin skills out of project scope
- New fixtures: skill-bloated (594-char desc) + skill-tight (46-char baseline)

574 → 576 tests, all green.
2026-05-01 06:58:42 +02:00
Kjell Tore Guttormsen
25ca6139b4 feat(config-audit): TOK flags CLAUDE.md cascade > 10k tokens (v5 M4) [skip-docs]
- New Pattern E in TOK: emits medium finding when activeConfig.claudeMd.estimatedTokens > 10_000
- Uses cascade tokens, file count, and calibration note as evidence
- New fixtures: large-cascade (37k bytes / 14475 cascade tokens) + small-cascade (5k baseline)

572 → 574 tests, all green.
2026-05-01 06:53:12 +02:00
Kjell Tore Guttormsen
9330124f5c feat(config-audit): flag additionalDirectories > 2 (v5 M6) [skip-docs]
- Add 'additionalDirectories' to KNOWN_KEYS
- Emit low severity finding when length > 2
- New fixtures: additional-dirs-many (3 entries) + additional-dirs-ok (2)

569 → 572 tests, all green.
2026-05-01 06:50:24 +02:00
Kjell Tore Guttormsen
58d6b5b9ea feat(config-audit): recalibrate TOK severities for tokens/turn (v5 F7) [skip-docs]
- Pattern A (cache-breaking volatile top): medium → high
- Pattern B (redundant permissions): low → medium
- Pattern C (deep @import chain): medium → low
- Add calibration_note evidence on every TOK finding
- Table-driven severity tests (identify by title, IDs are sequential)

563 → 569 tests, all green. Doc sweep deferred to Session 5 (Step 28).
2026-05-01 06:47:32 +02:00
Kjell Tore Guttormsen
5df8e8888e docs(ultraplan-local): trim README — outcomes section + remove duplication
Add "What you get" with Solo / Team / Virksomhet profiles + honest
"What it doesn't solve" list. Lets adoption decisions land before
command details.

Cuts (deduplication and historical noise):
- Architecture file-tree (~50 lines) → terse top-level layout +
  pointer to CONTRIBUTING.md. Original was stale (missing lib/,
  wrong hook-count, wrong plugin.json version note)
- "How it compares" matrix (~17 lines) — sales-coded comparison
  vs cloud tools, doesn't help adoption decisions
- Per-command "How it works" 8-step prose (~50 lines across 4
  commands) → 2-3 sentence summaries
- Exploration / Review / Research agent tables (~40 lines) →
  one-liner pointers to agents/ directory (already self-documenting)
- v1.x and v2.x migration sections (~30 lines) → pointer to
  CHANGELOG.md / MIGRATION.md
- v3.0.0, v2.4.0 historical callouts (~9 lines) — CHANGELOG owns
  these
- Cost profile bullet-list (~13 lines) → one paragraph

Net: 770 → 609 lines (-21%). Tests green.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-01 06:44:44 +02:00
Kjell Tore Guttormsen
3aba15c566 docs(config-audit): v5 implementation log — Session 1 alpha.1 result
Per-step result table for Steps 1-9 + 8b with commit SHAs and notable
deviations (Step 6 baseline switch to sonnet-era, Step 8 surprise on
sonnet-era discovery scope, PathGuard hook false positive on test
fixtures). 543 → 563 tests, all green, no blockers carried forward.
2026-05-01 06:37:08 +02:00
Kjell Tore Guttormsen
919bd213f8 docs(config-audit): CHANGELOG 5.0.0-alpha.1 entry
Summarizes F1-F5 scope: TOK ↔ readActiveConfig integration, 'mcp' kind
in estimateTokens (15 → ≥500), severity-weighted scoreByArea, dead-code
removal in TOK hotspots, Pattern D / CA-TOK-004 removal.

Includes migration notes for downstream consumers (CA-TOK-* globs still
suppress 001-003; scoringVersion field added for v4→v5 detection).
2026-05-01 06:34:06 +02:00
Kjell Tore Guttormsen
08a9ead51a docs(config-audit): remove CA-TOK-004 references after F5 (v5)
knowledge/opus-4.7-patterns.md:
- Pattern 4 row removed from the catalogue table
- "Pattern 4 (sonnet-era)" detection note removed
- Threshold-calibration note no longer mentions pattern 4
- Added a short pointer explaining the v5 F5 removal

commands/tokens.md:
- "CA-TOK-001..004" → "CA-TOK-001..003" in two places
2026-05-01 06:33:01 +02:00
Kjell Tore Guttormsen
2810ee6f62 feat(config-audit): remove TOK Pattern D detectSonnetEra (v5 F5)
Pattern D was the v4 sonnet-era signature: 'config is structurally
clean but uses no Opus-4.7-specific features'. Two problems:
- It triggered on any minimal config that happened to lack skills/MCP
- The advice was generic and not actionable

The hotspots ranking and per-pattern findings (A/B/C) cover the same
ground with concrete, file-anchored signal. Dropping the noise.

BREAKING (intentional): scanners no longer emit the sonnet-era info
finding. Suppression entries and downstream tooling that reference
the v4 finding ID should be updated. Doc sweep follows in Step 8b.

Tests: sonnet-era fixture now asserts zero findings.
2026-05-01 06:31:43 +02:00
Kjell Tore Guttormsen
1486368a2b chore(release): ultraplan-local v3.1.0
Quality program release. Spor 0+1+2+3 all delivered.

- 109 zero-dep tests gate fork-readiness
- 5 validators wired into 4 commands as CLI shims
- HANDOVER-CONTRACTS.md: single source of truth for 5 pipeline handovers
- PreCompact-hook (P0) closes progress.json drift; --resume now works
- Semantic plan-critic catches paraphrased deferred decisions
- examples/01-add-verbose-flag/: hand-calibrated end-to-end pipeline demo
- 4 hooks total (pre-bash, pre-write, session-title, post-bash-stats, pre-compact-flush)
- SECURITY.md + Extending-the-plugin docs

CC v2.1.x feature adoption: F8 (MCP_CONNECTION_NONBLOCKING),
F9 (sessionTitle), F3 (duration_ms), F12 (disableSkillShellExecution).
F2 (hook 'if'-field) deferred — universal protection wins.

Pre-flight verification:
- npm test → 109 pass
- plan-validator --strict templates/plan-template.md → READY
- plan-validator --strict tests/fixtures/plan-fase-narrative.md → FAIL (expected)
- grep smallCodebase|mediumCodebase|largeCodebase → 0 hits

Version bumped: package.json, plugin.json, README badge, root README,
root CLAUDE.md.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-01 06:31:42 +02:00
Kjell Tore Guttormsen
0d8a9af3d6 fix(config-audit): remove TOK dead take + hotspot padding (v5 F4)
The buildHotspots padding loop and unused 'take' variable were dead
code from the v3 hotspots-min contract. Replaced with a clean
ranked.slice(0, HOTSPOTS_MAX). Tiny fixtures may now return fewer
than 3 hotspots, which is the honest answer; the contract now only
asserts <= 10.

Tests: +2 cases — every hotspot.source is unique (no padding); length
never exceeds HOTSPOTS_MAX.
2026-05-01 06:29:33 +02:00
Kjell Tore Guttormsen
9ecd225018 feat(ultraplan-local): Spor 3 — semantic plan-critic, examples, CC features, security docs
- agents/plan-critic.md: rule #7 split into literal blockers (TBD/TODO/FIXME)
  + semantic rubric with 8 deferred-decision tests; calibrated against the
  5-phrase corpus from the v3.1.0 quality brief
- hooks/hooks.json: rebuilt from corrupted state; valid JSON, registers
  PreToolUse(Bash,Write), UserPromptSubmit, PostToolUse(Bash), PreCompact
- hooks/scripts/session-title.mjs: NEW — sets ultra:<cmd>:<slug> session
  title for ultra commands (CC v2.1.94+)
- hooks/scripts/post-bash-stats.mjs: NEW — appends duration_ms per Bash
  call to ultraexecute-stats.jsonl (CC v2.1.97+)
- SECURITY.md: NEW — Forgejo private-issue reporting, supported = current
  minor only, scope = 4 hooks + denylist, hardening recommendations
- docs/architect-bridge-test.md: NEW — manual smoke checklist for the
  ultraplan ↔ ultra-cc-architect bridge
- examples/01-add-verbose-flag/: NEW — calibrated end-to-end (brief +
  research + plan + progress.json) for fork-er onramp; all four artifacts
  pass their validators
- README.md: + Extending the plugin, + Headless multi-session tuning
  (MCP_CONNECTION_NONBLOCKING), + Session titles, + Per-step timing,
  + disableSkillShellExecution recommendation
- CLAUDE.md: documents session-title.mjs and post-bash-stats.mjs
- root README.md: v3.1.0 entry expanded with Spor 2+3 deliverables

CC features adopted: F8, F9, F12 implemented; F3 implemented as Bash
PostToolUse logger; F2 (hook 'if'-field scoping) deferred — universal
protection beats reduced-scope protection for blocked commands.

Tests: 109/109 green.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-01 06:28:44 +02:00
Kjell Tore Guttormsen
34669d596c feat(config-audit): TOK consumes readActiveConfig (v5 F1)
Removes the v4 'void readActiveConfig' placeholder and wires the
active-config snapshot into the TOK scanner.

Per-turn behavior changes:
- Each enabled MCP server becomes its own hotspot entry (richer than
  the parent .mcp.json file alone)
- total_estimated_tokens now includes MCP server cost
- result.activeConfig exposes a small summary
  (claudeMdEstimatedTokens, mcpServerCount, pluginCount, skillCount)

Failures of readActiveConfig are non-fatal — the scanner falls back
to the discovery-only path used in v4.

Tests: +3 cases on the new tok-active-config fixture
(.mcp.json with 2 servers, CLAUDE.md, plugin skeleton).
2026-05-01 06:27:34 +02:00
Kjell Tore Guttormsen
ce7c42f517 fix(config-audit): MCP token callers use 'mcp' kind (v5 F2)
Two MCP enumeration paths in readActiveMcpServers now pass kind='mcp'
to estimateTokens with optional toolCount derived from def.tools array
(populated when callers cache MCP discovery — Step 14 wires that up).

Hook callers keep kind='item' (no schema overhead).

Visible effect: every active MCP server jumps from estimatedTokens=15
to >= 500 (or higher when toolCount is known). The whats-active output
and TOK hotspots now reflect actual MCP cost.

Tests: assert mcpServers[].estimatedTokens >= 500 in fixture.
2026-05-01 06:22:54 +02:00
Kjell Tore Guttormsen
48d560a209 feat(config-audit): add 'mcp' kind to estimateTokens (v5 F2)
Differentiate MCP servers from generic 'item' (flat 15) — they actually
cost 500+ tokens per turn for protocol metadata and tool schemas.

estimateTokens(bytes, 'mcp', {toolCount}) returns max of:
- 500 token floor (base overhead)
- ceil(bytes / 3.5) (json-rate when bytes known)
- 500 + toolCount * 200 (when tool count is detected; Step 14 wires this)

Caller-side migration in next commit (Step 5).

Tests: +4 cases for mcp kind.
2026-05-01 06:21:30 +02:00
Kjell Tore Guttormsen
8ca391fdb2 fix(llm-security): correct distribution URLs to marketplace path
The plugin lives in ktg-plugin-marketplace and is distributed via the
Claude Code marketplace mechanism. There is no standalone
open/claude-code-llm-security repo; references to it were aspirational
and never realized.

- package.json: homepage now deep-links to plugins/llm-security/ in the
  marketplace; repository.url uses the marketplace repo with directory
  field (npm convention for monorepo plugins); bugs.url routes to
  marketplace issue tracker.
- CLAUDE.md: "Public Repository" section replaced with "Distribution"
  section documenting the marketplace install path.
- CONTRIBUTING.md: issue tracker URL points at marketplace issues with
  [llm-security] prefix convention.
- CHANGELOG.md: v7.3.1 entry rewritten to reflect actual change
  (URLs corrected to marketplace, not "fixed from one wrong URL to
  another wrong URL").

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-01 06:20:54 +02:00
Kjell Tore Guttormsen
a65c7f4080 feat(config-audit): severity-weighted scoreByArea (v5 F3)
Replace count-based pass-rate with severity-weighted penalty:
- penalty = sum(count[s] * WEIGHTS[s])
- maxBudget = max(10, findingCount * 4)
- passRate = max(0, 100 - penalty / maxBudget * 100)

A few lows no longer crater an area's grade; a single high or critical
consumes a large fraction of budget. Mirrors the operator intuition that
severity, not count, is the signal.

BREAKING (intentional): scoring semantics differ from v4 for non-clean
configs. Add scoringVersion: 'v5' to the returned struct so consumers
can detect the version. baseline-all-a remains all-A (no critical/high
on that fixture).

Tests: +6 cases for severity weighting; existing "many findings" test
updated to use highs (where v5 still drops the grade as expected).
2026-05-01 06:20:08 +02:00
Kjell Tore Guttormsen
e5efc2ff64 feat(config-audit): export WEIGHTS from severity.mjs (v5 F3 prep)
Promote WEIGHTS const to named export with Object.freeze for downstream
use in scoring.mjs (severity-weighted scoreByArea, F3).

Tests: +2 cases asserting WEIGHTS shape.
2026-05-01 06:16:28 +02:00
Kjell Tore Guttormsen
62a9335772 chore(llm-security): v7.3.1 — stabilization patch for forkers and downstream users
No behavior changes. Sets the public stance, tightens documentation, and
removes coherence drift so anyone forking or downloading the plugin gets
a consistent starting point.

Added:
- CONTRIBUTING.md — public fork-and-own guide. Why PRs are not accepted,
  how to fork well, what is welcome via issues.
- README "Project scope" section — out-of-scope table naming what is
  fork-and-own territory (web dashboard, fleet policy, runtime firewall,
  IDE LSP, compliance pack, ticketing, multi-tenancy, ML detectors,
  marketplace UI, SSO/SCIM/RBAC) with commercial alternatives.
- package.json: bugs.url, CONTRIBUTING/SECURITY/CHANGELOG in files
  whitelist for npm publishing.

Changed:
- SECURITY.md rewritten. Supported-versions table from stale 5.1.x to
  current reality (7.3.x active, 7.0-7.2 best-effort, <7.0 EOL).
  Best-effort solo response timeline. Scope expanded to bin/.
- Scanner VERSION constants synced to plugin version. Was 6.0.0 in
  dashboard-aggregator and posture-scanner.
- package.json repository.url corrected from fromaitochitta/ to open/.
- README "Feedback & contributing" links to CONTRIBUTING.md.

Fixed:
- pre-compact-scan size-cap timing test ceiling raised 500ms -> 1000ms.
  Was a flake on Intel Mac and CI under load. Design target unchanged
  (<500ms, documented in CLAUDE.md).

Notes:
- First patch on the stabilization line (post-2026-05-01).
- Wave E attack-simulator scenarios deferred indefinitely; coverage
  remains at 72.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-01 06:14:03 +02:00
Kjell Tore Guttormsen
4bd7cd5056 docs(config-audit): v5.0.0 brief + implementation plan
Planning artifacts for v5.0.0 (token-economy round):

- v5-brief.md: scope brief with 22 items (F1-F7 + M1-M8 + N1-N7), revised
  with Avklaringer-section after critical review (N7 dropped, M3+N6 merged,
  N5 promoted to v5.0.0, SC-6/SC-10 reformulated)
- v5-plan.md: 31-step implementation plan in 5 sessions
  (alpha.1 → alpha.2 → beta.1 → rc.1 → release). B+ score (84/100) after
  plan-critic + scope-guardian review addressed all blockers/majors/gaps.
- v5-implementation-log.md: per-session status record (skeleton)

Sessions track via state files (REMEMBER.md, TODO.md gitignored;
implementation-log.md committed; NEXT-SESSION-PROMPT.local.md gitignored).

No code changes in this commit — planning only.
2026-05-01 06:10:44 +02:00
Kjell Tore Guttormsen
1f0b03b1e5 docs(graceful-handoff): 2.0 — sync README, CLAUDE.md, root README 2026-05-01 06:07:02 +02:00
Kjell Tore Guttormsen
cc38155fa6 feat(ultraplan-local): Spor 2 — HANDOVER-CONTRACTS.md + PreCompact-hook (P0 progress.json drift fix)
Reconciles divergence after parallel-session race: includes both Spor 1 wiring (validators inn i 4 commands + 1 agent) og Spor 2 (HANDOVER-CONTRACTS.md + PreCompact-hook).

Spor 1 wiring (re-applied etter rebase):
- /ultrabrief-local Phase 4g — brief-validator post-write
- /ultraplan-local Phase 1 — brief-validator --soft + research-validator --dir + architecture-discovery
- planning-orchestrator Phase 5.5 — plan-validator --strict erstatter 3 grep -cE-kall
- /ultraexecute-local Phase 2.3 (--validate) — plan-validator + progress-validator
- YAML-parser-utvidelse: list-of-dicts (must_contain), støtter v1.7 template-format

Spor 2 NEW:
- docs/HANDOVER-CONTRACTS.md (~310 linjer) — single source of truth for de 5 pipeline-handover-formatene m/ faste sub-headinger (Producer / Consumer / Path / Frontmatter schema / Body invariants / Validation strategy / Versioning / Failure modes)
- hooks/scripts/pre-compact-flush.mjs (NY) — fikser dokumentert P0 i docs/ultraexecute-v2-observations-from-config-audit-v4.md:
  * Fyrer på PreCompact-event (CC v2.1.105+)
  * Lokaliserer progress.json under .claude/projects/*/
  * Sammenligner stored current_step mot git log {session_start_sha}..HEAD
  * Atomisk write (tmp + rename), monoton — current_step kan aldri reduseres
  * Aldri blokkerer compaction (exit 0)
- hooks/hooks.json registrerer PreCompact-hooken

Resultat: /ultraexecute-local --resume virker nå etter context compaction selv ved skill-driven execution.

Docs:
- README.md (plugin): "Quality infrastructure", "Handover contracts", "PreCompact resume integrity"
- CLAUDE.md (plugin): peker til HANDOVER-CONTRACTS.md + dokumenterer pre-compact-flush
- README.md (marketplace root): bullet-liste over Spor 2-deliverables (resolved merge-konflikt fra parallell-sesjon)

Tester: 109 grønn (ingen regresjon).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-01 06:07:01 +02:00
Kjell Tore Guttormsen
0707d03bea chore(graceful-handoff): 2.0.0 — bump version, remove auto_discover, update CHANGELOG [skip-docs]
Step 8 of v2.0 plan.
2026-05-01 06:06:25 +02:00
Kjell Tore Guttormsen
a67411ae26 feat(graceful-handoff): 2.0 — register hooks and statusLine in hooks.json [skip-docs]
Step 7 of v2.0 plan. Registers SessionStart, Stop, and statusLine hooks.
Note: statusLine top-level placement in hooks.json is an open assumption
(brief Assumption 1) — verified to be valid JSON syntax; live smoke-test
required to confirm Claude Code loads it from this location vs requiring
settings.json placement.
2026-05-01 06:06:25 +02:00
Kjell Tore Guttormsen
4076bf904a feat(graceful-handoff): 2.0 — SessionStart auto-load handoff on resume/compact [skip-docs]
Step 6 of v2.0 plan. SessionStart hook fires on source: resume or
source: compact, walks up to 3 levels searching for
NEXT-SESSION-*.local.md, injects content via additionalContext, and
archives the file (rename to *.archived.local.md) to prevent stale-load
in later sessions. 9 tests cover sources, multi-level search,
topic-slug variants, archive filtering, malformed payload.
2026-05-01 06:06:25 +02:00
Kjell Tore Guttormsen
81aba9a5f5 feat(graceful-handoff): 2.0 — Stop hook auto-execute + pipeline staging fix [skip-docs]
Step 5 of v2.0 plan + critical pipeline fix.

Stop hook (hooks/scripts/stop-context-monitor.mjs):
- Estimates context usage from transcript size (chars/3.5 / window_size)
- At ≥70%, spawns handoff-pipeline.mjs --auto --no-push synchronously
- Reads context_window_size from payload (supports 1M windows)
- Lock file at <transcript_dir>/.handoff-lock-<session_id>
- Gracefully handles missing CLAUDE_PLUGIN_ROOT, missing transcript

Pipeline fix (scripts/handoff-pipeline.mjs):
- REMOVED `git add -A` (CLAUDE.md anti-pattern: scoops up unrelated WIP)
- Now stages ONLY artifact + REMEMBER.md/TODO.md if present
- New regression test 'pipeline never stages unrelated dirty files'

Tests: 7 stop-hook tests use stub pipeline (no real git operations);
11 pipeline tests including new regression for explicit staging.
2026-05-01 06:06:25 +02:00
Kjell Tore Guttormsen
1efb1b3176 feat(graceful-handoff): 2.0 — statusLine context-percent hint [skip-docs]
Step 4 of v2.0 plan. statusLine hook reads context_window.used_percentage
from stdin payload and prints display-only hint at 60% / 70%. NEVER runs
git (research/03 — statusLine scripts can be cancelled mid-flight, unsafe
for side effects). 9 tests cover thresholds, null payload, malformed JSON.
Includes hook-helper.mjs copied from llm-security as test infrastructure.
2026-05-01 06:06:25 +02:00
Kjell Tore Guttormsen
8d4e16bf8e feat(graceful-handoff): 2.0 — JSON pipeline script with idempotency and confirm-on-commit [skip-docs]
Step 2 of v2.0 plan. Deterministic Node script that classifies handoff
type, renders artifact, and orchestrates commit/push with explicit
confirmation. Handles detached HEAD, no-upstream, and idempotency
(60s cooldown on clean tree). 10 tests cover dry-run, --auto path,
interactive y/n, idempotency, robustness edge cases.
2026-05-01 06:06:25 +02:00
Kjell Tore Guttormsen
1a65d8e4d5 feat(graceful-handoff): 2.0 — migrate to skills/ with disable-model-invocation [skip-docs]
Step 1 of v2.0 plan. Hard cut from commands/ to skills/ per Anthropic
recommendation for new plugins. Frontmatter sets disable-model-invocation:
true and pins model: claude-sonnet-4-6. Docs (README, CLAUDE.md, root
README) deferred to Step 9 per plan.
2026-05-01 05:45:26 +02:00
Kjell Tore Guttormsen
65c9242160 feat(ultraplan-local): Spor 1 wave 2 — 5 validators + doc-consistency, 108 tests grønn [skip-docs]
5 nye validator-moduler (alle m/ CLI-shim for invokering fra commands):
- brief-validator.mjs — frontmatter (type, brief_version, task, slug, research_topics, research_status), state machine (research_topics > 0 + skipped requires brief_quality: partial), body sections (Intent/Goal/Success Criteria)
- research-validator.mjs — type=ultraresearch-brief, confidence ∈ [0,1], dimensions ≥ 1, body sections, --dir mode for batch validering
- plan-validator.mjs — wrapper over plan-schema + manifest-yaml; håndhever step-count == manifest-count, plan_version=1.7
- progress-validator.mjs — schema_version, status enum, current_step in range, step shape, checkResumeReadiness
- architecture-discovery.mjs — EKSTERN KONTRAKT: drift-WARN ikke drift-FAIL; tolererer non-canonical filnavn, surfacer loose files som warnings

Doc-consistency-test pinning prose vs source-of-truth:
- agents/*.md count == CLAUDE.md agent-tabell rader
- commands/*.md mentioned i CLAUDE.md
- command frontmatter.name == filnavn
- templates/plan-template.md plan_version 1.7 invariant
- settings.json kun kjente scopes (ultraplan, ultraresearch)
- settings.json ingen exploration eller agentTeam (vestigial guard etter Spor 0)
- CLAUDE.md refererer alle 4 pipeline-commands

Wave 1 + Wave 2 = 108 tester grønn.

[skip-docs]: Test-infrastrukturen er ikke user-facing før Spor 1 wiring lander; README/CLAUDE.md oppdateres når commands faktisk endrer atferd (neste commit).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-01 05:39:47 +02:00
Kjell Tore Guttormsen
7219a5fe20 docs(readme): total overhaul for v7.3.0
Rewrites README.md from 919 → 484 lines (47% reduction). Modernized
structure, all counts updated to v7.3.0 reality (commands 19→20,
scanners 22→23, knowledge 19→22, tests 1665→1777), trimmed Version
History to last 3 versions with link to CHANGELOG.md.

Structural changes:
- Removed dated "Prompt Injection Showcase (v5.0)" section
- Removed verbose Directory Structure tree (file paths discoverable
  from CLAUDE.md and the file system itself)
- Collapsed Knowledge Base 18-row table into 5-category summary
- Merged "Architecture" mermaid + "What's inside" into single layered
  overview
- Tightened Compliance & Governance, OWASP Coverage, Workflow Examples
  to essentials only
- Added explicit v7.3.0 sections inline:
  - npm scope-hop typosquat in supply-chain hook (E13)
  - workflow-scanner W F L row in Scanners (E11)
  - .gitattributes post-clone advisory in remote scanning table (E12)
  - MCP cumulative-drift baseline + reset in Output verification + own subsection (E14)
  - rot13 + T7-T9 bash-normalize in Prompt injection + Destructive commands hooks (E3/E8/E9/E10)
  - env-var deprecation runway in Compliance & Governance (8.7)
  - Hook count corrected to 9 throughout (8.10)
- New badges: commands-20, scanners-23, knowledge-22, tests-1777

Content preserved (load-bearing):
- AI-generated disclosure
- "no PRs accepted" framing
- Sandbox defense-in-depth tables
- OWASP coverage matrix
- Defense philosophy section
- Self-scan + malicious-skill-demo references
- Recommended-combo with parry-guard

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-01 05:37:42 +02:00
Kjell Tore Guttormsen
205cdbf77f feat(ultraplan-local): Spor 1 wave 1 — lib/parsers + 66 tests grønn
7 nye moduler:
- lib/util/result.mjs — Result-shape m/ ok/fail/combine helpers
- lib/util/frontmatter.mjs — håndruller YAML-frontmatter-parser (subset, zero deps)
- lib/parsers/plan-schema.mjs — v1.7 step-regex + forbidden-heading-deteksjon (Fase/Phase/Stage/Steg)
- lib/parsers/manifest-yaml.mjs — per-step Manifest YAML-ekstraksjon m/ regex-validering
- lib/parsers/project-discovery.mjs — finn brief/research/architecture/plan/progress i prosjektmappe
- lib/parsers/arg-parser.mjs — $ARGUMENTS for alle 4 commands m/ flag-schema
- lib/parsers/bash-normalize.mjs — løftet fra hooks/scripts/pre-bash-executor.mjs

6 test-filer (66 tester totalt) — alle grønn:
- frontmatter (CRLF/BOM, scalars, lister, indent-rejection)
- plan-schema (positive Step-form, negative Fase/Phase/Stage/Steg, numbering, slicing)
- manifest-yaml (extraction, parsing, regex-validering, missing-key detection)
- project-discovery (sortert research, architecture-detection, phase-requirements)
- arg-parser (boolean/valued/multi-value flags, kvotert positional, ukjente flag)
- bash-normalize (\${x}/\\\\evasion, ANSI-stripping, full canonicalize-pipeline)

Forbereder Wave 2 (validators) og Spor 1-wiring inn i commands.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-01 05:35:28 +02:00
Kjell Tore Guttormsen
c4183b8b4d chore(release): bump to v7.3.0
Batch C release. Closes 12 implementation tasks (E3, E8-E14, 8.4, 8.6,
8.7, 8.10) across four execution waves: A (bash + decoder), B (supply
chain + workflow scanner), C (MCP cumulative drift), D (code quality).

Wave E (9 new attack-simulator scenarios for the new defenses) deferred
to v7.3.1 — defenses are unit-tested per wave; the deferred work adds
attack-simulator regression coverage on top, not the primary safety net.

Tests: 1665+ → 1777 (Wave A-D cumulative, +112).

Version sync targets touched:
- package.json
- .claude-plugin/plugin.json
- CLAUDE.md (header)
- README.md (badge + new release-history row)
- scanners/ide-extension-scanner.mjs (VERSION constant)
- ../../README.md (marketplace root plugin entry)
- CHANGELOG.md (new [7.3.0] section per Keep a Changelog, all 12 task
  IDs covered individually under Added/Changed/Documentation/Tests/Notes)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-01 05:28:45 +02:00
Kjell Tore Guttormsen
1016914fc1 chore(ultraplan-local): Spor 0 — foundation for v3.1.0 kvalitetsprogram
- package.json med node:test runner og scripts (test, simulate), zero deps
- settings.json: fjern vestigial exploration- og agentTeam-blokker (verifisert leset av ingen kode via grep)
- docs/: commit subagent-delegation-audit.md og ultraexecute-v2-observations-from-config-audit-v4.md (begge real arkitektur-notater)
- docs/: arkiver ultra-suite-brief_2.md som _archive- (var paste fra annet plugin-arbeid, irrelevant her)
- tests/helpers/hook-helper.mjs kopiert fra llm-security m/ provenance-kommentar

Forberedelse for Spor 1 (lib/-moduler), Spor 2 (HANDOVER-CONTRACTS + PreCompact-hook), Spor 3 (bug-fixes + CC-features).

Plan: ~/.claude/plans/det-neste-vi-gj-r-eventual-adleman.md

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-01 05:27:44 +02:00
Kjell Tore Guttormsen
ab504bdf8c refactor(marketplace): split cc-architect from ultraplan-local into its own plugin
Extract `/ultra-cc-architect-local` and `/ultra-skill-author-local` plus all 7
supporting agents, the `cc-architect-catalog` skill (13 files), the
`ngram-overlap.mjs` IP-hygiene script, and the skill-factory test fixtures
from `ultraplan-local` v2.4.0 into a new `ultra-cc-architect` plugin v0.1.0.

Why: ultraplan-local had drifted into containing two distinct domains — a
universal planning pipeline (brief → research → plan → execute) and a
Claude-Code-specific architecture phase. Keeping them together forced users
to inherit an unfinished CC-feature catalog (~11 seeds) when they only
wanted the planning pipeline, and locked the catalog and the pipeline into
the same release cadence. The architect was already optional and decoupled
at the code level — only one filesystem touchpoint remained
(auto-discovery of `architecture/overview.md`), which already handles
absence gracefully.

Plugin manifests:
- ultraplan-local: 2.4.0 → 3.0.0 (description + keywords updated)
- ultra-cc-architect: new at 0.1.0 (pre-release; catalog is thin, Fase 2/3
  of skill-factory unbuilt, decision-layer empty, fallback list still
  needed)

What stays in ultraplan-local: brief/research/plan/execute commands, all
19 planning agents, security hooks, plan auto-discovery of
`architecture/overview.md` (filesystem-level contract, not code-level).

What moved (28 files via git mv, R100 — full history preserved):
- 2 commands, 8 agents, 1 skill catalog (13 files), 2 scripts, 8 fixtures

Documentation updates: plugin CLAUDE.md and README.md for both plugins,
root README.md (added ultra-cc-architect section, updated ultraplan-local
section), root CLAUDE.md (added ultra-cc-architect to repo-struktur),
marketplace.json (registered ultra-cc-architect), ultraplan-local
CHANGELOG.md (v3.0.0 entry with migration guidance).

Test verification: ngram-overlap.test.mjs passes 23/23 from new location.

Memory updated: feedback_no_architect_until_v3.md now points at the new
plugin and reframes the threshold around catalog maturity rather than an
ultraplan-local milestone.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-30 17:18:47 +02:00
Kjell Tore Guttormsen
97c5c9d934 docs(claude-md): 8.10 — fix hooks count + add doc-consistency test for hook-table sync 2026-04-30 17:12:49 +02:00
Kjell Tore Guttormsen
ba5f2b64ad feat(policy-loader): 8.7 — env-var deprecation warnings (v8.0.0 removal) 2026-04-30 17:11:07 +02:00
Kjell Tore Guttormsen
e8ea75fe6b docs(hardening-guide): 8.6 — sandbox-architecture rationale (no code consolidation) 2026-04-30 16:55:45 +02:00
Kjell Tore Guttormsen
2b7329151c docs(severity): 8.4 — @deprecated annotation on riskScoreV1 2026-04-30 16:54:37 +02:00
Kjell Tore Guttormsen
001df2ebe8 feat(commands): E14 part 3 — /security mcp-baseline-reset slash command
Wave C step C3: closes E14 with the user-facing reset command.

After a legitimate MCP server upgrade the sticky baseline (added in C1)
becomes a stale "what the tool used to say" anchor and every subsequent
post-mcp-verify advisory will re-flag the change. /security mcp-baseline-reset
lets the user acknowledge the upgrade so the next call seeds a fresh
baseline.

New files:
- scanners/mcp-baseline-reset.mjs — small CLI wrapper around clearBaseline /
  listBaselines. Modes: --list (read-only), --target <name>, no-args (all).
  Outputs JSON summary on stdout. Exit 0 always (idempotent).
- commands/mcp-baseline-reset.md — dispatcher following mcp-inspect.md
  shape. Frontmatter: name=security:mcp-baseline-reset, sonnet model,
  Read/Bash/AskUserQuestion tools. 4-step body (list -> confirm scope
  -> execute -> confirm result).
- tests/scanners/mcp-baseline-reset.test.mjs — 10 CLI tests across
  --list, --target, clear-all, idempotency, history preservation, and
  bare-positional sugar.

Updated:
- commands/security.md — new row in commands table after mcp-inspect.
- CLAUDE.md — new commands-table row + new v7.3.0 narrative section
  describing the baseline schema, cumulative-drift detection, reset
  semantics, and the LLM_SECURITY_MCP_CACHE_FILE override.
- Plugin README.md — new MCP-baseline-reset row in commands table,
  scanner count 12 standalone -> 13 standalone, new "MCP Description
  Drift (E14, v7.3.0)" subsection explaining the sticky baseline,
  cumulative threshold, reset semantics, and env-var override.
- Root marketplace README.md — scanner count 22 -> 23 (10 orchestrated +
  13 standalone), command count 19 -> 20, test count 1511 -> 1768.

Wave C complete: 1738 -> 1768 tests (+30 across C1/C2/C3). Per plan,
Wave C does NOT bump the plugin version — that lands at the wave-bundle
release. The advisory text in post-mcp-verify already references the
new command path so the user has a ready remediation step.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-30 16:49:01 +02:00
Kjell Tore Guttormsen
427b68eca9 feat(post-mcp-verify): E14 part 2 — cumulative-drift MEDIUM advisory [skip-docs]
Wave C step C2: surface the cumulative-drift signal from
checkDescriptionDrift() (added in C1) as a separate MEDIUM advisory
with finding category mcp-cumulative-drift. Independent of the existing
per-update drift advisory — a slow-burn rug-pull that keeps each update
below the 10% per-update threshold but cumulatively drifts >=25% from
the sticky baseline now triggers the new advisory without ever crossing
the per-update bar.

The advisory references /security mcp-baseline-reset (added in C3) so
the user knows how to acknowledge a legitimate MCP server upgrade.

CLAUDE.md updates:
- post-mcp-verify hooks-table row mentions per-update + cumulative drift
- mcp-description-cache lib bullet documents baseline schema, history,
  cumulative threshold policy key, and LLM_SECURITY_MCP_CACHE_FILE
  override.

Tests: 2 new hook tests using LLM_SECURITY_MCP_CACHE_FILE for cache
isolation. Existing 68 still pass; total 70.

Plugin README and root marketplace README updates land in C3 alongside
the new /security mcp-baseline-reset slash command (combined Wave-C
doc update per plan §"Wave C — Touch" list).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-30 16:40:52 +02:00