ktg-plugin-marketplace

Author	SHA1	Message	Date
Kjell Tore Guttormsen	b2407a09b3	feat(config-audit): CA-TOK-005 MCP tool-schema budget (v5 N1) [skip-docs] Adds detectMcpToolBudget detection block in TOK scanner. Tiered severity per project-local .mcp.json server based on toolCount: - < 20: no finding - 20-49: low - 50-99: medium - 100+: high - null (manifest unparseable): low + "tool count unknown" message Scoped to source==='.mcp.json' to keep findings actionable for the audited path; plugin/user-level MCP servers are surfaced by the manifest scanner (Step 19 / N2). 5 fixtures (mcp-budget/{14,25,60,120,unknown}-tools) use inline `tools` arrays in .mcp.json — no node_modules needed for these tests. Tests assert title+severity (not exact ID) since TOK IDs are sequential per scan, not semantic per pattern. [skip-docs] reason: v5 plan fences off README/CLAUDE.md badge updates to Session 5; Forgejo pre-commit-docs-gate hook requires this tag on feat commits without doc changes. Tests: 586 → 593 (+7). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-01 07:29:57 +02:00
Kjell Tore Guttormsen	dd0d4bf738	docs(config-audit): v5 implementation log — Session 2 alpha.2 result	2026-05-01 07:14:18 +02:00
Kjell Tore Guttormsen	55cedbea2c	docs(config-audit): CHANGELOG 5.0.0-alpha.2 entry	2026-05-01 07:10:52 +02:00
Kjell Tore Guttormsen	3c79f95e9a	feat(config-audit): self-audit --check-readme flag (v5 F6) [skip-docs] Filesystem counts are the source of truth; README badges parsed via line-anchored substring (badge/<kind>-<N>-...). Emits readmeCheck object with counts/badges/mismatches. CLI: node scanners/self-audit.mjs --check-readme [--json] API: runSelfAudit({ checkReadme: true }) → result.readmeCheck Helper: checkReadmeBadges(pluginDir) for per-fixture testing New fixture: readme-desynced/ (commands/foo + bar, README claims 1). Note: alpha phase does NOT require result.readmeCheck.passed === true. Self-test of real plugin currently fails (scanners 10 vs 9, tests 31 vs 543); will be reconciled in Session 5 Step 28 (README sync). 582 → 586 tests, all green.	2026-05-01 07:09:26 +02:00
Kjell Tore Guttormsen	910567d661	feat(config-audit): HKV flags verbose hook output (v5 M5) [skip-docs] Static heuristic — counts console.log / process.stdout.write lines per referenced hook script. > 50 → low CA-HKV-NNN finding. New fixtures: - hooks-verbose/ (61 verbose lines → triggers) - hooks-quiet/ (5 lines → no finding) 580 → 582 tests, all green.	2026-05-01 07:05:45 +02:00
Kjell Tore Guttormsen	7181862644	chore(config-audit): allow fake node_modules in tests/fixtures (v5 M1) [skip-docs] The mcp-tool-heavy fixture relies on node_modules/mcp-heavy/package.json being committed so the v5 M1 tool-count detection test runs deterministically. Add an unignore rule for tests/fixtures/**/node_modules/.	2026-05-01 07:02:54 +02:00
Kjell Tore Guttormsen	1422daf895	feat(config-audit): MCP tool-count detection with manifest fallback (v5 M1) [skip-docs] readActiveMcpServers now resolves tool count via: 1. In-config tools array 2. Cached tools/list at \$HOME/.claude/config-audit/mcp-cache/<name>.json 3. node_modules/<pkg>/package.json (resolved from npx <pkg>) 4. Fallback: { toolCount: null, toolCountUnknown: true } estimateTokens uses detected toolCount (heavy server > light server). New fixture: mcp-tool-heavy/ with mocked node_modules/mcp-heavy/package.json (20 tools). 576 → 580 tests, all green.	2026-05-01 07:02:08 +02:00
Kjell Tore Guttormsen	9a44df22ac	feat(config-audit): TOK flags skill description > 500 chars (v5 M2) [skip-docs] - New Pattern F in TOK: low-severity finding when SKILL.md description > 500 chars - Scoped to discovery.files (project-local) — activeConfig.skills walk would pull in user/plugin skills out of project scope - New fixtures: skill-bloated (594-char desc) + skill-tight (46-char baseline) 574 → 576 tests, all green.	2026-05-01 06:58:42 +02:00
Kjell Tore Guttormsen	25ca6139b4	feat(config-audit): TOK flags CLAUDE.md cascade > 10k tokens (v5 M4) [skip-docs] - New Pattern E in TOK: emits medium finding when activeConfig.claudeMd.estimatedTokens > 10_000 - Uses cascade tokens, file count, and calibration note as evidence - New fixtures: large-cascade (37k bytes / 14475 cascade tokens) + small-cascade (5k baseline) 572 → 574 tests, all green.	2026-05-01 06:53:12 +02:00
Kjell Tore Guttormsen	9330124f5c	feat(config-audit): flag additionalDirectories > 2 (v5 M6) [skip-docs] - Add 'additionalDirectories' to KNOWN_KEYS - Emit low severity finding when length > 2 - New fixtures: additional-dirs-many (3 entries) + additional-dirs-ok (2) 569 → 572 tests, all green.	2026-05-01 06:50:24 +02:00
Kjell Tore Guttormsen	58d6b5b9ea	feat(config-audit): recalibrate TOK severities for tokens/turn (v5 F7) [skip-docs] - Pattern A (cache-breaking volatile top): medium → high - Pattern B (redundant permissions): low → medium - Pattern C (deep @import chain): medium → low - Add calibration_note evidence on every TOK finding - Table-driven severity tests (identify by title, IDs are sequential) 563 → 569 tests, all green. Doc sweep deferred to Session 5 (Step 28).	2026-05-01 06:47:32 +02:00
Kjell Tore Guttormsen	5df8e8888e	docs(ultraplan-local): trim README — outcomes section + remove duplication Add "What you get" with Solo / Team / Virksomhet profiles + honest "What it doesn't solve" list. Lets adoption decisions land before command details. Cuts (deduplication and historical noise): - Architecture file-tree (~50 lines) → terse top-level layout + pointer to CONTRIBUTING.md. Original was stale (missing lib/, wrong hook-count, wrong plugin.json version note) - "How it compares" matrix (~17 lines) — sales-coded comparison vs cloud tools, doesn't help adoption decisions - Per-command "How it works" 8-step prose (~50 lines across 4 commands) → 2-3 sentence summaries - Exploration / Review / Research agent tables (~40 lines) → one-liner pointers to agents/ directory (already self-documenting) - v1.x and v2.x migration sections (~30 lines) → pointer to CHANGELOG.md / MIGRATION.md - v3.0.0, v2.4.0 historical callouts (~9 lines) — CHANGELOG owns these - Cost profile bullet-list (~13 lines) → one paragraph Net: 770 → 609 lines (-21%). Tests green. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-01 06:44:44 +02:00
Kjell Tore Guttormsen	3aba15c566	docs(config-audit): v5 implementation log — Session 1 alpha.1 result Per-step result table for Steps 1-9 + 8b with commit SHAs and notable deviations (Step 6 baseline switch to sonnet-era, Step 8 surprise on sonnet-era discovery scope, PathGuard hook false positive on test fixtures). 543 → 563 tests, all green, no blockers carried forward.	2026-05-01 06:37:08 +02:00
Kjell Tore Guttormsen	919bd213f8	docs(config-audit): CHANGELOG 5.0.0-alpha.1 entry Summarizes F1-F5 scope: TOK ↔ readActiveConfig integration, 'mcp' kind in estimateTokens (15 → ≥500), severity-weighted scoreByArea, dead-code removal in TOK hotspots, Pattern D / CA-TOK-004 removal. Includes migration notes for downstream consumers (CA-TOK-* globs still suppress 001-003; scoringVersion field added for v4→v5 detection).	2026-05-01 06:34:06 +02:00
Kjell Tore Guttormsen	08a9ead51a	docs(config-audit): remove CA-TOK-004 references after F5 (v5) knowledge/opus-4.7-patterns.md: - Pattern 4 row removed from the catalogue table - "Pattern 4 (sonnet-era)" detection note removed - Threshold-calibration note no longer mentions pattern 4 - Added a short pointer explaining the v5 F5 removal commands/tokens.md: - "CA-TOK-001..004" → "CA-TOK-001..003" in two places	2026-05-01 06:33:01 +02:00
Kjell Tore Guttormsen	2810ee6f62	feat(config-audit): remove TOK Pattern D detectSonnetEra (v5 F5) Pattern D was the v4 sonnet-era signature: 'config is structurally clean but uses no Opus-4.7-specific features'. Two problems: - It triggered on any minimal config that happened to lack skills/MCP - The advice was generic and not actionable The hotspots ranking and per-pattern findings (A/B/C) cover the same ground with concrete, file-anchored signal. Dropping the noise. BREAKING (intentional): scanners no longer emit the sonnet-era info finding. Suppression entries and downstream tooling that reference the v4 finding ID should be updated. Doc sweep follows in Step 8b. Tests: sonnet-era fixture now asserts zero findings.	2026-05-01 06:31:43 +02:00
Kjell Tore Guttormsen	1486368a2b	chore(release): ultraplan-local v3.1.0 Quality program release. Spor 0+1+2+3 all delivered. - 109 zero-dep tests gate fork-readiness - 5 validators wired into 4 commands as CLI shims - HANDOVER-CONTRACTS.md: single source of truth for 5 pipeline handovers - PreCompact-hook (P0) closes progress.json drift; --resume now works - Semantic plan-critic catches paraphrased deferred decisions - examples/01-add-verbose-flag/: hand-calibrated end-to-end pipeline demo - 4 hooks total (pre-bash, pre-write, session-title, post-bash-stats, pre-compact-flush) - SECURITY.md + Extending-the-plugin docs CC v2.1.x feature adoption: F8 (MCP_CONNECTION_NONBLOCKING), F9 (sessionTitle), F3 (duration_ms), F12 (disableSkillShellExecution). F2 (hook 'if'-field) deferred — universal protection wins. Pre-flight verification: - npm test → 109 pass - plan-validator --strict templates/plan-template.md → READY - plan-validator --strict tests/fixtures/plan-fase-narrative.md → FAIL (expected) - grep smallCodebase\|mediumCodebase\|largeCodebase → 0 hits Version bumped: package.json, plugin.json, README badge, root README, root CLAUDE.md. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-01 06:31:42 +02:00
Kjell Tore Guttormsen	0d8a9af3d6	fix(config-audit): remove TOK dead take + hotspot padding (v5 F4) The buildHotspots padding loop and unused 'take' variable were dead code from the v3 hotspots-min contract. Replaced with a clean ranked.slice(0, HOTSPOTS_MAX). Tiny fixtures may now return fewer than 3 hotspots, which is the honest answer; the contract now only asserts <= 10. Tests: +2 cases — every hotspot.source is unique (no padding); length never exceeds HOTSPOTS_MAX.	2026-05-01 06:29:33 +02:00
Kjell Tore Guttormsen	9ecd225018	feat(ultraplan-local): Spor 3 — semantic plan-critic, examples, CC features, security docs - agents/plan-critic.md: rule #7 split into literal blockers (TBD/TODO/FIXME) + semantic rubric with 8 deferred-decision tests; calibrated against the 5-phrase corpus from the v3.1.0 quality brief - hooks/hooks.json: rebuilt from corrupted state; valid JSON, registers PreToolUse(Bash,Write), UserPromptSubmit, PostToolUse(Bash), PreCompact - hooks/scripts/session-title.mjs: NEW — sets ultra:<cmd>:<slug> session title for ultra commands (CC v2.1.94+) - hooks/scripts/post-bash-stats.mjs: NEW — appends duration_ms per Bash call to ultraexecute-stats.jsonl (CC v2.1.97+) - SECURITY.md: NEW — Forgejo private-issue reporting, supported = current minor only, scope = 4 hooks + denylist, hardening recommendations - docs/architect-bridge-test.md: NEW — manual smoke checklist for the ultraplan ↔ ultra-cc-architect bridge - examples/01-add-verbose-flag/: NEW — calibrated end-to-end (brief + research + plan + progress.json) for fork-er onramp; all four artifacts pass their validators - README.md: + Extending the plugin, + Headless multi-session tuning (MCP_CONNECTION_NONBLOCKING), + Session titles, + Per-step timing, + disableSkillShellExecution recommendation - CLAUDE.md: documents session-title.mjs and post-bash-stats.mjs - root README.md: v3.1.0 entry expanded with Spor 2+3 deliverables CC features adopted: F8, F9, F12 implemented; F3 implemented as Bash PostToolUse logger; F2 (hook 'if'-field scoping) deferred — universal protection beats reduced-scope protection for blocked commands. Tests: 109/109 green. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-01 06:28:44 +02:00
Kjell Tore Guttormsen	34669d596c	feat(config-audit): TOK consumes readActiveConfig (v5 F1) Removes the v4 'void readActiveConfig' placeholder and wires the active-config snapshot into the TOK scanner. Per-turn behavior changes: - Each enabled MCP server becomes its own hotspot entry (richer than the parent .mcp.json file alone) - total_estimated_tokens now includes MCP server cost - result.activeConfig exposes a small summary (claudeMdEstimatedTokens, mcpServerCount, pluginCount, skillCount) Failures of readActiveConfig are non-fatal — the scanner falls back to the discovery-only path used in v4. Tests: +3 cases on the new tok-active-config fixture (.mcp.json with 2 servers, CLAUDE.md, plugin skeleton).	2026-05-01 06:27:34 +02:00
Kjell Tore Guttormsen	ce7c42f517	fix(config-audit): MCP token callers use 'mcp' kind (v5 F2) Two MCP enumeration paths in readActiveMcpServers now pass kind='mcp' to estimateTokens with optional toolCount derived from def.tools array (populated when callers cache MCP discovery — Step 14 wires that up). Hook callers keep kind='item' (no schema overhead). Visible effect: every active MCP server jumps from estimatedTokens=15 to >= 500 (or higher when toolCount is known). The whats-active output and TOK hotspots now reflect actual MCP cost. Tests: assert mcpServers[].estimatedTokens >= 500 in fixture.	2026-05-01 06:22:54 +02:00
Kjell Tore Guttormsen	48d560a209	feat(config-audit): add 'mcp' kind to estimateTokens (v5 F2) Differentiate MCP servers from generic 'item' (flat 15) — they actually cost 500+ tokens per turn for protocol metadata and tool schemas. estimateTokens(bytes, 'mcp', {toolCount}) returns max of: - 500 token floor (base overhead) - ceil(bytes / 3.5) (json-rate when bytes known) - 500 + toolCount * 200 (when tool count is detected; Step 14 wires this) Caller-side migration in next commit (Step 5). Tests: +4 cases for mcp kind.	2026-05-01 06:21:30 +02:00
Kjell Tore Guttormsen	8ca391fdb2	fix(llm-security): correct distribution URLs to marketplace path The plugin lives in ktg-plugin-marketplace and is distributed via the Claude Code marketplace mechanism. There is no standalone open/claude-code-llm-security repo; references to it were aspirational and never realized. - package.json: homepage now deep-links to plugins/llm-security/ in the marketplace; repository.url uses the marketplace repo with directory field (npm convention for monorepo plugins); bugs.url routes to marketplace issue tracker. - CLAUDE.md: "Public Repository" section replaced with "Distribution" section documenting the marketplace install path. - CONTRIBUTING.md: issue tracker URL points at marketplace issues with [llm-security] prefix convention. - CHANGELOG.md: v7.3.1 entry rewritten to reflect actual change (URLs corrected to marketplace, not "fixed from one wrong URL to another wrong URL"). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-01 06:20:54 +02:00
Kjell Tore Guttormsen	a65c7f4080	feat(config-audit): severity-weighted scoreByArea (v5 F3) Replace count-based pass-rate with severity-weighted penalty: - penalty = sum(count[s] * WEIGHTS[s]) - maxBudget = max(10, findingCount * 4) - passRate = max(0, 100 - penalty / maxBudget * 100) A few lows no longer crater an area's grade; a single high or critical consumes a large fraction of budget. Mirrors the operator intuition that severity, not count, is the signal. BREAKING (intentional): scoring semantics differ from v4 for non-clean configs. Add scoringVersion: 'v5' to the returned struct so consumers can detect the version. baseline-all-a remains all-A (no critical/high on that fixture). Tests: +6 cases for severity weighting; existing "many findings" test updated to use highs (where v5 still drops the grade as expected).	2026-05-01 06:20:08 +02:00
Kjell Tore Guttormsen	e5efc2ff64	feat(config-audit): export WEIGHTS from severity.mjs (v5 F3 prep) Promote WEIGHTS const to named export with Object.freeze for downstream use in scoring.mjs (severity-weighted scoreByArea, F3). Tests: +2 cases asserting WEIGHTS shape.	2026-05-01 06:16:28 +02:00
Kjell Tore Guttormsen	62a9335772	chore(llm-security): v7.3.1 — stabilization patch for forkers and downstream users No behavior changes. Sets the public stance, tightens documentation, and removes coherence drift so anyone forking or downloading the plugin gets a consistent starting point. Added: - CONTRIBUTING.md — public fork-and-own guide. Why PRs are not accepted, how to fork well, what is welcome via issues. - README "Project scope" section — out-of-scope table naming what is fork-and-own territory (web dashboard, fleet policy, runtime firewall, IDE LSP, compliance pack, ticketing, multi-tenancy, ML detectors, marketplace UI, SSO/SCIM/RBAC) with commercial alternatives. - package.json: bugs.url, CONTRIBUTING/SECURITY/CHANGELOG in files whitelist for npm publishing. Changed: - SECURITY.md rewritten. Supported-versions table from stale 5.1.x to current reality (7.3.x active, 7.0-7.2 best-effort, <7.0 EOL). Best-effort solo response timeline. Scope expanded to bin/. - Scanner VERSION constants synced to plugin version. Was 6.0.0 in dashboard-aggregator and posture-scanner. - package.json repository.url corrected from fromaitochitta/ to open/. - README "Feedback & contributing" links to CONTRIBUTING.md. Fixed: - pre-compact-scan size-cap timing test ceiling raised 500ms -> 1000ms. Was a flake on Intel Mac and CI under load. Design target unchanged (<500ms, documented in CLAUDE.md). Notes: - First patch on the stabilization line (post-2026-05-01). - Wave E attack-simulator scenarios deferred indefinitely; coverage remains at 72. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-01 06:14:03 +02:00
Kjell Tore Guttormsen	4bd7cd5056	docs(config-audit): v5.0.0 brief + implementation plan Planning artifacts for v5.0.0 (token-economy round): - v5-brief.md: scope brief with 22 items (F1-F7 + M1-M8 + N1-N7), revised with Avklaringer-section after critical review (N7 dropped, M3+N6 merged, N5 promoted to v5.0.0, SC-6/SC-10 reformulated) - v5-plan.md: 31-step implementation plan in 5 sessions (alpha.1 → alpha.2 → beta.1 → rc.1 → release). B+ score (84/100) after plan-critic + scope-guardian review addressed all blockers/majors/gaps. - v5-implementation-log.md: per-session status record (skeleton) Sessions track via state files (REMEMBER.md, TODO.md gitignored; implementation-log.md committed; NEXT-SESSION-PROMPT.local.md gitignored). No code changes in this commit — planning only.	2026-05-01 06:10:44 +02:00
Kjell Tore Guttormsen	1f0b03b1e5	docs(graceful-handoff): 2.0 — sync README, CLAUDE.md, root README	2026-05-01 06:07:02 +02:00
Kjell Tore Guttormsen	cc38155fa6	feat(ultraplan-local): Spor 2: HANDOVER-CONTRACTS.md + PreCompact-hook (P0 progress.json drift fix) Reconciles divergence after parallel-session race: includes both Spor 1 wiring (validators into 4 commands + 1 agent) and Spor 2 (HANDOVER-CONTRACTS.md + PreCompact-hook). Spor 1 wiring (re-applied after rebase): - /ultrabrief-local Phase 4g: brief-validator post-write - /ultraplan-local Phase 1: brief-validator --soft + research-validator --dir + architecture-discovery - planning-orchestrator Phase 5.5: plan-validator --strict replaces 3 grep -cE calls - /ultraexecute-local Phase 2.3 (--validate): plan-validator + progress-validator - YAML parser extension: list-of-dicts (must_contain), supports the v1.7 template format Spor 2 NEW: - docs/HANDOVER-CONTRACTS.md (~310 lines): single source of truth for the 5 pipeline handover formats, with fixed sub-headings (Producer / Consumer / Path / Frontmatter schema / Body invariants / Validation strategy / Versioning / Failure modes) - hooks/scripts/pre-compact-flush.mjs (NEW): fixes the documented P0 in docs/ultraexecute-v2-observations-from-config-audit-v4.md: * Fires on the PreCompact event (CC v2.1.105+) * Locates progress.json under .claude/projects// Compares stored current_step against git log {session_start_sha}..HEAD * Atomic write (tmp + rename), monotonic: current_step can never be reduced * Never blocks compaction (exit 0) - hooks/hooks.json registers the PreCompact hook Result: /ultraexecute-local --resume now works after context compaction, even with skill-driven execution. Docs: - README.md (plugin): "Quality infrastructure", "Handover contracts", "PreCompact resume integrity" - CLAUDE.md (plugin): points to HANDOVER-CONTRACTS.md + documents pre-compact-flush - README.md (marketplace root): bullet list of Spor 2 deliverables (resolved merge conflict from parallel session) Tests: 109 green (no regressions). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-01 06:07:01 +02:00
Kjell Tore Guttormsen	0707d03bea	chore(graceful-handoff): 2.0.0 — bump version, remove auto_discover, update CHANGELOG [skip-docs] Step 8 of v2.0 plan.	2026-05-01 06:06:25 +02:00
Kjell Tore Guttormsen	a67411ae26	feat(graceful-handoff): 2.0 — register hooks and statusLine in hooks.json [skip-docs] Step 7 of v2.0 plan. Registers SessionStart, Stop, and statusLine hooks. Note: statusLine top-level placement in hooks.json is an open assumption (brief Assumption 1) — verified to be valid JSON syntax; live smoke-test required to confirm Claude Code loads it from this location vs requiring settings.json placement.	2026-05-01 06:06:25 +02:00
Kjell Tore Guttormsen	4076bf904a	feat(graceful-handoff): 2.0 — SessionStart auto-load handoff on resume/compact [skip-docs] Step 6 of v2.0 plan. SessionStart hook fires on source: resume or source: compact, walks up to 3 levels searching for NEXT-SESSION-.local.md, injects content via additionalContext, and archives the file (rename to .archived.local.md) to prevent stale-load in later sessions. 9 tests cover sources, multi-level search, topic-slug variants, archive filtering, malformed payload.	2026-05-01 06:06:25 +02:00
Kjell Tore Guttormsen	81aba9a5f5	feat(graceful-handoff): 2.0 — Stop hook auto-execute + pipeline staging fix [skip-docs] Step 5 of v2.0 plan + critical pipeline fix. Stop hook (hooks/scripts/stop-context-monitor.mjs): - Estimates context usage from transcript size (chars/3.5 / window_size) - At ≥70%, spawns handoff-pipeline.mjs --auto --no-push synchronously - Reads context_window_size from payload (supports 1M windows) - Lock file at <transcript_dir>/.handoff-lock-<session_id> - Gracefully handles missing CLAUDE_PLUGIN_ROOT, missing transcript Pipeline fix (scripts/handoff-pipeline.mjs): - REMOVED `git add -A` (CLAUDE.md anti-pattern: scoops up unrelated WIP) - Now stages ONLY artifact + REMEMBER.md/TODO.md if present - New regression test 'pipeline never stages unrelated dirty files' Tests: 7 stop-hook tests use stub pipeline (no real git operations); 11 pipeline tests including new regression for explicit staging.	2026-05-01 06:06:25 +02:00
Kjell Tore Guttormsen	1efb1b3176	feat(graceful-handoff): 2.0 — statusLine context-percent hint [skip-docs] Step 4 of v2.0 plan. statusLine hook reads context_window.used_percentage from stdin payload and prints display-only hint at 60% / 70%. NEVER runs git (research/03 — statusLine scripts can be cancelled mid-flight, unsafe for side effects). 9 tests cover thresholds, null payload, malformed JSON. Includes hook-helper.mjs copied from llm-security as test infrastructure.	2026-05-01 06:06:25 +02:00
Kjell Tore Guttormsen	8d4e16bf8e	feat(graceful-handoff): 2.0 — JSON pipeline script with idempotency and confirm-on-commit [skip-docs] Step 2 of v2.0 plan. Deterministic Node script that classifies handoff type, renders artifact, and orchestrates commit/push with explicit confirmation. Handles detached HEAD, no-upstream, and idempotency (60s cooldown on clean tree). 10 tests cover dry-run, --auto path, interactive y/n, idempotency, robustness edge cases.	2026-05-01 06:06:25 +02:00
Kjell Tore Guttormsen	1a65d8e4d5	feat(graceful-handoff): 2.0 — migrate to skills/ with disable-model-invocation [skip-docs] Step 1 of v2.0 plan. Hard cut from commands/ to skills/ per Anthropic recommendation for new plugins. Frontmatter sets disable-model-invocation: true and pins model: claude-sonnet-4-6. Docs (README, CLAUDE.md, root README) deferred to Step 9 per plan.	2026-05-01 05:45:26 +02:00
Kjell Tore Guttormsen	65c9242160	feat(ultraplan-local): Spor 1 wave 2: 5 validators + doc-consistency, 108 tests green [skip-docs] 5 new validator modules (all with a CLI shim for invocation from commands): - brief-validator.mjs: frontmatter (type, brief_version, task, slug, research_topics, research_status), state machine (research_topics > 0 + skipped requires brief_quality: partial), body sections (Intent/Goal/Success Criteria) - research-validator.mjs: type=ultraresearch-brief, confidence ∈ [0,1], dimensions ≥ 1, body sections, --dir mode for batch validation - plan-validator.mjs: wrapper over plan-schema + manifest-yaml; enforces step-count == manifest-count, plan_version=1.7 - progress-validator.mjs: schema_version, status enum, current_step in range, step shape, checkResumeReadiness - architecture-discovery.mjs: EXTERNAL CONTRACT, drift-WARN not drift-FAIL; tolerates non-canonical file names, surfaces loose files as warnings Doc-consistency test pinning prose vs source-of-truth: - agents/.md count == CLAUDE.md agent-table rows - commands/.md mentioned in CLAUDE.md - command frontmatter.name == file name - templates/plan-template.md plan_version 1.7 invariant - settings.json only known scopes (ultraplan, ultraresearch) - settings.json no exploration or agentTeam (vestigial guard after Spor 0) - CLAUDE.md references all 4 pipeline commands Wave 1 + Wave 2 = 108 tests green. [skip-docs]: The test infrastructure is not user-facing until the Spor 1 wiring lands; README/CLAUDE.md will be updated when commands actually change behavior (next commit). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-01 05:39:47 +02:00
Kjell Tore Guttormsen	7219a5fe20	docs(readme): total overhaul for v7.3.0 Rewrites README.md from 919 → 484 lines (47% reduction). Modernized structure, all counts updated to v7.3.0 reality (commands 19→20, scanners 22→23, knowledge 19→22, tests 1665→1777), trimmed Version History to last 3 versions with link to CHANGELOG.md. Structural changes: - Removed dated "Prompt Injection Showcase (v5.0)" section - Removed verbose Directory Structure tree (file paths discoverable from CLAUDE.md and the file system itself) - Collapsed Knowledge Base 18-row table into 5-category summary - Merged "Architecture" mermaid + "What's inside" into single layered overview - Tightened Compliance & Governance, OWASP Coverage, Workflow Examples to essentials only - Added explicit v7.3.0 sections inline: - npm scope-hop typosquat in supply-chain hook (E13) - workflow-scanner W F L row in Scanners (E11) - .gitattributes post-clone advisory in remote scanning table (E12) - MCP cumulative-drift baseline + reset in Output verification + own subsection (E14) - rot13 + T7-T9 bash-normalize in Prompt injection + Destructive commands hooks (E3/E8/E9/E10) - env-var deprecation runway in Compliance & Governance (8.7) - Hook count corrected to 9 throughout (8.10) - New badges: commands-20, scanners-23, knowledge-22, tests-1777 Content preserved (load-bearing): - AI-generated disclosure - "no PRs accepted" framing - Sandbox defense-in-depth tables - OWASP coverage matrix - Defense philosophy section - Self-scan + malicious-skill-demo references - Recommended-combo with parry-guard Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-01 05:37:42 +02:00
Kjell Tore Guttormsen	205cdbf77f	feat(ultraplan-local): Spor 1 wave 1: lib/parsers + 66 tests green 7 new modules: - lib/util/result.mjs: Result shape with ok/fail/combine helpers - lib/util/frontmatter.mjs: hand-rolled YAML frontmatter parser (subset, zero deps) - lib/parsers/plan-schema.mjs: v1.7 step regex + forbidden-heading detection (Fase/Phase/Stage/Steg) - lib/parsers/manifest-yaml.mjs: per-step Manifest YAML extraction with regex validation - lib/parsers/project-discovery.mjs: finds brief/research/architecture/plan/progress in the project directory - lib/parsers/arg-parser.mjs: $ARGUMENTS for all 4 commands with flag schema - lib/parsers/bash-normalize.mjs: lifted from hooks/scripts/pre-bash-executor.mjs 6 test files (66 tests total), all green: - frontmatter (CRLF/BOM, scalars, lists, indent-rejection) - plan-schema (positive Step form, negative Fase/Phase/Stage/Steg, numbering, slicing) - manifest-yaml (extraction, parsing, regex validation, missing-key detection) - project-discovery (sorted research, architecture-detection, phase-requirements) - arg-parser (boolean/valued/multi-value flags, quoted positional, unknown flags) - bash-normalize (\${x}/\\\\evasion, ANSI-stripping, full canonicalize-pipeline) Prepares for Wave 2 (validators) and Spor 1 wiring into commands. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-01 05:35:28 +02:00
Kjell Tore Guttormsen	c4183b8b4d	chore(release): bump to v7.3.0 Batch C release. Closes 12 implementation tasks (E3, E8-E14, 8.4, 8.6, 8.7, 8.10) across four execution waves: A (bash + decoder), B (supply chain + workflow scanner), C (MCP cumulative drift), D (code quality). Wave E (9 new attack-simulator scenarios for the new defenses) deferred to v7.3.1 — defenses are unit-tested per wave; the deferred work adds attack-simulator regression coverage on top, not the primary safety net. Tests: 1665+ → 1777 (Wave A-D cumulative, +112). Version sync targets touched: - package.json - .claude-plugin/plugin.json - CLAUDE.md (header) - README.md (badge + new release-history row) - scanners/ide-extension-scanner.mjs (VERSION constant) - ../../README.md (marketplace root plugin entry) - CHANGELOG.md (new [7.3.0] section per Keep a Changelog, all 12 task IDs covered individually under Added/Changed/Documentation/Tests/Notes) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-01 05:28:45 +02:00
Kjell Tore Guttormsen	1016914fc1	chore(ultraplan-local): Spor 0: foundation for the v3.1.0 quality program - package.json with node:test runner and scripts (test, simulate), zero deps - settings.json: remove vestigial exploration and agentTeam blocks (verified via grep to be read by no code) - docs/: commit subagent-delegation-audit.md and ultraexecute-v2-observations-from-config-audit-v4.md (both real architecture notes) - docs/: archive ultra-suite-brief_2.md as _archive- (was a paste from other plugin work, irrelevant here) - tests/helpers/hook-helper.mjs copied from llm-security with a provenance comment Preparation for Spor 1 (lib/ modules), Spor 2 (HANDOVER-CONTRACTS + PreCompact-hook), Spor 3 (bug fixes + CC features). Plan: ~/.claude/plans/det-neste-vi-gj-r-eventual-adleman.md Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-01 05:27:44 +02:00
Kjell Tore Guttormsen	ab504bdf8c	refactor(marketplace): split cc-architect from ultraplan-local into its own plugin Extract `/ultra-cc-architect-local` and `/ultra-skill-author-local` plus all 7 supporting agents, the `cc-architect-catalog` skill (13 files), the `ngram-overlap.mjs` IP-hygiene script, and the skill-factory test fixtures from `ultraplan-local` v2.4.0 into a new `ultra-cc-architect` plugin v0.1.0. Why: ultraplan-local had drifted into containing two distinct domains: a universal planning pipeline (brief → research → plan → execute) and a Claude-Code-specific architecture phase. Keeping them together forced users to inherit an unfinished CC-feature catalog (~11 seeds) when they only wanted the planning pipeline, and locked the catalog and the pipeline into the same release cadence. The architect was already optional and decoupled at the code level; only one filesystem touchpoint remained (auto-discovery of `architecture/overview.md`), which already handles absence gracefully. Plugin manifests: - ultraplan-local: 2.4.0 → 3.0.0 (description + keywords updated) - ultra-cc-architect: new at 0.1.0 (pre-release; catalog is thin, Phase 2/3 of skill-factory unbuilt, decision-layer empty, fallback list still needed) What stays in ultraplan-local: brief/research/plan/execute commands, all 19 planning agents, security hooks, plan auto-discovery of `architecture/overview.md` (filesystem-level contract, not code-level). What moved (28 files via git mv, R100; full history preserved): - 2 commands, 8 agents, 1 skill catalog (13 files), 2 scripts, 8 fixtures Documentation updates: plugin CLAUDE.md and README.md for both plugins, root README.md (added ultra-cc-architect section, updated ultraplan-local section), root CLAUDE.md (added ultra-cc-architect to the repo structure), marketplace.json (registered ultra-cc-architect), ultraplan-local CHANGELOG.md (v3.0.0 entry with migration guidance). Test verification: ngram-overlap.test.mjs passes 23/23 from new location. Memory updated: feedback_no_architect_until_v3.md now points at the new plugin and reframes the threshold around catalog maturity rather than an ultraplan-local milestone. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-30 17:18:47 +02:00
Kjell Tore Guttormsen	97c5c9d934	docs(claude-md): 8.10 — fix hooks count + add doc-consistency test for hook-table sync	2026-04-30 17:12:49 +02:00
Kjell Tore Guttormsen	ba5f2b64ad	feat(policy-loader): 8.7 — env-var deprecation warnings (v8.0.0 removal)	2026-04-30 17:11:07 +02:00
Kjell Tore Guttormsen	e8ea75fe6b	docs(hardening-guide): 8.6 — sandbox-architecture rationale (no code consolidation)	2026-04-30 16:55:45 +02:00
Kjell Tore Guttormsen	2b7329151c	docs(severity): 8.4 — @deprecated annotation on riskScoreV1	2026-04-30 16:54:37 +02:00
Kjell Tore Guttormsen	001df2ebe8	feat(commands): E14 part 3 — /security mcp-baseline-reset slash command Wave C step C3: closes E14 with the user-facing reset command. After a legitimate MCP server upgrade the sticky baseline (added in C1) becomes a stale "what the tool used to say" anchor and every subsequent post-mcp-verify advisory will re-flag the change. /security mcp-baseline-reset lets the user acknowledge the upgrade so the next call seeds a fresh baseline. New files: - scanners/mcp-baseline-reset.mjs — small CLI wrapper around clearBaseline / listBaselines. Modes: --list (read-only), --target <name>, no-args (all). Outputs JSON summary on stdout. Exit 0 always (idempotent). - commands/mcp-baseline-reset.md — dispatcher following mcp-inspect.md shape. Frontmatter: name=security:mcp-baseline-reset, sonnet model, Read/Bash/AskUserQuestion tools. 4-step body (list -> confirm scope -> execute -> confirm result). - tests/scanners/mcp-baseline-reset.test.mjs — 10 CLI tests across --list, --target, clear-all, idempotency, history preservation, and bare-positional sugar. Updated: - commands/security.md — new row in commands table after mcp-inspect. - CLAUDE.md — new commands-table row + new v7.3.0 narrative section describing the baseline schema, cumulative-drift detection, reset semantics, and the LLM_SECURITY_MCP_CACHE_FILE override. - Plugin README.md — new MCP-baseline-reset row in commands table, scanner count 12 standalone -> 13 standalone, new "MCP Description Drift (E14, v7.3.0)" subsection explaining the sticky baseline, cumulative threshold, reset semantics, and env-var override. - Root marketplace README.md — scanner count 22 -> 23 (10 orchestrated + 13 standalone), command count 19 -> 20, test count 1511 -> 1768. Wave C complete: 1738 -> 1768 tests (+30 across C1/C2/C3). Per plan, Wave C does NOT bump the plugin version — that lands at the wave-bundle release. 
The advisory text in post-mcp-verify already references the new command path so the user has a ready remediation step. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-30 16:49:01 +02:00
Kjell Tore Guttormsen	427b68eca9	feat(post-mcp-verify): E14 part 2 — cumulative-drift MEDIUM advisory [skip-docs] Wave C step C2: surface the cumulative-drift signal from checkDescriptionDrift() (added in C1) as a separate MEDIUM advisory with finding category mcp-cumulative-drift. Independent of the existing per-update drift advisory — a slow-burn rug-pull that keeps each update below the 10% per-update threshold but cumulatively drifts >=25% from the sticky baseline now triggers the new advisory without ever crossing the per-update bar. The advisory references /security mcp-baseline-reset (added in C3) so the user knows how to acknowledge a legitimate MCP server upgrade. CLAUDE.md updates: - post-mcp-verify hooks-table row mentions per-update + cumulative drift - mcp-description-cache lib bullet documents baseline schema, history, cumulative threshold policy key, and LLM_SECURITY_MCP_CACHE_FILE override. Tests: 2 new hook tests using LLM_SECURITY_MCP_CACHE_FILE for cache isolation. Existing 68 still pass; total 70. Plugin README and root marketplace README updates land in C3 alongside the new /security mcp-baseline-reset slash command (combined Wave-C doc update per plan §"Wave C — Touch" list). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-30 16:40:52 +02:00
Kjell Tore Guttormsen	eaac830300	feat(mcp-description-cache): E14 part 1: baseline + history schema (cumulative drift) [skip-docs] Wave C step C1: extend the MCP description cache schema with a sticky baseline slot per tool and a rolling history array (last 10 drift events). Cumulative drift = levenshtein(current, baseline) / max(|current|, |baseline|); emits a separate signal when ratio >= mcp.cumulative_drift_threshold (default 0.25). Per-update drift logic and threshold unchanged. - loadCache(): TTL purge now skips entries with a baseline, preserving cumulative-drift detection across the 7-day window. v7.2.0 entries (no history field) are migrated on read by seeding baseline from the current description and adding an empty history array. Entries with history but no baseline (post-clearBaseline) are NOT re-seeded. - checkDescriptionDrift(): when an entry exists with history but no baseline (i.e. baseline was cleared), the next call re-seeds baseline from the incoming description so the legitimate next version becomes the new baseline. - clearBaseline(toolName?): removes baseline for one tool or all tools. Preserves description / firstSeen / lastSeen / history. - listBaselines(): read-only listing for the upcoming reset CLI. - LLM_SECURITY_MCP_CACHE_FILE env var override for end-to-end testing. - New policy key mcp.cumulative_drift_threshold (default 0.25). Tests: 23 new unit tests; existing 10 still pass. Docs deferred: CLAUDE.md update lands in C3 alongside the new /security mcp-baseline-reset command. C2 adds the hooks-table footer note. Combined wave docs match plan §"Wave C — Touch" list. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-30 16:37:33 +02:00
Kjell Tore Guttormsen	ede37219a3	feat(workflow-scanner): E11 part 2 — re-interpolation + auth-bypass + WFL prefix + orchestrator Closes E11. Three new pieces, plus integration: 1. Re-interpolation detector (Appsmith GHSL-2024-277 stealth pattern). The scanner now collects env: bindings (key -> source-expression text) by walking parsed events whose parentChain includes 'env', then for each `${{ env.<KEY> }}` inside run:, re-injects MEDIUM if the binding source matches the 23-field blacklist. This catches the pattern where developers apply env-indirection but then re-interpolate the env var in run:, which cancels the mitigation (template substitution happens before shell parsing). 2. Auth-bypass category (Synacktiv 2023 Dependabot spoofing). Detects `if: ${{ github.actor == 'dependabot[bot]' }}` and variants. MEDIUM, owasp: 'LLM06' (Excessive Agency). Distinct from injection — same expression syntax, different threat class. Recommendation steers users to `github.event.pull_request.user.login`. 3. severity.mjs OWASP map registration. WFL prefix added to all four maps: - OWASP_MAP['WFL'] = ['LLM02', 'LLM06'] - OWASP_AGENTIC_MAP['WFL'] = ['ASI04'] - OWASP_SKILLS_MAP['WFL'] = [] - OWASP_MCP_MAP['WFL'] = [] Empty arrays for skills/MCP are explicit, not omitted — keeps `Object.keys(OWASP_MAP)` symmetric across maps. 4. scan-orchestrator.mjs registration. workflowScan added between supply-chain and toxic-flow (toxic-flow correlates after primaries). Verified via integration: orchestrator emits 9 WFL findings on tests/fixtures/workflows/. Bug fix: extractTriggers in workflow-yaml-state.mjs was collecting sub-properties (`branches:`, `types:`) as triggers. Now tracks the first nested indent level and ignores anything deeper. Tests: - 6 new cases in tests/scanners/workflow-scanner.test.mjs: re-interp TP, no-double-count, auth-bypass TP, auth-bypass FP (startsWith head_ref is not auth-bypass), OWASP map shape, orchestrator import + SCANNERS array entry. 
- 2 new fixtures: tp-reinterpolation.yml, auth-bypass-dependabot.yml. - Existing 14 scanner tests + 15 state-machine tests unchanged. Test count: 1732 -> 1738 (+6). Wave B total: +53 over baseline 1685. Pre-compact-scan flake unchanged (passes in isolation).	2026-04-30 15:57:10 +02:00
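
Note on commit eaac830300: the cumulative-drift ratio it describes (Levenshtein distance between the current and baseline descriptions, divided by the longer length, compared against a default threshold of 0.25) can be sketched as follows. This is a minimal illustration only; the function names `levenshtein`, `cumulativeDriftRatio`, and `exceedsCumulativeThreshold` are hypothetical and are not taken from the plugin's source, and the handling of two empty strings is an assumption.

```javascript
// Hypothetical sketch; these names are illustrative, not the plugin's API.

// Edit distance using a two-row dynamic programme over UTF-16 code units.
function levenshtein(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost  // substitution (or match)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Ratio from the commit message:
// levenshtein(current, baseline) / max(|current|, |baseline|).
function cumulativeDriftRatio(current, baseline) {
  const longest = Math.max(current.length, baseline.length);
  if (longest === 0) return 0; // assumption: two empty strings mean no drift
  return levenshtein(current, baseline) / longest;
}

// The default of 0.25 mirrors the stated default for
// mcp.cumulative_drift_threshold; the comparison is >= as in the commit.
function exceedsCumulativeThreshold(current, baseline, threshold = 0.25) {
  return cumulativeDriftRatio(current, baseline) >= threshold;
}
```

Because the ratio is measured against the sticky baseline and not against the previous description, a run of small edits that each stay under a per-update threshold can still add up to a ratio at or above 0.25.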


254 commits