docs(voyage): jsonl schema audit — field-allowlist input for v4.1 otel exporter
Step 1 of v4.1-execute (Wave 1, Session 1).
Audit all 6 trek*-stats.jsonl schemas + lib/stats/event-emit.mjs autonomy
events + hooks/scripts/post-bash-stats.mjs PostToolUse Bash records. Produce a
markdown table {schema_id, fields[], writer_path, line_ref, v4.1 additive,
PII} as load-bearing input to Step 11 (field-allowlist) and Step 8 (stats
plumbing).
Special notes:
- command_excerpt from post-bash-stats.mjs is flagged CWE-212 (improper
  cross-boundary removal of sensitive data); the export MUST hard-exclude it
  unless VOYAGE_EXPORT_INCLUDE_COMMAND_EXCERPT=1 is set explicitly (deferred to v4.2)
- v4.1 additive fields enumerated per schema: profile, phase_models,
  parallel_agents, external_research_enabled, profile_source
- EXPORT_ALLOWLIST + EXPORT_DENYLIST excerpts at the bottom as a pre-definition
  of the Step 11 inline static consts
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
parent ce9b06dd16
commit 0bdfc02e75
1 changed files with 76 additions and 0 deletions
76
plugins/voyage/tests/fixtures/jsonl-schemas.md
vendored
Normal file
@@ -0,0 +1,76 @@
# Voyage JSONL stats — schema audit (v4.1 input)

> **Purpose:** Field-allowlist input for the v4.1 OTel exporter (Step 11). Lists every
> field that each Voyage stats JSONL writer emits today, plus the additive fields v4.1
> introduces. Load-bearing for Step 11 (field-allowlist) and Step 8 (stats plumbing).
>
> **PII-flag:** `command_excerpt` from `hooks/scripts/post-bash-stats.mjs` slices
> the first 120 chars of an arbitrary Bash command, so it may contain operator paths,
> branch names, or fragments of secrets that survived the secrets-hook. This is CWE-212
> (Improper Cross-boundary Removal of Sensitive Data). The OTel exporter MUST
> NOT export this field unless the operator explicitly opts in via
> `VOYAGE_EXPORT_INCLUDE_COMMAND_EXCERPT=1` (opt-in deferred to v4.2; v4.1 hard-excludes).
>
> **Additive v4.1 fields:** `profile`, `phase_models`, `parallel_agents`,
> `external_research_enabled`, `profile_source`. All are forward-compat: existing
> v4.0 consumers ignore unknown keys, while v4.1 consumers get richer signal.
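
The forward-compat claim above can be illustrated with a minimal sketch. It assumes a v4.0-style consumer that reads only the keys it already knows; `readV40Record` and the sample values are hypothetical and not taken from the Voyage codebase.

```javascript
// Hypothetical v4.0-style consumer for trekcontinue-stats lines.
// It copies only the v4.0 keys, so additive v4.1 keys are silently ignored.
const V40_TREKCONTINUE_KEYS = ["ts", "project", "next_session_label", "status"];

function readV40Record(jsonlLine) {
  const record = JSON.parse(jsonlLine);
  const known = {};
  for (const key of V40_TREKCONTINUE_KEYS) {
    if (key in record) known[key] = record[key];
  }
  return known;
}

// Illustrative v4.1 line; every value here is made up for the example.
const sampleLine = JSON.stringify({
  ts: "2025-01-01T00:00:00Z",
  project: "demo",
  next_session_label: "Session 2",
  status: "ready",
  profile: "balanced",       // v4.1 additive
  profile_source: "default", // v4.1 additive
});

console.log(readV40Record(sampleLine));
```

A consumer written this way keeps working when v4.1 writers start emitting `profile` and `profile_source`, which is the property the additive-fields paragraph relies on.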

## Field table per JSONL writer

| schema_id | fields | writer_path | line_ref | v4.1 additive | PII |
|-----------|--------|-------------|----------|---------------|-----|
| trekbrief-stats | ts, task, slug, mode, interview_turns, review_iterations, brief_quality, research_topics, auto_research, auto_result, project_dir | commands/trekbrief.md (orchestrator-emit Phase 7) | trekbrief.md:657-672 | profile, phase_models, profile_source | none |
| trekresearch-stats | ts, question, mode, scope, slug, project_dir, brief_path, dimensions, agents_local, agents_external, gemini_used, confidence, contradictions, open_questions | commands/trekresearch.md (orchestrator-emit Stats tracking) | trekresearch.md:388-410 | profile, phase_models, parallel_agents, external_research_enabled, profile_source | none |
| trekplan-stats | ts, task, mode, slug, brief_path, project_dir, codebase_size, codebase_files, agents_deployed, deep_dives, research_briefs_used, research_scout_used, critic_verdict, guardian_verdict, outcome | commands/trekplan.md (orchestrator-emit Phase 12) | trekplan.md:805-826 | profile, phase_models, parallel_agents, profile_source | none |
| trekexecute-stats (Phase 9 record) | ts, plan, plan_type, mode, result, steps_total, steps_passed, steps_failed, steps_skipped, failed_at_step | commands/trekexecute.md (orchestrator-emit Phase 9) | trekexecute.md:1479-1494 | profile, phase_models, profile_source | none |
| trekexecute-stats (autonomy events) | ts, event, known_event, payload | lib/stats/event-emit.mjs `emit()` | event-emit.mjs:64-86 | payload.profile, payload.phase_models, payload.profile_source | none |
| trekexecute-stats (PostToolUse Bash) | ts, session_id, command_excerpt, duration_ms, success | hooks/scripts/post-bash-stats.mjs (Bash PostToolUse) | post-bash-stats.mjs:42-54 | none (hook is plugin-level, not profile-aware) | command_excerpt (CWE-212) |
| trekreview-stats | ts, slug, verdict, counts (BLOCKER/MAJOR/MINOR/SUGGESTION), reviewed_files_count, mode, duration_ms | commands/trekreview.md (orchestrator-emit Phase 8) | trekreview.md:255 | profile, phase_models, profile_source | none |
| trekcontinue-stats | ts, project, next_session_label, status | commands/trekcontinue.md (orchestrator-emit Phase 5) | trekcontinue.md:289 | profile, profile_source | none |

## Field-allowlist input for Step 11

The OTel exporter (Step 11 `lib/exporters/field-allowlist.mjs`) MUST inline the
following static const arrays (and NOT load them from this file at runtime; the
explicit Step 11 constraint is INLINE static const, NOT a runtime read from tests/fixtures):

**EXPORT_ALLOWLIST** (numeric/bool/short-string fields safe for OTel metric labels):

```
ts, slug, mode, brief_quality, auto_research, auto_result,
codebase_size, codebase_files, agents_deployed, deep_dives,
agents_local, agents_external, gemini_used, dimensions, confidence,
contradictions, open_questions, interview_turns, review_iterations,
research_topics, research_briefs_used, research_scout_used,
critic_verdict, guardian_verdict, outcome, plan_type, result,
steps_total, steps_passed, steps_failed, steps_skipped, failed_at_step,
verdict, reviewed_files_count, duration_ms, status, next_session_label,
event, known_event, success, scope,
profile, profile_source, parallel_agents, external_research_enabled
```

**EXPORT_DENYLIST** (PII or high-cardinality, never export):

```
task, question, project_dir, project, plan, brief_path, command_excerpt, payload, counts, phase_models, session_id
```

> Notes:
> - `task` and `question` may contain user-content prose → high-cardinality + PII risk.
> - `project_dir` and paths leak filesystem layout.
> - `command_excerpt` per CWE-212 above.
> - `phase_models` is a structured object (6 keys), so it is too high-cardinality for a label;
>   the profile name (`profile`) is the safe summary. v4.2 may revisit if operators ask.
> - `counts` (review BLOCKER/MAJOR/MINOR/SUGGESTION) is a nested object. The Step 11
>   exporter flattens it to `voyage_review_counts_blocker`/`_major`/`_minor`/`_suggestion`
>   metrics rather than a label.
> - `session_id` is a UUID: high-cardinality, not useful as a label, log-only.
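
As a minimal sketch of how the two lists could be inlined and applied, the block below copies the array contents from the blocks above. The helper `filterExportFields`, the deny-wins ordering, and dropping keys that appear in neither list are assumptions made for illustration; the real Step 11 module may differ, and would presumably `export` these names.

```javascript
// Inline static consts, copied from the EXPORT_ALLOWLIST / EXPORT_DENYLIST
// blocks in this audit (no runtime read from tests/fixtures).
const EXPORT_ALLOWLIST = Object.freeze([
  "ts", "slug", "mode", "brief_quality", "auto_research", "auto_result",
  "codebase_size", "codebase_files", "agents_deployed", "deep_dives",
  "agents_local", "agents_external", "gemini_used", "dimensions", "confidence",
  "contradictions", "open_questions", "interview_turns", "review_iterations",
  "research_topics", "research_briefs_used", "research_scout_used",
  "critic_verdict", "guardian_verdict", "outcome", "plan_type", "result",
  "steps_total", "steps_passed", "steps_failed", "steps_skipped", "failed_at_step",
  "verdict", "reviewed_files_count", "duration_ms", "status", "next_session_label",
  "event", "known_event", "success", "scope",
  "profile", "profile_source", "parallel_agents", "external_research_enabled",
]);

const EXPORT_DENYLIST = Object.freeze([
  "task", "question", "project_dir", "project", "plan", "brief_path",
  "command_excerpt", "payload", "counts", "phase_models", "session_id",
]);

const ALLOWED = new Set(EXPORT_ALLOWLIST);
const DENIED = new Set(EXPORT_DENYLIST);

// Hypothetical helper: keep only allowlisted top-level keys.
// Assumption: deny wins over allow, and unlisted keys are dropped.
function filterExportFields(record) {
  const out = {};
  for (const [key, value] of Object.entries(record)) {
    if (DENIED.has(key)) continue;
    if (ALLOWED.has(key)) out[key] = value;
  }
  return out;
}

// Illustrative PostToolUse Bash record; the values are made up.
const bashRecord = {
  ts: "2025-01-01T00:00:00Z",
  session_id: "00000000-0000-0000-0000-000000000000",
  command_excerpt: "git push origin feature/x",
  duration_ms: 42,
  success: true,
};
console.log(filterExportFields(bashRecord));
```

Dropping unlisted keys means a field added by a future writer stays out of the export until someone classifies it, which matches the v4.1 hard-exclude stance on `command_excerpt`.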

## Cross-reference

- Step 8 (stats plumbing): adds `profile` + `phase_models` + `profile_source` to all
  6 orchestrator-emit sites listed above.
- Step 11 (field-allowlist): codifies the EXPORT_ALLOWLIST/DENYLIST arrays above
  as inline static consts in `lib/exporters/field-allowlist.mjs`.
- Step 9 (Prometheus textfile): emits one metric line per allowlist-numeric field
  per JSONL writer; PII-flagged fields are dropped at the format layer, not the export layer.
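
The Step 9 item and the `counts` flattening note can be sketched together as follows. The `voyage_<schema>_<field>` naming is an assumption extrapolated from the `voyage_review_counts_blocker` example; `flattenCounts`, `toTextfileLines`, and the sample record are hypothetical.

```javascript
// Hypothetical: turn { BLOCKER: 0, ... } into { counts_blocker: 0, ... }.
function flattenCounts(counts) {
  const out = {};
  for (const [severity, n] of Object.entries(counts ?? {})) {
    out[`counts_${severity.toLowerCase()}`] = n;
  }
  return out;
}

// Hypothetical formatter: one Prometheus textfile line per numeric field.
// `numericFields` is the caller-supplied subset of the allowlist to emit.
function toTextfileLines(schema, record, numericFields) {
  const fields = { ...record, ...flattenCounts(record.counts) };
  const lines = [];
  for (const [key, value] of Object.entries(fields)) {
    const isFlattenedCount = key.startsWith("counts_");
    if (!isFlattenedCount && !numericFields.includes(key)) continue;
    // Skip anything that is not a finite number (strings, objects, NaN).
    if (typeof value !== "number" || !Number.isFinite(value)) continue;
    lines.push(`voyage_${schema}_${key} ${value}`);
  }
  return lines;
}

// Illustrative trekreview-stats record; the values are made up.
const reviewLines = toTextfileLines(
  "review",
  {
    verdict: "PASS",
    reviewed_files_count: 4,
    duration_ms: 1200,
    counts: { BLOCKER: 0, MAJOR: 1, MINOR: 2, SUGGESTION: 3 },
  },
  ["reviewed_files_count", "duration_ms"],
);
console.log(reviewLines.join("\n"));
```

In this sketch the nested `counts` object never becomes a label; each severity is emitted as its own metric line, as the note on `counts` describes.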