feat(ms-ai-architect): sitemap-based KB change detection system
Adds a zero-dependency Node.js pipeline that polls Microsoft Learn sitemaps weekly to detect when source documentation changes. Replaces the broken mtime-based staleness check (all files had identical mtime after release).

Components:
- build-registry.mjs: extracts 1342 URLs from 387 reference files
- poll-sitemaps.mjs: streams ~18 child sitemaps, matches against registry
- report-changes.mjs: prioritized change report (critical/high/medium/low)
- discover-new-urls.mjs: finds relevant new MS Learn pages not yet covered
- run-weekly-update.mjs: orchestrator with --force/--discover/--dry-run

Integration:
- session-start hook reads change-report.json instead of broken mtime check
- hook triggers background poll if >7 days since last check
- generate-skills --update reads change report for targeted MCP updates

Current stats: 69% match rate (924/1342 URLs tracked via sitemaps). ~31% unmatched due to Microsoft URL restructuring (ai-foundry/openai paths).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
parent 035255fc5d
commit f968f37be3
13 changed files with 976 additions and 59 deletions
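The matching step described in the commit message (poll sitemaps, match against the URL registry, report a match rate) could look roughly like the sketch below. It is not the code from poll-sitemaps.mjs: the function names, the registry shape (a `Map` from URL to the last seen `lastmod` value), and the use of `<lastmod>` as the change signal are all assumptions made for illustration. It also parses a whole XML string with a regex, whereas the commit says the real script streams the sitemaps.

```javascript
// Hypothetical sketch; names and data shapes are assumptions, not the commit's code.

// Extract { loc, lastmod } pairs from a sitemap XML string.
function parseSitemap(xml) {
  const entries = [];
  for (const m of xml.matchAll(/<url>([\s\S]*?)<\/url>/g)) {
    const loc = /<loc>\s*([^<\s]+)\s*<\/loc>/.exec(m[1]);
    const lastmod = /<lastmod>\s*([^<\s]+)\s*<\/lastmod>/.exec(m[1]);
    if (loc) entries.push({ loc: loc[1], lastmod: lastmod ? lastmod[1] : null });
  }
  return entries;
}

// registry: Map<url, previously seen lastmod string | null> (assumed shape).
// A URL counts as changed when the sitemap's lastmod is newer than the stored one.
function detectChanges(registry, entries) {
  const matched = new Set();
  const changed = [];
  for (const { loc, lastmod } of entries) {
    if (!registry.has(loc)) continue;
    matched.add(loc);
    const previous = registry.get(loc);
    if (lastmod && previous && Date.parse(lastmod) > Date.parse(previous)) {
      changed.push(loc);
    }
  }
  const matchRate = registry.size ? matched.size / registry.size : 0;
  return { changed, matched: matched.size, matchRate };
}
```

With 924 of 1342 registry URLs found in the sitemaps, `matchRate` in this sketch would come out at about 0.69, which is the figure quoted in the commit message.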
@@ -234,7 +234,9 @@ When invoked with `--update`, the command updates existing stale files instead o
 
 **Workflow:**
 
-1. Run `bash scripts/kb-staleness-check.sh --json` to identify stale files
+1. Read `scripts/kb-update/data/change-report.json` for source-aware change detection
+- If not available, fall back to `bash scripts/kb-staleness-check.sh --json`
+- The change report contains `changed_urls` per file — use these for targeted MCP fetches
 2. Sort by priority (Critical > High > Medium > Low)
 3. For each stale file, dispatch an update agent with this prompt:
 
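Steps 1 to 3 of the workflow in this hunk (read the change report, sort by priority, hand each file its `changed_urls`) can be sketched as follows. The commit does not show the schema of change-report.json, so the `files` array and the `path` and `priority` field names are assumptions. Only `changed_urls` and the four priority levels come from the diff and the commit message.

```javascript
// Hypothetical sketch; the report schema below is assumed, not taken from the commit.
const PRIORITY_ORDER = { critical: 0, high: 1, medium: 2, low: 3 };

// Turn a parsed change report into an ordered list of update jobs.
// Files without changed URLs are skipped; unknown priorities sort last.
function planUpdates(report) {
  const files = Array.isArray(report?.files) ? report.files : [];
  return files
    .filter((f) => Array.isArray(f.changed_urls) && f.changed_urls.length > 0)
    .sort((a, b) => (PRIORITY_ORDER[a.priority] ?? 4) - (PRIORITY_ORDER[b.priority] ?? 4))
    .map((f) => ({ file: f.path, priority: f.priority, urls: f.changed_urls }));
}
```

Each resulting job carries the URLs that would fill the `{changed_urls ...}` placeholder in the update agent prompt. A missing or malformed report yields an empty plan, which is the point where the documented fallback to `kb-staleness-check.sh --json` would apply.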
@@ -247,10 +249,14 @@ Oppdater filen: {FILE_PATH}
 ## Eksisterende innhold (les først)
 Les filen med Read-verktøyet. Bevar strukturen.
 
+## Endrede kilde-URLer (hent disse først)
+{changed_urls from change-report.json — if available}
+
 ## Steg 1: Research
 Bruk MCP-verktøy for å verifisere og oppdatere:
-1. microsoft_docs_search — 2-3 søk for siste oppdateringer
-2. microsoft_docs_fetch — les oppdatert dokumentasjon
+1. microsoft_docs_fetch — hent de endrede kilde-URLene direkte (hvis tilgjengelig)
+2. microsoft_docs_search — 2-3 søk for siste oppdateringer
+3. microsoft_docs_fetch — les ytterligere oppdatert dokumentasjon ved behov
 
 ## Steg 2: Oppdater med Edit
 Bruk Edit-verktøyet (IKKE Write) for å:
@@ -277,7 +283,9 @@ status: success|no_changes|failed
 
 Before generating new knowledge base content, check for stale files:
 
-1. Run `bash scripts/kb-staleness-check.sh` to identify stale files
+1. Read `scripts/kb-update/data/change-report.json` for source-aware staleness data
+- This is generated by `node scripts/kb-update/run-weekly-update.mjs` (polls Microsoft Learn sitemaps)
+- Fallback: `bash scripts/kb-staleness-check.sh` (mtime-based, less accurate)
 2. Prioritize regeneration of stale files by priority (Critical > Low)
-3. When regenerating a file, update its `Sist oppdatert:` header to today's date
-4. After regeneration, verify the file with the staleness checker
+3. When regenerating a file, update its `Last updated:` header to today's date
+4. After update, run `node scripts/kb-update/build-registry.mjs --merge` to refresh URL registry
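The fallback rule in step 1 of this hunk, combined with the "more than 7 days since last check" poll trigger from the commit message, might reduce to a small decision function like the one below. This is a sketch under assumptions: the `last_checked` field name, the return shape, and the choice to trigger a poll when no report exists are not taken from the commit.

```javascript
// Hypothetical sketch; `last_checked` and the return shape are assumptions.
const WEEK_MS = 7 * 24 * 60 * 60 * 1000;

// report: parsed change-report.json, or null when the file is missing or unreadable.
function chooseStalenessSource(report, now = new Date()) {
  if (!report) {
    // No report yet: use the mtime-based script and request a first poll.
    return { source: 'mtime-fallback', triggerPoll: true };
  }
  const lastChecked = Date.parse(report.last_checked);
  const overdue = Number.isNaN(lastChecked) || now.getTime() - lastChecked > WEEK_MS;
  return { source: 'change-report', triggerPoll: overdue };
}
```

Keeping the decision free of file and process access makes it easy to test. The caller would do the actual reading of `scripts/kb-update/data/change-report.json` and the spawning of `run-weekly-update.mjs`.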