feat(ms-ai-architect): sitemap-based KB change detection system
Adds a zero-dependency Node.js pipeline that polls Microsoft Learn sitemaps
weekly to detect when source documentation changes. Replaces the broken
mtime-based staleness check (all files had identical mtime after release).

Components:
- build-registry.mjs: extracts 1342 URLs from 387 reference files
- poll-sitemaps.mjs: streams ~18 child sitemaps, matches against registry
- report-changes.mjs: prioritized change report (critical/high/medium/low)
- discover-new-urls.mjs: finds relevant new MS Learn pages not yet covered
- run-weekly-update.mjs: orchestrator with --force/--discover/--dry-run

Integration:
- session-start hook reads change-report.json instead of broken mtime check
- hook triggers background poll if >7 days since last check
- generate-skills --update reads change report for targeted MCP updates

Current stats: 69% match rate (924/1342 URLs tracked via sitemaps).
~31% unmatched due to Microsoft URL restructuring (ai-foundry/openai paths).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
parent 035255fc5d
commit f968f37be3
13 changed files with 976 additions and 59 deletions
@ -487,29 +487,37 @@ bash tests/capture-fixture.sh <source-file> <section-header> <output-dir>
### Knowledge Base Maintenance
The plugin includes a systematic process for keeping reference documents current. See `docs/kb-update-policy.md` for the full policy (update frequencies per domain, procedures, quality gates).
The plugin includes a sitemap-based change detection system that tracks when Microsoft Learn source pages are updated. This replaces the previous mtime-based staleness check.
**Staleness checking:**
**Automated change detection (sitemap-based):**
```bash
# Human-readable report
bash scripts/kb-staleness-check.sh
# Weekly update: poll sitemaps → compare → generate change report
node scripts/kb-update/run-weekly-update.mjs --force
# Machine-readable JSON output
bash scripts/kb-staleness-check.sh --json
# Include discovery of new relevant pages
node scripts/kb-update/run-weekly-update.mjs --force --discover
# Write report to file
bash scripts/kb-staleness-check.sh --json --output report.json
# View change report only (after polling)
node scripts/kb-update/report-changes.mjs
```
**Knowledge base regeneration:**
The session-start hook automatically triggers a background poll if >7 days since the last check.
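The seven-day rule can be illustrated with a minimal sketch. The function name `isPollDue` and the ISO-timestamp input are assumptions made for illustration; the actual hook may store and compare the last-check time differently.

```javascript
// Hypothetical sketch of the ">7 days since the last check" rule.
// isPollDue and its inputs are illustrative, not the hook's real API.
const WEEK_MS = 7 * 24 * 60 * 60 * 1000;

function isPollDue(lastCheckedIso, now = new Date()) {
  // No recorded check yet: treat the poll as due.
  if (!lastCheckedIso) return true;
  const last = Date.parse(lastCheckedIso);
  // Unparseable timestamp: poll rather than silently skip.
  if (Number.isNaN(last)) return true;
  return now.getTime() - last > WEEK_MS;
}

// Eight days after the last check the poll is due; four days after it is not.
console.log(isPollDue("2025-01-01T00:00:00Z", new Date("2025-01-09T00:00:00Z")));
console.log(isPollDue("2025-01-01T00:00:00Z", new Date("2025-01-05T00:00:00Z")));
```

Treating a missing or malformed timestamp as "due" is a design choice in this sketch: it errs toward an extra poll instead of leaving the change report stale.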
**How it works:**
1. `build-registry.mjs` extracts 1342 unique `learn.microsoft.com` URLs from reference files
2. `poll-sitemaps.mjs` fetches Microsoft Learn sitemaps and compares `<lastmod>` dates
3. `report-changes.mjs` generates a prioritized list of files needing update
4. `discover-new-urls.mjs` finds relevant new pages not yet covered
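The `<lastmod>` comparison in step 2 can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `poll-sitemaps.mjs`: the function names, the registry shape (URL mapped to last-seen date), and the example URLs are all hypothetical, and the real script streams the sitemaps where this sketch parses a string in memory.

```javascript
// Hypothetical sketch: compare sitemap <lastmod> dates against a URL registry.
// parseSitemap, findChanged, and the registry shape are assumptions.

// Collect { url -> lastmod } from a sitemap XML string.
function parseSitemap(xml) {
  const entries = new Map();
  const urlBlock = /<url>([\s\S]*?)<\/url>/g;
  for (const [, body] of xml.matchAll(urlBlock)) {
    const loc = body.match(/<loc>\s*([^<\s]+)\s*<\/loc>/);
    const lastmod = body.match(/<lastmod>\s*([^<\s]+)\s*<\/lastmod>/);
    if (loc && lastmod) entries.set(loc[1], lastmod[1]);
  }
  return entries;
}

// Return registry URLs whose sitemap <lastmod> is newer than the recorded date.
// URLs missing from the sitemap are skipped (the "unmatched" case).
function findChanged(registry, sitemapEntries) {
  const changed = [];
  for (const [url, lastSeen] of Object.entries(registry)) {
    const lastmod = sitemapEntries.get(url);
    if (lastmod && Date.parse(lastmod) > Date.parse(lastSeen)) {
      changed.push({ url, lastSeen, lastmod });
    }
  }
  return changed;
}

// Made-up example data; these URLs are placeholders.
const xml = `<urlset>
  <url><loc>https://learn.microsoft.com/en-us/azure/example-a</loc><lastmod>2025-02-10</lastmod></url>
  <url><loc>https://learn.microsoft.com/en-us/azure/example-b</loc><lastmod>2024-12-01</lastmod></url>
</urlset>`;

const registry = {
  "https://learn.microsoft.com/en-us/azure/example-a": "2025-01-01",
  "https://learn.microsoft.com/en-us/azure/example-b": "2025-01-01",
  "https://learn.microsoft.com/en-us/azure/example-c": "2025-01-01",
};

const changed = findChanged(registry, parseSitemap(xml));
console.log(changed.map((c) => c.url));
```

Regex extraction keeps the sketch dependency-free, in the spirit of the zero-dependency pipeline, but it only handles the simple `<url>` layout shown here.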
**Knowledge base update:**
```bash
# Incremental update based on change report (targets changed sources via MCP)
/architect:generate-skills --update
# Full regeneration via MCP research
/architect:generate-skills
# Incremental update (Edit existing files instead of rewriting)
/architect:generate-skills --update
```
Category-to-skill routing is defined in `scripts/skill-gen/category-skill-map.json` (20 categories mapped to 5 skills), used by the generate-skills workflow to place new reference documents in the correct skill directory.
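As a rough illustration of how such a routing map might be consulted, the sketch below assumes a flat category-to-skill object. The real schema of `category-skill-map.json` is not shown here, so the map shape, the category and skill names, the `resolveSkillDir` helper, and the directory layout are all hypothetical.

```javascript
// Hypothetical sketch: the real category-skill-map.json schema may differ.
// Category names, skill names, and the directory layout are placeholders.
const categorySkillMap = {
  "example-category-a": "example-skill-1",
  "example-category-b": "example-skill-1",
  "example-category-c": "example-skill-2",
};

// Resolve the directory where a new reference document for `category` belongs.
function resolveSkillDir(category, map, baseDir = "skills") {
  if (!Object.hasOwn(map, category)) {
    // Fail loudly so an unmapped category is never filed under the wrong skill.
    throw new Error(`No skill mapping for category: ${category}`);
  }
  return `${baseDir}/${map[category]}/references`;
}

console.log(resolveSkillDir("example-category-c", categorySkillMap));
```

Several categories can point at the same skill, which is how a many-to-few mapping such as 20 categories to 5 skills would be expressed in this shape.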