docs(architect): weekly KB update — 52 files refreshed (2026-04)

Key content changes:
- MLOps: MLflow 3 scorers expanded (RetrievalRelevance, Fluency, multi-turn judges)
- MLflow 3 A/B eval: mirror_traffic GA confirmed, new scorer catalog
- CI/CD: OIDC auth replaces deprecated --sdk-auth (Azure ML GitHub Actions)
- Agent framework A2A: updated SDK patterns (A2ACardResolver, BearerAuth)
- AG-UI backend tool rendering: accurate TOOL_CALL_* event shapes
- Computer Use agents: US region requirement, credentials patterns
- Purview governance: bulk term edit, expire/delete workflows
- CAF AI Secure: 3-phase structure confirmed current
- Copilot Studio: Claude Sonnet 4.5/4.6 GA, new orchestration controls
- M365 manifest: v1.26 GA (April 2026), copilotAgents node
- Power Platform: agent flow capacity enforcement corrected
- Azure Monitor: Simple Log Alerts GA, AMBA for policy-based alerting
- Security Copilot: SCU capacity model (400 SCU/1000 users)
- EU Data Boundary: all EU + EFTA countries confirmed
- gateway-multi-backend: added 4th topology, subscription-level quota note

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Kjell Tore Guttormsen 2026-04-10 11:31:11 +02:00
commit 34c6db36fa
40 changed files with 398 additions and 239 deletions

@@ -1,6 +1,6 @@
# Monitoring and Alerting for Failover Detection
**Last updated:** 2026-02
**Last updated:** 2026-04
**Status:** GA
**Category:** Business Continuity & Disaster Recovery
@@ -427,22 +427,22 @@ Azure Monitor Application Insights now offers dedicated support for AI agents vi
| **Live metrics** | Real-time health during failover scenarios |
| **Availability tests** | Automatic health checks of agent endpoints |
### Instrumentation guidance per agent platform
### Instrumentation guidance per agent platform (Verified MCP 2026-04)
- **Azure AI Foundry agents:** Connect Application Insights to the Foundry project for automatic tracing
- **Copilot Studio agents:** Configure built-in telemetry export to App Insights
- **Microsoft Agent Framework (self-hosted):** Use the Azure Monitor OpenTelemetry Distro
- **LangChain/LangGraph and OpenAI Agents SDK:** Use the Azure AI OpenTelemetry Tracer
- **Azure AI Foundry agents:** Start with [tracing setup in Foundry](https://learn.microsoft.com/azure/foundry/observability/how-to/trace-agent-setup). Connect Application Insights to the Foundry project for automatic tracing. You can also use the Azure Monitor OpenTelemetry Distro with the Foundry SDK.
- **Copilot Studio agents:** Configure built-in telemetry export to App Insights through the settings in Copilot Studio.
- **Microsoft Agent Framework (self-hosted):** Use the Azure Monitor OpenTelemetry Distro to send telemetry to Azure Monitor.
- **LangChain/LangGraph and OpenAI Agents SDK:** Use the Azure AI OpenTelemetry Tracer. Framework-specific guidance is available in the Foundry docs.
**Recommendation:** Give each agent a unique name so they can be told apart in the Agent details view. Use the same App Insights resource for agents that are part of a larger system.
**Recommendation:** Give each agent a unique name so they can be told apart in the Agent details view. Use the same App Insights resource for agents that are part of a larger system. To see agents in Azure AI Foundry as well as in Azure Monitor, [connect the App Insights resource to the Foundry project](https://learn.microsoft.com/azure/foundry/observability/how-to/trace-agent-setup#connect-application-insights-to-your-foundry-project).
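
The recommendation above (a unique name per agent, one shared App Insights resource) could be sketched roughly as follows. This is a minimal illustration, not the documented setup: `build_agent_env` and the agent names are hypothetical, and it assumes the agent name is surfaced through the standard `OTEL_SERVICE_NAME` variable and that the exporter reads `APPLICATIONINSIGHTS_CONNECTION_STRING`. Which attribute the Agent details view actually groups on should be checked against the Foundry tracing docs.

```python
def build_agent_env(agent_names, connection_string):
    """Return per-agent environment settings that share one App Insights resource.

    Hypothetical helper: the mapping of agent name to OTEL_SERVICE_NAME is an
    assumption made for illustration only.
    """
    # Reject duplicates so every agent stays distinguishable in telemetry.
    if len(set(agent_names)) != len(agent_names):
        raise ValueError("agent names must be unique")

    return {
        name: {
            # Standard OpenTelemetry variable for the logical service name.
            "OTEL_SERVICE_NAME": name,
            # Same connection string for all agents in the same system.
            "APPLICATIONINSIGHTS_CONNECTION_STRING": connection_string,
        }
        for name in agent_names
    }


if __name__ == "__main__":
    # Placeholder connection string; substitute the value from your own resource.
    settings = build_agent_env(
        ["failover-planner", "failover-notifier"],
        "InstrumentationKey=00000000-0000-0000-0000-000000000000",
    )
    for agent, env in settings.items():
        print(agent, env["OTEL_SERVICE_NAME"])
```

Each returned mapping can be applied to the environment of the corresponding agent process before its telemetry exporter is initialized.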
## References
- [Monitor Azure OpenAI](https://learn.microsoft.com/en-us/azure/ai-foundry/openai/how-to/monitor-openai) — OpenAI monitoring and alerting
- [Monitor Azure AI Search](https://learn.microsoft.com/en-us/azure/search/monitor-azure-cognitive-search) — AI Search monitoring
- [Azure Monitor alerts overview](https://learn.microsoft.com/en-us/azure/azure-monitor/alerts/alerts-overview) — Alert framework *(Verified MCP 2026-04)* — Stateful vs. stateless alerts, Simple Log Search Alerts (preview) for per-row evaluation, query-based metric alerts for Prometheus/OTel (public preview). Alert processing rules for suppression during planned maintenance. Up to 5 action groups per alert rule.
- [Azure Monitor alerts overview](https://learn.microsoft.com/en-us/azure/azure-monitor/alerts/alerts-overview) — Alert framework *(Verified MCP 2026-04)* — Stateful vs. stateless alerts. **Simple Log Search Alerts** (GA) for per-row KQL evaluation, giving faster alerting than traditional log alerts. **Query-based metric alerts** for Prometheus/OTel (public preview). Alerts are stored for 30 days. Fired instances are read-only. Alert processing rules for suppression during planned maintenance. **Azure Monitor Baseline Alerts** (`aka.ms/amba`) for policy-based alerting at scale via Azure Policy.
- [Health modeling and observability of mission-critical workloads](https://learn.microsoft.com/en-us/azure/well-architected/mission-critical/mission-critical-health-modeling) — Health modeling
- [Application Insights overview](https://learn.microsoft.com/en-us/azure/azure-monitor/app/app-insights-overview) — APM for applications *(Verified MCP 2026-04)* Now OpenTelemetry-based (OTel) as the primary instrumentation. New features: **Agent details view** for AI agents from Foundry, Copilot Studio, and third-party agents. Supports: Azure AI Foundry (via Foundry SDK tracing), Copilot Studio (built-in telemetry → App Insights), Microsoft Agent Framework (self-hosted), LangChain/LangGraph, and OpenAI Agents SDK. Batch and continuous evaluations for production traffic. Live Metrics for real-time observability during failover scenarios.
- [Application Insights overview](https://learn.microsoft.com/en-us/azure/azure-monitor/app/app-insights-overview) — APM for applications *(Verified MCP 2026-04)* OpenTelemetry (OTel) is the primary instrumentation. AI agents are supported via the Agents tab in getting started. Azure Functions supports OTel via `"telemetryMode": "OpenTelemetry"` in `host.json`. New views: **Agent details view** (Foundry, Copilot Studio, third-party), **SDK Stats** (exporter success/drop metrics), **Dashboards with Grafana** (directly in the Azure portal). Evaluations: batch (local/cloud/portal) and continuous (production traffic). Classic API SDKs are being migrated to OTel; see the migration guide. Fired alert instances are now read-only (they cannot be edited after they have fired).
- [Azure Service Health](https://learn.microsoft.com/en-us/azure/service-health/overview) — Azure service status
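
The `host.json` setting mentioned in the Application Insights entry can be illustrated with a small sketch. The helper name `enable_otel` is hypothetical and only the `telemetryMode` key and its value come from the text above; any other settings a Function app needs for OpenTelemetry export are out of scope here.

```python
import json


def enable_otel(host_json_text: str) -> str:
    """Return host.json content with telemetryMode set to OpenTelemetry.

    Hypothetical helper for illustration; it only touches the one key
    named in the reference entry and leaves everything else unchanged.
    """
    config = json.loads(host_json_text)
    config["telemetryMode"] = "OpenTelemetry"
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    # A minimal host.json as a starting point; real files usually hold more settings.
    print(enable_otel('{"version": "2.0"}'))
```

Running the helper on an existing `host.json` keeps all other keys intact, which makes it straightforward to review the change in a diff.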
## For Cosmo