docs(architect): weekly KB update — 106 files refreshed (2026-04)

Updates across all 5 skills: ms-ai-advisor, ms-ai-engineering, ms-ai-governance, ms-ai-security, ms-ai-infrastructure.

Key changes:
- Language Services (Custom Text Classification, Text Analytics, QnA): retirement warning 2029-03-31, migration guides to Foundry/GPT-4o
- Agentic Retrieval: 50M free reasoning tokens/month (Public Preview)
- Computer Use: Claude Sonnet 4.5 (preview) + OpenAI CUA models
- Agent Registry: Risks column (M365 E7), user-shared/org-published types
- Declarative agents: schema v1.5 → v1.6, Store validation requirements
- MLflow 3: 13 built-in LLM judges, production monitoring, Genie Code
- AG-UI HITL: ApprovalRequiredAIFunction (C#) + @tool(approval_mode) (Python)
- Entra ID Ignite 2025: Agent ID Admin/Developer RBAC roles, Conditional Access
- Security Copilot: 400 SCU/month per 1000 M365 E5 licenses, auto-provisioned
- Fast Transcription API: phrase lists, 14-language multi-lingual transcription
- Azure Monitor Workbooks: Bicep support, RBAC specifics
- Power Platform Copilot: data residency (Norway/Europe → EU DB, Bing → USA)
- RAG security-rbac: 4-approach table (GA + 3 preview access control methods)
- IaC MLOps: Well-Architected OE:05 principles, Bicep/Terraform patterns
- Translator: image file batch translation Preview (JPEG/PNG/BMP/WebP)

All 106 files: Last updated 2026-04 | Verified: MCP 2026-04

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 09:13:24 +02:00 · 2026-04-10 09:13:24 +02:00 · ff6a50d14f
commit ff6a50d14f
parent dda86449fa
104 changed files with 1986 additions and 520 deletions
--- a/plugins/ms-ai-architect/skills/ms-ai-security/references/ai-security-engineering/ai-incident-response-procedures.md
+++ b/plugins/ms-ai-architect/skills/ms-ai-security/references/ai-security-engineering/ai-incident-response-procedures.md
@@ -1,6 +1,6 @@
 # AI Incident Response and Breach Handling Procedures

-**Last updated:** 2026-04
+**Last updated:** 2026-04 | Verified: MCP 2026-04
 **Status:** Established Practice
 **Category:** AI Security Engineering

@@ -586,7 +586,7 @@ Set-AzSecurityContact -Name "default1" `

 ### Confidence level

-**Verified (High Confidence):** All Azure-native tools, services, and incident response procedures are verified via Microsoft Learn MCP research (February 2026). Price estimates are based on official Azure pricing but can vary with currency fluctuation and regional pricing.
+**Verified (High Confidence):** All Azure-native tools, services, and incident response procedures are verified via Microsoft Learn MCP research (February 2026, re-verified April 2026). The CAF Secure AI document confirms: AI asset inventory via Azure Resource Graph, AI communication channel security (Managed Identities, Virtual Networks, APIM for MCP server endpoints), and Purview Insider Risk Management for detecting prompt-based data exfiltration. Price estimates are based on official Azure pricing but can vary with currency fluctuation and regional pricing.

 **Baseline (Model Knowledge):** General incident response framework (NIST SP 800-61), MITRE ATT&CK for ML, and best practices for forensics and chain of custody based on industry standards. Norwegian regulatory requirements verified via public sources (Datatilsynet, NSM, Lovdata).

--- a/plugins/ms-ai-architect/skills/ms-ai-security/references/ai-security-engineering/ai-threat-modeling-stride.md
+++ b/plugins/ms-ai-architect/skills/ms-ai-security/references/ai-security-engineering/ai-threat-modeling-stride.md
@@ -1,6 +1,6 @@
 # AI Threat Modeling Using STRIDE Framework

-**Last updated:** 2026-04
+**Last updated:** 2026-04 | Verified: MCP 2026-04
 **Status:** Established Practice
 **Category:** AI Security Engineering

--- a/plugins/ms-ai-architect/skills/ms-ai-security/references/ai-security-engineering/data-leakage-prevention-ai.md
+++ b/plugins/ms-ai-architect/skills/ms-ai-security/references/ai-security-engineering/data-leakage-prevention-ai.md
@@ -1,7 +1,7 @@
 # Data Leakage Prevention in AI Contexts

 **Category:** AI Security Engineering
-**Last updated:** 2026-04
+**Last updated:** 2026-04 | Verified: MCP 2026-04
 **Audience:** Enterprise AI architects and security teams

 ## Overview
@@ -758,6 +758,6 @@ az monitor metrics alert create \
 **Microsoft Learn sources:**
 - [Microsoft Purview DLP for Copilot](https://learn.microsoft.com/en-us/purview/dlp-microsoft365-copilot-location-learn-about)
 - [Azure AI Services DLP](https://learn.microsoft.com/en-us/azure/ai-services/cognitive-services-data-loss-prevention)
-- [Secure AI (Cloud Adoption Framework)](https://learn.microsoft.com/en-us/azure/cloud-adoption-framework/scenarios/ai/secure)
+- [Secure AI (Cloud Adoption Framework)](https://learn.microsoft.com/en-us/azure/cloud-adoption-framework/scenarios/ai/secure). Verified MCP 2026-04: confirms the use of Microsoft Purview DLP for AI workflows, content filtering to prevent leakage of sensitive information, and Purview Insider Risk Management for detecting prompt-based data exfiltration and identifying risky AI behavior.
 - [Artificial Intelligence Security (MCSB)](https://learn.microsoft.com/en-us/security/benchmark/azure/mcsb-v2-artificial-intelligence-security)
 - [Confidential AI](https://learn.microsoft.com/en-us/azure/confidential-computing/confidential-ai)
--- a/plugins/ms-ai-architect/skills/ms-ai-security/references/ai-security-engineering/entra-agent-id-zero-trust.md
+++ b/plugins/ms-ai-architect/skills/ms-ai-security/references/ai-security-engineering/entra-agent-id-zero-trust.md
@@ -1,7 +1,7 @@
 # Microsoft Entra Agent ID: Zero Trust for AI agent identities

 **Category:** AI Security Engineering
-**Last updated:** 2026-04
+**Last updated:** 2026-04 | Verified: MCP 2026-04
 **Status:** Public Preview (announced at Ignite, November 2025; extended preview; opt-out is temporary and will become mandatory for new agents) *(Verified MCP 2026-04)*
 **Audience:** Architects who need to secure AI agents with dedicated identities and Zero Trust principles

@@ -422,7 +422,7 @@ Når en Foundry-agent publiseres, endres identiteten fra delt prosjektidentitet
 8. [Governing Agent Identities (Preview)](https://learn.microsoft.com/entra/id-governance/agent-id-governance-overview) — Identity Governance for agents
 9. [Conditional Access for Agent ID (Preview)](https://learn.microsoft.com/entra/identity/conditional-access/agent-id) — Conditional Access for agent identities
 10. [Protect agent identities with Microsoft Entra](https://learn.microsoft.com/microsoft-agent-365/admin/capabilities-entra) — Microsoft Agent 365 integration
-11. [What's new at Microsoft Ignite 2025 - Microsoft Entra](https://learn.microsoft.com/entra/fundamentals/whats-new-ignite-2025) — Announcements and new documentation
+11. [What's new at Microsoft Ignite 2025 - Microsoft Entra](https://learn.microsoft.com/entra/fundamentals/whats-new-ignite-2025) — Announcements and new documentation. Verified MCP 2026-04: confirms 50+ new articles on Agent ID, new RBAC roles (Agent ID Administrator, Agent ID Developer, Agent Registry Administrator), Conditional Access for agent identities, Identity Protection for agents (risky agents concept), AI Prompt Shield (Entra Internet Access), and Security Copilot + Entra integrations.
 12. [Surfing the AI Wave: Manage, Govern, and Protect AI Agents with Microsoft Entra Agent ID](https://techcommunity.microsoft.com/blog/microsoft-entra-blog/surfing-the-ai-wave-manage-govern-and-protect-ai-agents-with-microsoft-entra-age/2464407) — Official Microsoft Entra blog, Ignite 2025

 ---
--- a/plugins/ms-ai-architect/skills/ms-ai-security/references/ai-security-engineering/security-copilot-integration.md
+++ b/plugins/ms-ai-architect/skills/ms-ai-security/references/ai-security-engineering/security-copilot-integration.md
@@ -1,7 +1,7 @@
 # Microsoft Security Copilot: AI-driven security operations platform

 **Category:** AI Security Engineering
-**Last updated:** 2026-04
+**Last updated:** 2026-04 | Verified: MCP 2026-04
 **Audience:** Security architects and SOC leaders evaluating AI-assisted security operations

 ## Introduction
@@ -420,14 +420,14 @@ Per 2026-02: Security Copilot er kun tilgjengelig på kommersielt skynivå — i

 ## Sources

-Based on official Microsoft Learn documentation (last verified 2026-04 via MCP): *(Verified MCP 2026-04)*
+Based on official Microsoft Learn documentation (last verified 2026-04 via MCP): *(Verified MCP 2026-04)*. The inclusion model (M365 E5 → 400 SCU per 1,000 licenses, max 10,000 SCU/month, zero-click provisioning) was confirmed via MCP fetch of security-copilot-inclusion and get-started-security-copilot.

 1. [What is Microsoft Security Copilot?](https://learn.microsoft.com/copilot/security/microsoft-security-copilot) — High-level product description
 2. [Microsoft Security Copilot agents overview](https://learn.microsoft.com/copilot/security/agents-overview) — Complete agent overview
 3. [Deploy AI agents in Microsoft Defender](https://learn.microsoft.com/defender-xdr/security-copilot-agents-defender) — Defender-specific agents
 4. [Security Copilot with Microsoft Sentinel](https://learn.microsoft.com/azure/sentinel/sentinel-security-copilot) — Sentinel integration
-5. [Learn about Security Copilot inclusion in Microsoft 365 E5](https://learn.microsoft.com/copilot/security/security-copilot-inclusion) — E5 licensing and SCU model
-6. [Get started with Microsoft Security Copilot](https://learn.microsoft.com/copilot/security/get-started-security-copilot) — Onboarding and licensing
+5. [Learn about Security Copilot inclusion in Microsoft 365 E5](https://learn.microsoft.com/copilot/security/security-copilot-inclusion) — E5 licensing and SCU model. Verified MCP 2026-04: confirms that rollout started 18 November 2025, 400 SCU/month per 1,000 user licenses (max 10,000 SCU/month), zero-click auto-provisioning with 30 days' advance notice, SCUs reset monthly, and Developer Experiences (Agent Builder, MCP, and Graph API integrations) are included.
+6. [Get started with Microsoft Security Copilot](https://learn.microsoft.com/copilot/security/get-started-security-copilot) — Onboarding and licensing. Verified MCP 2026-04: confirms two customer categories: M365 E5 customers (auto-provisioned) and non-E5 customers (manual onboarding with SCU provisioning). M365 E5 customers need no Azure setup or manual SCU assignment.
 7. [Create your own custom plugins](https://learn.microsoft.com/copilot/security/custom-plugins) — Custom plugins
 8. [Microsoft Security Copilot Phishing Triage Agent](https://learn.microsoft.com/defender-xdr/phishing-triage-agent) — Phishing Triage Agent details
 9. [Security Copilot agents in Intune overview](https://learn.microsoft.com/intune/agents/) — Intune-agenter
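
Note on the SCU inclusion figures cited in this hunk: the allocation rule (400 SCU per month per 1,000 licenses, capped at 10,000 SCU per month) reduces to simple arithmetic. The sketch below is a minimal illustration, assuming linear proration below 1,000 licenses with rounding down, which the hunk does not state. `included_scus` is a hypothetical helper name, not part of any Microsoft SDK.

```python
def included_scus(e5_licenses: int, scu_per_thousand: int = 400, monthly_cap: int = 10_000) -> int:
    """Estimate the monthly SCU allocation for a number of M365 E5 licenses.

    Assumption: allocation is prorated linearly and rounded down; only the
    400-per-1,000 ratio and the 10,000 cap come from the text above.
    """
    if e5_licenses < 0:
        raise ValueError("e5_licenses must be non-negative")
    return min(e5_licenses * scu_per_thousand // 1000, monthly_cap)
```

Under these assumptions the cap is reached at 25,000 licenses, so larger tenants would need to size any additional capacity separately.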
--- a/plugins/ms-ai-architect/skills/ms-ai-security/references/cost-optimization/multi-model-strategy-costs.md
+++ b/plugins/ms-ai-architect/skills/ms-ai-security/references/cost-optimization/multi-model-strategy-costs.md
@@ -1,6 +1,6 @@
 # Multi-Model Strategy: Cost-Performance Trade-offs

-**Last updated:** 2026-04
+**Last updated:** 2026-04 | Verified: MCP 2026-04
 **Status:** GA
 **Category:** Cost Optimization & FinOps for AI

@@ -638,7 +638,7 @@ az consumption usage list --start-date 2026-02-01 --end-date 2026-02-28 \

 **Microsoft Learn (MCP-verified):**
 1. [Model router for Azure AI Foundry](https://learn.microsoft.com/en-us/azure/ai-foundry/openai/concepts/model-router) — **Verified** (MCP fetch, 2026-04)
-2. [Use a gateway in front of multiple Azure OpenAI deployments](https://learn.microsoft.com/en-us/azure/architecture/ai-ml/guide/azure-openai-gateway-multi-backend) — **Verified** (MCP fetch, 2026-04)
+2. [Use a gateway in front of multiple Azure OpenAI deployments](https://learn.microsoft.com/en-us/azure/architecture/ai-ml/guide/azure-openai-gateway-multi-backend) — **Verified** (MCP fetch, 2026-04). The document confirms: (a) terminating and re-establishing credentials at the gateway is recommended over passing client credentials through, (b) the gateway provides client-based usage tracking and chargeback support, (c) Azure OpenAI is now tagged as "Foundry Tools / Azure OpenAI in Foundry Models".
 3. [Understanding costs associated with provisioned throughput units (PTU)](https://learn.microsoft.com/en-us/azure/ai-foundry/openai/how-to/provisioned-throughput-onboarding) — **Verified** (MCP search, 2026-04)
 4. [Azure OpenAI in Azure AI Foundry Models](https://learn.microsoft.com/en-us/azure/ai-foundry/openai/concepts/models) — **Verified** (MCP search, 2026-04)
 5. [GPT-4o vs GPT-4o mini model selection](https://learn.microsoft.com/en-us/azure/ai-foundry/openai/whats-new) — **Verified** (MCP search, 2026-04)
--- a/plugins/ms-ai-architect/skills/ms-ai-security/references/cost-optimization/observability-cost-reduction.md
+++ b/plugins/ms-ai-architect/skills/ms-ai-security/references/cost-optimization/observability-cost-reduction.md
@@ -1,6 +1,6 @@
 # Observability and Monitoring Cost Optimization

-**Last updated:** 2026-04
+**Last updated:** 2026-04 | Verified: MCP 2026-04
 **Status:** GA
 **Category:** Cost Optimization & FinOps for AI

@@ -169,6 +169,8 @@ builder.Services.AddApplicationInsightsTelemetry(new ApplicationInsightsServiceO
 | **Alerts** | Supported | Not supported | Not supported |
 | **Retention** | 30-730 days | 8 days interactive + long-term | Long-term only |
 | **Price (ingestion)** | Standard | ~50% lower | ~75% lower |
+| **Workspace replication** | ✅ | ✅ | ❌ (data is not replicated, so there is no protection against a regional failure) |
+| **Customer Lockbox** | ✅ | ✅ | ❌ (the Lockbox interface does not apply to Auxiliary tables) |

 **Decision tree:**
 1. **Do you need real-time alerting?** → Analytics
@@ -441,7 +443,7 @@ For volumer >1 TB/dag, vurder dedicated cluster for ytterligere besparelser (clu

 10. **Azure Monitor Logs overview: Table plans:**
    https://learn.microsoft.com/en-us/azure/azure-monitor/logs/data-platform-logs#table-plans
-    *Confidence: Verified* – Analytics, Basic, Auxiliary table plans.
+    *Confidence: Verified (MCP 2026-04)* – Analytics, Basic, Auxiliary table plans. Update 2026-04: the documentation confirms that the Auxiliary plan has no workspace replication (data is not protected against a regional failure) and no Customer Lockbox support.

 ### Norwegian law (baseline knowledge)

--- a/plugins/ms-ai-architect/skills/ms-ai-security/references/performance-scalability/async-processing-patterns.md
+++ b/plugins/ms-ai-architect/skills/ms-ai-security/references/performance-scalability/async-processing-patterns.md
@@ -1,6 +1,6 @@
 # Asynchronous Processing Patterns

-**Last updated:** 2026-02
+**Last updated:** 2026-04 | Verified: MCP 2026-04
 **Status:** GA
 **Category:** Performance & Scalability

@@ -428,6 +428,84 @@ public class AIRequestController : ControllerBase
 }
 ```

+
+## Event-Driven Architecture Styles (updated 2026-04)
+
+Microsoft documents two primary topologies for event-driven AI processing:
+
+### Broker topology vs. mediator topology
+
+| Aspect | Broker topology | Mediator topology |
+|--------|----------------|-------------------|
+| Coordination | Events are published directly to the broker | A central mediator coordinates the workflow |
+| Example | Azure Event Hubs + Service Bus | Azure Durable Functions |
+| Coupling | Loose coupling between producers and consumers | Tighter coupling through the mediator |
+| Use case | High-volume streaming, independent consumers | Complex AI workflows with dependencies |
+
+### Azure Event Hubs vs. Azure Event Grid
+
+| Service | Type | Use case |
+|---------|------|---------------|
+| **Azure Event Hubs** | Durable event stream (log) | AI inference results that many consumers need to process |
+| **Azure Event Grid** | Publish-subscribe, reactive | Trigger an AI job on file upload or blob change |
+| **Azure Service Bus** | Message queue, guaranteed delivery | Job queue for AI processing with retry and dead-lettering |
+
+### Challenges in event-driven AI architectures
+
+```python
+# Challenge 1: Guaranteed delivery
+# Use Service Bus with peek-lock so a message is only removed once the AI job has finished
+
+from azure.servicebus import ServiceBusClient
+import json
+import logging
+
+def process_ai_job_safely(
+    servicebus_conn: str,
+    queue_name: str,
+    ai_processor,
+    publish_result,  # callable that publishes the result downstream
+) -> None:
+    """Guaranteed delivery via the peek-lock pattern."""
+    with ServiceBusClient.from_connection_string(servicebus_conn) as sb:
+        with sb.get_queue_receiver(queue_name, max_wait_time=5) as receiver:
+            for message in receiver:
+                # Peek-lock: the message is locked, not deleted
+                try:
+                    payload = json.loads(str(message))
+                    result = ai_processor(payload)
+                    # Publish before completing, so a failed publish leads to redelivery
+                    publish_result(result)
+                    # Complete the message (remove it from the queue) only on success
+                    receiver.complete_message(message)
+                except Exception:
+                    # Abandon: the message returns to the queue for redelivery
+                    logging.exception("AI job failed; abandoning message")
+                    receiver.abandon_message(message)
+
+# Challenge 2: Eventual consistency
+# AI results are published asynchronously, so use a correlation ID for tracking
+
+def create_ai_job(correlation_id: str, payload: dict) -> dict:
+    """Return a job receipt immediately; the result arrives asynchronously."""
+    return {
+        "correlation_id": correlation_id,
+        "status": "accepted",
+        "result_url": f"/api/results/{correlation_id}",
+        "estimated_completion_seconds": 30
+    }
+
+# Challenge 3: Ordering guarantee
+# Event Hubs guarantees ordering within a single partition
+# Use the same partition key for related AI requests
+
+def publish_ordered_event(
+    producer,
+    partition_key: str,  # e.g. document ID
+    event_data: dict
+) -> None:
+    from azure.eventhub import EventData
+    event = EventData(json.dumps(event_data))
+    # Copy the key into the application properties so consumers can read it
+    event.properties = {"partition_key": partition_key}
+    producer.send_batch([event], partition_key=partition_key)
+```
+
 ## Norsk offentlig sektor

 - **Saksbehandlingssystemer**: Asynkron prosessering er ideelt for AI-assistert saksbehandling der analyse kan ta tid. Saksbehandler sender inn dokument, fortsetter med annet arbeid, og mottar notifikasjon når analysen er ferdig.
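
Note on the peek-lock example in this hunk: because an abandoned message is redelivered, the same AI job can reach the processor more than once. The sketch below shows one way to make the handler idempotent by caching results per message ID. It is a minimal illustration under stated assumptions: `IdempotentProcessor` is a hypothetical name, and the in-memory dictionary stands in for a shared, durable store that a real deployment would need.

```python
from typing import Callable, Dict


class IdempotentProcessor:
    """Wraps a handler so a redelivered message does not trigger a second AI run."""

    def __init__(self, handler: Callable[[dict], dict]) -> None:
        self._handler = handler
        # Hypothetical in-memory cache keyed by message ID (not durable).
        self._results: Dict[str, dict] = {}

    def handle(self, message_id: str, payload: dict) -> dict:
        # A redelivery with a known ID returns the stored result instead of re-running.
        if message_id in self._results:
            return self._results[message_id]
        result = self._handler(payload)
        self._results[message_id] = result
        return result
```

An instance of this wrapper could be passed as the `ai_processor` argument once the message ID is threaded through, so a redelivered message does not cost a second model call.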
--- a/plugins/ms-ai-architect/skills/ms-ai-security/references/performance-scalability/connection-pooling-patterns.md
+++ b/plugins/ms-ai-architect/skills/ms-ai-security/references/performance-scalability/connection-pooling-patterns.md
@@ -1,6 +1,6 @@
 # Connection Pooling Patterns

-**Last updated:** 2026-02
+**Last updated:** 2026-04 | Verified: MCP 2026-04
 **Status:** GA
 **Category:** Performance & Scalability

@@ -314,6 +314,56 @@ class ConnectionPoolLoadBalancer:
        raise Exception("All backends exhausted")
 ```

+
+## Azure API Management as a connection pooling layer (updated 2026-04)
+
+APIM handles backend connection pooling toward Azure OpenAI, which offloads the client side:
+
+### APIM backend pool configuration
+
+```xml
+<!-- APIM: backend pool with automatic connection management -->
+<policies>
+    <inbound>
+        <base />
+        <!-- APIM reuses backend connections automatically through an internal pool -->
+        <!-- Clients see APIM as a single endpoint -->
+        <set-backend-service id="aoai-pool" backend-id="aoai-norway-backend" />
+        
+        <!-- Add a correlation ID for tracing -->
+        <set-header name="x-correlation-id" exists-action="skip">
+            <value>@(context.RequestId)</value>
+        </set-header>
+    </inbound>
+    
+    <backend>
+        <retry condition="@(context.Response.StatusCode == 429 || context.Response.StatusCode >= 500)"
+               count="3" interval="0" first-fast-retry="true">
+            <!-- APIM retries against the backend pool; buffer the body so it can be resent -->
+            <forward-request timeout="120" buffer-request-body="true" />
+        </retry>
+    </backend>
+    
+    <outbound>
+        <!-- Expose elapsed time to the client; context.Elapsed covers the whole APIM pipeline so far, not only the backend call -->
+        <set-header name="x-backend-latency-ms" exists-action="override">
+            <value>@(context.Elapsed.TotalMilliseconds.ToString())</value>
+        </set-header>
+    </outbound>
+</policies>
+```
+
+### Backend configurations in APIM (4 topologies)
+
+Microsoft recommends these patterns for APIM connection pooling toward Azure OpenAI:
+
+1. **Single backend**: One APIM → one Azure OpenAI (simple, limited quota)
+2. **Multi-backend single region**: APIM with weighted round-robin across Azure OpenAI instances
+3. **Multi-subscription**: Separate Azure OpenAI instances in different subscriptions for quota scaling
+4. **Multi-region**: APIM in several regions, each with regional backends
+
+Clients never need to know how many backends exist; APIM handles routing transparently.
+
 ## Norsk offentlig sektor

 Connection pooling har spesielle hensyn for norsk offentlig sektor:
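
Note on topology 2 in this hunk: the weighted round-robin selection that the gateway performs can be pictured with a few lines of Python. This is only an illustration of the selection logic, not how APIM implements it; `weighted_round_robin` and the backend names in the usage note are hypothetical.

```python
import itertools
from typing import Dict, Iterator


def weighted_round_robin(weights: Dict[str, int]) -> Iterator[str]:
    """Yield backend names forever, each appearing in proportion to its weight."""
    # Expand each backend into `weight` slots, then cycle over the slots.
    slots = [name for name, weight in weights.items() for _ in range(weight)]
    if not slots:
        raise ValueError("at least one backend with a positive weight is required")
    return itertools.cycle(slots)
```

With weights `{"aoai-a": 3, "aoai-b": 1}`, three of every four picks go to `aoai-a`. The simple slot expansion sends consecutive requests to the same backend; a smoother interleaving would need a different algorithm.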
--- a/plugins/ms-ai-architect/skills/ms-ai-security/references/performance-scalability/gpu-compute-sizing.md
+++ b/plugins/ms-ai-architect/skills/ms-ai-security/references/performance-scalability/gpu-compute-sizing.md
@@ -1,6 +1,6 @@
 # GPU and Compute Sizing for AI

-**Last updated:** 2026-02
+**Last updated:** 2026-04 | Verified: MCP 2026-04
 **Status:** GA
 **Category:** Performance & Scalability

@@ -326,6 +326,78 @@ def compare_deployment_options(
    }
 ```

+
+## Azure ML Online Endpoints: updated (2026-04)
+
+Azure ML Online Endpoints come in two deployment types:
+
+| Type | Infrastructure | Administration | Use case |
+|------|---------------|----------------|---------------|
+| Managed Online Endpoint | Azure-managed | Minimal | Fastest way to get started; Azure manages the compute |
+| Kubernetes Online Endpoint | Customer-owned K8s cluster | Full control | On-premises, hybrid, special requirements |
+
+### Recommended workflow: local debug → Azure deploy
+
+```python
+# Step 1: Test the deployment locally
+from azure.ai.ml import MLClient
+from azure.ai.ml.entities import (
+    ManagedOnlineEndpoint,
+    ManagedOnlineDeployment,
+    OnlineRequestSettings,
+)
+from azure.identity import DefaultAzureCredential
+
+# Local test through the Azure CLI ml extension (requires Docker)
+import subprocess
+result = subprocess.run([
+    "az", "ml", "online-endpoint", "create",
+    "--local",
+    "--name", "my-endpoint",
+    "--file", "endpoint.yaml"
+], capture_output=True, text=True)
+
+# Step 2: Deploy to Azure (ManagedOnlineDeployment)
+ml_client = MLClient(
+    credential=DefaultAzureCredential(),
+    subscription_id="...",
+    resource_group_name="rg-ai",
+    workspace_name="my-ml-workspace"
+)
+
+endpoint = ManagedOnlineEndpoint(
+    name="my-production-endpoint",
+    description="GPU-accelerated inference",
+    auth_mode="key"
+)
+ml_client.online_endpoints.begin_create_or_update(endpoint).result()
+
+# ManagedOnlineDeployment: set instance_type to pick the GPU SKU
+deployment = ManagedOnlineDeployment(
+    name="blue",
+    endpoint_name="my-production-endpoint",
+    model="azureml:my-model:1",
+    instance_type="Standard_NC24ads_A100_v4",  # A100 GPU
+    instance_count=2,
+    environment="azureml:my-environment:1",
+    # request_settings takes an OnlineRequestSettings object, not a plain dict
+    request_settings=OnlineRequestSettings(
+        max_concurrent_requests_per_instance=4,
+        request_timeout_ms=90000,
+    ),
+)
+ml_client.online_deployments.begin_create_or_update(deployment).result()
+```
+
+### GPU instance types for inference (2026-04)
+
+| SKU | GPU | VRAM (total) | Use case |
+|-----|-----|------|---------------|
+| `Standard_NC6s_v3` | V100 (1x) | 16 GB | Medium models |
+| `Standard_NC24s_v3` | V100 (4x) | 64 GB | Larger models |
+| `Standard_NC24ads_A100_v4` | A100 (1x) | 80 GB | Large models (7B-13B) |
+| `Standard_ND96amsr_A100_v4` | A100 (8x) | 640 GB | Very large models (70B+) |
+
 ## Norsk offentlig sektor

 - **Anskaffelse**: GPU VM-er er kostbare — bruk Azure Reserved Instances (1-3 år) for 40-60% besparelse på forutsigbare workloads. Krever godkjenning i anskaffelsesprosess.
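
Note on the GPU SKU table in this hunk: the mapping from model size to SKU can be approximated with a rough memory estimate. The sketch below is a back-of-the-envelope helper, not a sizing tool. It assumes 2 bytes per parameter (FP16/BF16 weights) and a 20% overhead factor for activations and KV cache; both are assumptions, not figures from the source. `GpuSku`, `estimate_vram_gb`, and `pick_sku` are hypothetical names. The SKU data is copied from the table.

```python
from typing import List, NamedTuple


class GpuSku(NamedTuple):
    name: str
    gpu_count: int
    vram_per_gpu_gb: int


# Values copied from the SKU table above (per-GPU memory times GPU count).
SKUS: List[GpuSku] = [
    GpuSku("Standard_NC6s_v3", 1, 16),
    GpuSku("Standard_NC24s_v3", 4, 16),
    GpuSku("Standard_NC24ads_A100_v4", 1, 80),
    GpuSku("Standard_ND96amsr_A100_v4", 8, 80),
]


def estimate_vram_gb(params_billion: float, bytes_per_param: int = 2, overhead: float = 1.2) -> float:
    """Rough inference memory need: weights plus an assumed 20% overhead."""
    return params_billion * bytes_per_param * overhead


def pick_sku(params_billion: float) -> str:
    """Prefer a SKU where the model fits on one GPU; otherwise fall back to sharding."""
    need = estimate_vram_gb(params_billion)
    for sku in SKUS:
        if need <= sku.vram_per_gpu_gb:
            return sku.name
    # No single GPU is large enough: use total VRAM, which implies model sharding.
    for sku in SKUS:
        if need <= sku.gpu_count * sku.vram_per_gpu_gb:
            return sku.name
    raise ValueError(f"no listed SKU offers {need:.0f} GB of VRAM")
```

Under these assumptions a 7B or 13B model lands on `Standard_NC24ads_A100_v4` and a 70B model on `Standard_ND96amsr_A100_v4`, which agrees with the use-case column. Quantized weights or long context windows would shift the estimate.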
--- a/plugins/ms-ai-architect/skills/ms-ai-security/references/performance-scalability/model-distillation-performance.md
+++ b/plugins/ms-ai-architect/skills/ms-ai-security/references/performance-scalability/model-distillation-performance.md
@@ -1,6 +1,6 @@
 # Model Distillation for Performance

-**Last updated:** 2026-02
+**Last updated:** 2026-04 | Verified: MCP 2026-04
 **Status:** GA
 **Category:** Performance & Scalability

@@ -352,6 +352,80 @@ print(f"ROI: {savings['roi_months']} måneder")
 | Hyppig endring i oppgave | Unngå distillation | Re-training overhead |
 | Latens-kritisk (<500ms) | Distiller til nano + PTU | Lavest mulig responstid |

+
+## Model selection and routing strategy (updated 2026-04)
+
+Microsoft now documents **10 selection criteria** for choosing an AI model for distillation:
+
+| Criterion | Relevance for distillation |
+|-----------|--------------------------|
+| Task fit | Choose teacher and student based on the nature of the task |
+| **Routing strategy** | Define routing BEFORE distillation, since it affects the choice of teacher model |
+| Cost | The student model's cost is the primary motivation |
+| Context window | The student must handle the same context as the teacher |
+| Security | The student model does not inherit the teacher's safeguards, so re-evaluate |
+| Region | Deploy the student in the same region as the teacher for data residency |
+| Deployment | PTU vs Standard: the student usually starts on Standard |
+| Domain | A domain-specific teacher yields a better student |
+| Performance | Latency and throughput requirements for the student (see the model matrix) |
+| **Tunability** | The student model MUST support fine-tuning (e.g. GPT-4o-mini, GPT-4.1-nano) |
+
+### Model routing as a distillation strategy
+
+```python
+# Model routing strategy in a distillation context
+# Teacher: GPT-4.1 (highest quality)
+# Router: classify task complexity → pick the model dynamically
+# Student: GPT-4.1-mini or GPT-4.1-nano (based on the classification)
+
+from openai import AzureOpenAI
+
+client = AzureOpenAI(
+    azure_endpoint="https://my-foundry.openai.azure.com",
+    api_key="...",
+    api_version="2024-10-21"
+)
+
+def classify_task_complexity(user_input: str) -> str:
+    """Classify task complexity for routing."""
+    response = client.chat.completions.create(
+        model="gpt-4.1-nano",  # Fast and cheap for routing
+        messages=[{
+            "role": "system",
+            "content": "Classify this user request: 'simple' (facts, answers, classification) or 'complex' (reasoning, creative, multi-step). Answer with one word."
+        }, {"role": "user", "content": user_input}]
+    )
+    # content can be None, and the model may add casing or punctuation
+    return (response.choices[0].message.content or "").strip().lower()
+
+def route_to_model(user_input: str) -> str:
+    """Route to the right model based on complexity."""
+    complexity = classify_task_complexity(user_input)
+    
+    if complexity.startswith("simple"):
+        # With Azure OpenAI, `model` is the deployment name; this is a placeholder for the distilled nano deployment
+        model = "ft:gpt-4.1-nano:distilled-v1"
+    else:
+        model = "gpt-4.1"  # Teacher for complex tasks
+    
+    response = client.chat.completions.create(
+        model=model,
+        messages=[{"role": "user", "content": user_input}]
+    )
+    return response.choices[0].message.content
+
+# The routing strategy gives lower cost for simple tasks and high quality for complex ones
+```
+
+### Updated model matrix for distillation
+
+| Model | Tunability | Input TPM per PTU | Recommended student role |
+|--------|-----------|-----------------|----------------------|
+| GPT-4.1-nano | Yes | 59,400 | Simple tasks, latency-critical |
+| GPT-4o-mini | Yes | 37,000 | General tasks, cost-optimal |
+| GPT-4.1-mini | Yes | 14,900 | Moderate tasks, good balance |
+| GPT-4.1 | No (direct) | 3,000 | Teacher (not student) |
+| GPT-4o | No (direct) | 2,500 | Teacher (not student) |
+
 ## Referanser

 - [Azure OpenAI stored completions & distillation](https://learn.microsoft.com/azure/ai-foundry/openai/how-to/stored-completions) — Distillation workflow
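
Note on the model matrix in this hunk: if the TPM column is read as input tokens per minute per PTU (an interpretation of the column header, not something the hunk states outright), the number of PTUs for a given peak load is a ceiling division. The sketch below is a rough estimate only. It ignores output tokens and minimum deployment sizes, and `ptus_needed` is a hypothetical helper name.

```python
import math

# Figures copied from the model matrix above (assumed: input TPM per PTU).
INPUT_TPM_PER_PTU = {
    "GPT-4.1-nano": 59_400,
    "GPT-4o-mini": 37_000,
    "GPT-4.1-mini": 14_900,
    "GPT-4.1": 3_000,
    "GPT-4o": 2_500,
}


def ptus_needed(model: str, peak_input_tpm: int) -> int:
    """Round up to whole PTUs for a peak input-token rate."""
    return math.ceil(peak_input_tpm / INPUT_TPM_PER_PTU[model])
```

For a peak of 300,000 input tokens per minute this gives 100 PTUs on GPT-4.1 versus 6 on GPT-4.1-nano, which is the capacity argument for routing simple traffic to a distilled student.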
--- a/plugins/ms-ai-architect/skills/ms-ai-security/references/performance-scalability/rate-limit-management.md
+++ b/plugins/ms-ai-architect/skills/ms-ai-security/references/performance-scalability/rate-limit-management.md
@@ -1,6 +1,6 @@
 # Rate Limit Management

-**Last updated:** 2026-02
+**Last updated:** 2026-04 | Verified: MCP 2026-04
 **Status:** GA
 **Category:** Performance & Scalability

@@ -398,6 +398,60 @@ AzureMetrics
 """
 ```

+
+## Gateway Multi-Backend as a rate limit strategy (updated 2026-04)
+
+Microsoft documents a multi-backend gateway as the recommended architecture pattern for rate limit management, primarily via Azure API Management:
+
+### Recommended topologies for rate limit distribution
+
+| Topology | Quota capacity | Complexity | Recommended for |
+|----------|----------------|--------------|--------------|
+| Single instance | Baseline TPM | Low | Development, low traffic |
+| Multi-backend, single region | 2-5x baseline | Medium | Production, standard |
+| Multi-subscription | 5-20x baseline | High | High-traffic enterprise |
+| Multi-region | Nearly unlimited | High | Critical infrastructure |
+
+### APIM-based rate limit distribution
+
+```xml
+<!-- APIM policy: distribute rate limits across backends -->
+<policies>
+    <inbound>
+        <base />
+        
+        <!-- Token-based rate limiting in APIM (offloads Azure OpenAI) -->
+        <azure-openai-token-limit
+            counter-key="@(context.Request.Headers.GetValueOrDefault("x-client-id", "default"))"
+            tokens-per-minute="10000"
+            estimate-prompt-tokens="true"
+            tokens-consumed-variable-name="consumed-tokens"
+            remaining-tokens-variable-name="remaining-tokens" />
+        
+        <!-- Read the Norway East throttle deadline from the APIM cache; context variables do not survive between requests -->
+        <cache-lookup-value key="norway-throttle" variable-name="norway-throttle" default-value="@(0L)" />
+        
+        <!-- Pick a backend by availability: Norway East first, Sweden Central while Norway East is throttled -->
+        <set-variable name="backend-url" value="@{
+            if (context.Variables.GetValueOrDefault<long>("norway-throttle") < DateTimeOffset.UtcNow.ToUnixTimeSeconds())
+                return "https://aoai-norway.openai.azure.com";
+            return "https://aoai-sweden.openai.azure.com";
+        }" />
+        
+        <set-backend-service base-url="@(context.Variables.GetValueOrDefault<string>("backend-url"))" />
+    </inbound>
+    
+    <backend>
+        <retry condition="@(context.Response != null && context.Response.StatusCode == 429)" count="2" interval="0">
+            <choose>
+                <!-- Runs only on a retry pass, after a 429 has been received -->
+                <when condition="@(context.Response != null && context.Response.StatusCode == 429)">
+                    <cache-store-value key="norway-throttle" duration="60" value="@(
+                        DateTimeOffset.UtcNow.AddSeconds(
+                            double.Parse(context.Response.Headers.GetValueOrDefault("Retry-After", "10"))
+                        ).ToUnixTimeSeconds())" />
+                    <set-backend-service base-url="https://aoai-sweden.openai.azure.com" />
+                </when>
+            </choose>
+            <!-- Buffer the body so the request can be resent on retry -->
+            <forward-request buffer-request-body="true" />
+        </retry>
+    </backend>
+</policies>
+```
+
 ## Norsk offentlig sektor

 - **SLA-implikasjoner**: Standard Azure OpenAI deployments har ingen latens-SLA — 429-feil er forventet atferd under høy belastning. Dokumenter dette i tjenesteavtaler med interne brukere.
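
Note on the failover policy in this hunk: the underlying idea is to remember, per backend, a deadline derived from `Retry-After` and to skip that backend until the deadline passes. The sketch below restates that logic in Python so it can be reasoned about outside APIM. It is an illustration under stated assumptions: `BackendSelector` is a hypothetical class, and the injectable clock exists only to make the behavior testable.

```python
import time
from typing import Callable, Dict, List, Optional


class BackendSelector:
    """Pick the highest-priority backend that is not currently throttled."""

    def __init__(self, backends: List[str], clock: Callable[[], float] = time.monotonic) -> None:
        self._backends = list(backends)  # ordered by priority
        self._throttled_until: Dict[str, float] = {}
        self._clock = clock

    def mark_throttled(self, backend: str, retry_after_seconds: float = 10.0) -> None:
        # Record when the backend may be tried again (mirrors the Retry-After header).
        self._throttled_until[backend] = self._clock() + retry_after_seconds

    def pick(self) -> Optional[str]:
        now = self._clock()
        for backend in self._backends:
            if self._throttled_until.get(backend, float("-inf")) <= now:
                return backend
        return None  # every backend is throttled
```

A caller would invoke `mark_throttled` on a 429 response and `pick` before each request; a `None` result signals that the request should be queued or rejected.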
--- a/plugins/ms-ai-architect/skills/ms-ai-security/references/performance-scalability/regional-deployment-latency.md
+++ b/plugins/ms-ai-architect/skills/ms-ai-security/references/performance-scalability/regional-deployment-latency.md
@@ -1,6 +1,6 @@
 # Regional Deployment for Latency Reduction

-**Last updated:** 2026-02
+**Last updated:** 2026-04 | Verified: MCP 2026-04
 **Status:** GA
 **Category:** Performance & Scalability

@@ -306,6 +306,73 @@ class MultiRegionHealthChecker:
 | Gradert informasjon | Nei | Nei | Avhenger av sertifisering |
 | Metadata i EU | Nei | Ja | Ja |

+
+## Azure Front Door: updated (2026-04)
+
+### Edge locations and capabilities
+
+Azure Front Door has **118+ edge locations** globally (confirmed 2026-04). The Premium tier supports:
+- **Private Link to origins**: Front Door Premium can route traffic to Azure OpenAI over Private Link, with no public exposure of the backend
+- **WAF rules**: Built-in Web Application Firewall with OpenAI-specific rules
+
+```bash
+# Front Door Premium with Private Link to Azure OpenAI
+az afd origin create \
+  --resource-group rg-ai-networking \
+  --profile-name fd-ai-gateway \
+  --origin-group-name aoai-backends \
+  --origin-name aoai-norway \
+  --host-name aoai-norway.openai.azure.com \
+  --origin-host-header aoai-norway.openai.azure.com \
+  --priority 1 \
+  --weight 1000 \
+  --enabled-state Enabled \
+  --https-port 443 \
+  --enable-private-link true \
+  --private-link-location norwayeast \
+  --private-link-resource "/subscriptions/{sub}/resourceGroups/{rg}/providers/Microsoft.CognitiveServices/accounts/aoai-norway" \
+  --private-link-sub-resource-type account
+```
+
+## Gateway Multi-Backend: 4 topologies (updated 2026-04)
+
+Microsoft now documents four formal topologies for an Azure OpenAI gateway:
+
+| Topology | Description | Use case |
+|----------|-------------|---------------|
+| Single APIM instance | One APIM in front of one Azure OpenAI | Simple architecture, low complexity |
+| Single region, multiple backends | One region, several Azure OpenAI instances | Load balancing and failover |
+| Single region, multiple subscriptions | Quota expansion via several Azure subscriptions | High TPM quota requirements |
+| Multiple regions | APIM in several regions, globally | Global distribution, data residency |
+
+### Topology 3: Multiple subscriptions for quota expansion
+
+```xml
+<!-- APIM policy: spread load across subscriptions to expand quota -->
+<policies>
+    <inbound>
+        <base />
+        <set-variable name="subscription-backends" value="@{
+            var backends = new JArray(
+                new JObject(
+                    new JProperty("url", "https://aoai-sub1.openai.azure.com"),
+                    new JProperty("subscription", "sub1")),
+                new JObject(
+                    new JProperty("url", "https://aoai-sub2.openai.azure.com"),
+                    new JProperty("subscription", "sub2")),
+                new JObject(
+                    new JProperty("url", "https://aoai-sub3.openai.azure.com"),
+                    new JProperty("subscription", "sub3"))
+            );
+            // Time-sliced rotation: every request in the same minute goes to the same subscription
+            var idx = (int)(DateTimeOffset.UtcNow.ToUnixTimeSeconds() / 60) % 3;
+            return backends[idx]["url"].ToString();
+        }" />
+        <set-backend-service base-url="@(context.Variables.GetValueOrDefault<string>("subscription-backends"))" />
+    </inbound>
+</policies>
+```
+
 ## Norsk offentlig sektor

 - **Primær region**: Norway East for alle workloads med personopplysninger. Sweden Central som failover.
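
Note on the subscription rotation in this hunk: the index expression `(unix_seconds / 60) % 3` rotates per minute rather than per request. The sketch below mirrors that arithmetic in Python to make the behavior visible. `backend_index` is a hypothetical helper, and the 60-second slice matches the policy expression.

```python
def backend_index(unix_seconds: int, backend_count: int, slice_seconds: int = 60) -> int:
    """Return the backend slot for a timestamp, rotating once per time slice."""
    if backend_count <= 0:
        raise ValueError("backend_count must be positive")
    return (unix_seconds // slice_seconds) % backend_count
```

All requests arriving within the same minute map to the same subscription, so a burst still hits a single quota. Per-request distribution would need a counter or a backend pool instead of a clock-based index.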