docs(architect): weekly KB update — 106 files refreshed (2026-04)

Updates across all 5 skills: ms-ai-advisor, ms-ai-engineering,
ms-ai-governance, ms-ai-security, ms-ai-infrastructure.

Key changes:
- Language Services (Custom Text Classification, Text Analytics, QnA):
  retirement warning 2029-03-31, migration guides to Foundry/GPT-4o
- Agentic Retrieval: 50M free reasoning tokens/month (Public Preview)
- Computer Use: Claude Sonnet 4.5 (preview) + OpenAI CUA models
- Agent Registry: Risks column (M365 E7), user-shared/org-published types
- Declarative agents: schema v1.5 → v1.6, Store validation requirements
- MLflow 3: 13 built-in LLM judges, production monitoring, Genie Code
- AG-UI HITL: ApprovalRequiredAIFunction (C#) + @tool(approval_mode) (Python)
- Entra ID Ignite 2025: Agent ID Admin/Developer RBAC roles, Conditional Access
- Security Copilot: 400 SCU/month per 1000 M365 E5 licenses, auto-provisioned
- Fast Transcription API: phrase lists, 14-language multi-lingual transcription
- Azure Monitor Workbooks: Bicep support, RBAC specifics
- Power Platform Copilot: data residency (Norway/Europe → EU DB, Bing → USA)
- RAG security-rbac: 4-approach table (GA + 3 preview access control methods)
- IaC MLOps: Well-Architected OE:05 principles, Bicep/Terraform patterns
- Translator: image file batch translation Preview (JPEG/PNG/BMP/WebP)

All 106 files: Last updated 2026-04 | Verified: MCP 2026-04

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
Kjell Tore Guttormsen 2026-04-10 09:13:24 +02:00
commit ff6a50d14f
104 changed files with 1986 additions and 520 deletions

View file

@ -1,6 +1,6 @@
# Error Handling and Fallback Prompting Strategies
**Last updated:** 2026-02
**Last updated:** 2026-04 | Verified: MCP 2026-04
**Status:** GA
**Category:** Prompt Engineering & LLM Optimization
@ -374,10 +374,20 @@ APIM kan enforces content safety checks automatisk:
</policies>
```
**Policy-attributter (Verified MCP 2026-04):**
- `backend-id`: Azure AI Content Safety backend i APIM
- `shield-prompt`: Sjekk for brukerangrep/adversarial prompts (true/false)
- `enforce-on-completions`: Aktiver content safety på responser i tillegg til requests
- `window-size`: Tegn per vindu for evaluering (maks 10 000 tegn, konfigurerbart for responser)
- `output-type`: FourSeverityLevels (0,2,4,6) eller EightSeverityLevels (0-7)
- Threshold 0 = mest restriktivt, 7 = minst restriktivt. Threshold 4 blokkerer nivå 4-7, tillater 0-3.
- Støtter også `blocklists` for tilpassede ord/uttrykk
**Fordeler:**
- Sentralisert content safety enforcement
- Automatisk blokkering av requester som matcher attack patterns
- Sentralisert content safety enforcement på API-lag
- Automatisk blokkering (HTTP 403) av requester som matcher attack patterns
- Ingen endringer nødvendig i applikasjonskode
- Fungerer for streaming responses (buffer-basert sliding window)
### Azure Monitor + Action Groups for Automated Healing
@ -680,7 +690,7 @@ User Request
2. [Architecture strategies for self-preservation](https://learn.microsoft.com/en-us/azure/well-architected/reliability/self-preservation) Azure Well-Architected Framework reliability-mønstre
3. [Azure OpenAI Priority-Based Load Balancer (GitHub)](https://github.com/Azure-Samples/openai-aca-lb) Referanseimplementasjon av smart load balancing
4. [Troubleshooting Azure OpenAI On Your Data](https://learn.microsoft.com/en-us/azure/ai-foundry/openai/how-to/on-your-data-best-practices) Best practices for debugging og error handling
5. [llm-content-safety policy (APIM)](https://learn.microsoft.com/en-us/azure/api-management/llm-content-safety-policy) Content safety enforcement i API Management
5. [llm-content-safety policy (APIM)](https://learn.microsoft.com/en-us/azure/api-management/llm-content-safety-policy) (Re-verified MCP 2026-04) Content safety enforcement i API Management. Policy-attributter: backend-id, shield-prompt, enforce-on-completions, window-size, output-type, threshold (0-7), blocklists.
**Sekundærkilder:**
6. [Azure OpenAI FAQ](https://learn.microsoft.com/en-us/azure/ai-foundry/openai/faq) Vanlige feilsituasjoner og workarounds
@ -690,7 +700,7 @@ User Request
**Verifisert:** Alle tekniske detaljer er hentet fra offisielle Microsoft-kilder (learn.microsoft.com, GitHub samples). Kodeeksempler er basert på offisielle SDK-dokumentasjon (januar 2026).
**Confidence markers:**
- **Høy confidence:** HTTP error codes, SDK retry defaults, `Retry-After` header, content safety policies
- **Høy confidence:** HTTP error codes, SDK retry defaults, `Retry-After` header, content safety policies (re-verified MCP 2026-04)
- **Medium confidence:** Kostnadsestimater (prisene kan variere), spesifikke PTU-priser for norske kunder
- **Lav confidence:** N/A alle anbefalinger er basert på etablerte mønstre

View file

@ -1,6 +1,6 @@
# Few-Shot and Zero-Shot Learning Techniques
**Last updated:** 2026-02
**Last updated:** 2026-04 | Verified: MCP 2026-04
**Status:** GA
**Category:** Prompt Engineering & LLM Optimization
@ -10,7 +10,7 @@
Few-shot og zero-shot learning er grunnleggende teknikker i prompt engineering som endrer hvordan språkmodeller tilpasser seg nye oppgaver uten permanent modelltrening. Zero-shot learning utfører oppgaver basert kun på instruksjoner, mens few-shot learning bruker eksempler (input-output par) for å "prime" modellen til ønsket oppførsel. Begge teknikkene opererer via in-context learning — modellen endres ikke permanent, men eksemplene påvirker kun gjeldende inference. Disse metodene er sentrale for Azure OpenAI Service, Copilot Studio og Microsoft Agent Framework.
**Verifikasjonsgrad:** Verified (MCP microsoft-learn, januar 2026)
**Verifikasjonsgrad:** Verified (MCP microsoft-learn, januar 2026, re-verified april 2026)
---
@ -40,6 +40,10 @@ messages = [
- Modellen "gjetter" ønsket format
- Mindre pålitelig for domene-spesifikke oppgaver
**To primære bruksområder for zero-shot (Verified .NET AI docs, MCP 2026-04):**
1. **Fine-tunede LLM-er**: Fungerer godt med modeller som allerede er trent på instruksjonsdatasett
2. **Etablere performance baselines**: Simuler reell brukeratferd → evaluer accuracy/precision → eksperimenter deretter med few-shot
### One-Shot Learning
**Definisjon:** Én eksempel-par (input + output) i promptet.
@ -92,6 +96,15 @@ response = client.chat.completions.create(
- Eksemplene "konditionerer" modellen for gjeldende inference
- Demonstrerer edge cases og ønsket tone
**To primære bruksområder for few-shot (Verified .NET AI docs, MCP 2026-04):**
1. **Tuning av LLM**: Legger til kunnskap og kan forbedre performance. Produserer flere tokens enn zero-shot — kan bli kostbart.
2. **Fikse performance-problemer**: Bruk zero-shot for baseline → eksperimenter med few-shot basert på svake punkter → iterer
**Caveats (Verified .NET AI docs):**
- Fungerer dårlig for komplekse resonneringsoppgaver — legg til instruksjoner for å motvirke dette
- Lange few-shot prompts øker latency og kostnad; det er en grense for prompt-lengde
- Med mange eksempler kan modellen lære falske mønstre (f.eks. "sentiment er dobbelt så ofte positivt som negativt")
---
## Arkitekturmønstre
@ -504,9 +517,9 @@ User Query
- https://learn.microsoft.com/en-us/azure/ai-foundry/openai/how-to/chatgpt
- Seksjon: Few-shot learning with chat completion
3. **Zero-shot and few-shot learning** (.NET)
3. **Zero-shot and few-shot learning** (.NET AI conceptual) (Re-verified MCP 2026-04)
- https://learn.microsoft.com/en-us/dotnet/ai/conceptual/zero-shot-learning
- Primære use cases, performance baselines
- Primære use cases, performance baselines, caveats (false patterns, token limits, reasoning gaps)
4. **Chat Markup Language ChatML**
- https://learn.microsoft.com/en-us/azure/ai-foundry/openai/how-to/chat-markup-language

View file

@ -1,6 +1,6 @@
# Multimodal Prompt Design with Images and Text
**Last updated:** 2026-02
**Last updated:** 2026-04 | Verified: MCP 2026-04
**Status:** GA
**Category:** Prompt Engineering & LLM Optimization
@ -187,12 +187,20 @@ messages = [
| Verbalization | Semantisk dybde, LLM-sitérbare beskrivelser | LLM-kall per bilde, høyere latency | Diagrammer, flowcharts, infografikk |
| Direct embeddings | Rask, ingen LLM-kall ved indexing | Ingen forklaring av relasjoner | Visual similarity, produktsøk |
**Azure AI Search multimodal pipeline:**
1. Document extraction (Document Extraction / Layout / Content Understanding skill)
2. Text chunking (Text Split skill)
3. Image verbalization (GenAI Prompt skill + LLM)
4. Embedding (Azure OpenAI / Foundry / Azure Vision)
5. Knowledge store (for image storage og retrieval)
**Azure AI Search multimodal pipeline (Verified MCP 2026-04):**
1. **Content extraction** — velg mellom:
- Document Extraction skill: rask prototyping, PDF-støtte
- Document Layout skill: presise sidetall, bounding boxes, RAG-optimalisert
- Azure Content Understanding skill: avansert — cross-page tabeller, semantisk chunking, DOCX/XLSX/PPTX
2. **Text chunking:** Text Split skill
3. **Image verbalization:** GenAI Prompt skill + LLM (phi-4, gpt-4o, gpt-5) → naturlig-språklig beskrivelse
4. **Embedding:** Azure OpenAI / Microsoft Foundry / Azure Vision multimodal embeddings
5. **Knowledge store:** Lagrer bilder for retrieval; image-lokasjon lagres i indeks for sitert visning
**To retrieval-stier:**
- Verbalized content → hybrid queries (text + vector). Gir semantisk dybde og LLM-siterbare beskrivelser.
- Direct multimodal embeddings (Azure Vision) → image-to-vector queries. Effektiv visual similarity uten LLM-kall ved indexing.
- Mange løsninger kombinerer begge: forklaringsrike visuals verbaliseres, foto/produktbilder embeddes direkte.
## Beslutningsveiledning
@ -445,8 +453,12 @@ Multimodal scenario?
├─ Volum > 10k bilder/dag
│ └─ Azure AI Search multimodal pipeline + Azure Vision embeddings
└─ Trengs søk over historiske bilder?
└─ Azure AI Search multimodal RAG (verbalization eller direct embeddings)
├─ Trengs søk over historiske bilder?
│ └─ Azure AI Search multimodal RAG (verbalization eller direct embeddings)
└─ RAG over PDF/Office-dokumenter med embedded diagrammer?
├─ Forklaringsrike visuals: Document Layout skill + GenAI Prompt verbalization
└─ Visual similarity: Azure Content Understanding + Azure Vision embeddings
```
### Red Flags
@ -532,7 +544,7 @@ AzureDiagnostics
**Microsoft Learn dokumentasjon (verifisert 2026-02):**
- [Use vision-enabled chat models](https://learn.microsoft.com/en-us/azure/ai-foundry/openai/how-to/gpt-with-vision) — Offisiell how-to guide for GPT-4o/GPT-4 Turbo with Vision
- [Image prompt engineering techniques](https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/gpt-4-v-prompt-engineering) — Best practices for multimodal prompting
- [Multimodal search in Azure AI Search](https://learn.microsoft.com/en-us/azure/search/multimodal-search-overview) — RAG-arkitektur med image verbalization og direct embeddings
- [Multimodal search in Azure AI Search](https://learn.microsoft.com/en-us/azure/search/multimodal-search-overview) (Re-verified MCP 2026-04) — RAG-arkitektur; extraction skill-sammenligning (Document Extraction vs Layout vs Content Understanding); verbalization vs direct embeddings; hybrid query-alternativ
- [Azure OpenAI models](https://learn.microsoft.com/en-us/azure/ai-foundry/openai/concepts/models) — Modelloversikt og token-kostnader
- [Quickstart: Multimodal search in Azure portal](https://learn.microsoft.com/en-us/azure/search/search-get-started-portal-image-search) — Wizard-basert oppsett
- [Get started with multimodal vision chat apps](https://learn.microsoft.com/en-us/azure/developer/ai/get-started-app-chat-vision) — End-to-end sample app med Base64 encoding
@ -547,5 +559,5 @@ AzureDiagnostics
- ⚠️ **Medium confidence:** Kostberegninger i NOK (basert på jan 2026 pricing, kan variere)
- ⚠️ **Medium confidence:** Offentlig sektor use cases (inferert fra generelle patterns, ikke Microsoft-spesifikt)
**Sist verifisert:** 2026-02-04
**Neste review:** 2026-04 (eller ved nye GPT-modeller)
**Sist verifisert:** 2026-04-10
**Neste review:** 2026-07 (eller ved nye GPT-modeller/AI Search features)

View file

@ -1,6 +1,6 @@
# Real-Time Reasoning and Performance Optimization
**Last updated:** 2026-02
**Last updated:** 2026-04 | Verified: MCP 2026-04
**Status:** GA (Realtime API: Public Preview)
**Category:** Prompt Engineering & LLM Optimization
@ -307,6 +307,25 @@ Deployment C: Chatbot (variabel prompt, medium output)
- **Non-streaming:** End-to-end Request Time
- **Streaming:** Time to Response (TTFT), Average Token Generation Rate
### Azure Speech Service (TTS Latency)
**Teknikker for å redusere speech synthesis latency (Verified MCP 2026-04):**
| Teknikk | Effekt |
|---------|--------|
| **Streaming (AudioDataStream)** | Start avspilling ved første audio-chunk; ikke vent på komplett audio |
| **Pre-connect** | Åpne WebSocket-forbindelsen proaktivt mens bruker snakker; kall `SpeakTextAsync` når svar er klart |
| **Gjenbruk SpeechSynthesizer** | Unngå ny TCP/SSL/HTTP-handshake per request; bruk object pool |
| **Komprimert lyd** | MP3 (48kbps) vs PCM (384kbps) — 87% lavere nettverkspayload for mobil/ustabile nettverk |
| **Text streaming (WebSocket v2)** | Send GPT-output til TTS chunk for chunk via `wss://{region}.tts.speech.microsoft.com/cognitiveservices/websocket/v2`. Ideelt for real-time AI-dialoger. |
**Latency-metrikker fra Speech SDK:**
- `first byte client latency` — fra syntese starter til første audio-chunk mottas (inkl. nettverks-RTT)
- `finish client latency` — fra syntese starter til all lyd er mottatt
- `first byte service latency` — behandlingstid på Azure TTS-siden
**Anbefaling:** For sanntids AI-dialoger (GPT + TTS), kombiner Realtime API (audio in/out) med Speech SDK text streaming for hybrid norsk/engelsk-løsninger.
### Copilot Studio
**Relevans:** Copilot Studio kan integrere Azure OpenAI custom models via Power Platform connectors.
@ -477,9 +496,9 @@ Deployment C: Chatbot (variabel prompt, medium output)
[https://learn.microsoft.com/en-us/azure/ai-foundry/openai/realtime-audio-quickstart](https://learn.microsoft.com/en-us/azure/ai-foundry/openai/realtime-audio-quickstart)
Hentet: januar 2026. Kode-eksempler for Python, JavaScript, deployment steps.
4. **Lower speech synthesis latency using Speech SDK**
4. **Lower speech synthesis latency using Speech SDK** (Re-verified MCP 2026-04)
[https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-lower-speech-synthesis-latency](https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-lower-speech-synthesis-latency)
Hentet: januar 2026. Dekker text streaming for TTS (komplementær til Realtime API).
Hentet: januar 2026, re-verified april 2026. Dekker: first byte latency vs finish latency, streaming via AudioDataStream, pre-connect og SpeechSynthesizer-gjenbruk (object pool), komprimert lyd (MP3 48kbps vs PCM 384kbps), text streaming via WebSocket v2 (wss endpoint) for real-time GPT-output vocalization.
**Verification steps:**
@ -487,6 +506,7 @@ Deployment C: Chatbot (variabel prompt, medium output)
2. ✅ **Realtime API models:** Bekreftet at `gpt-4o-mini-realtime-preview` og `gpt-4o-realtime-preview` er tilgjengelige i East US 2 / Sweden Central.
3. ✅ **VAD modes:** Bekreftet at `server_vad`, `semantic_vad`, og `none` er supported turn detection types.
4. ✅ **Latency metrics:** Bekreftet at Time to Response (TTFT) og Average Token Generation Rate er recommended metrics for streaming.
5. ✅ **Speech latency:** first byte client latency og AudioDataStream-streaming bekreftet. Text streaming via WebSocket v2 bekreftet for C#, Python.
5. ⚠️ **Pricing:** Audio token pricing ikke eksplisitt i dokumentasjon per januar 2026. Brukt representative estimates basert på historisk OpenAI pricing structure.
**Confidence level:** Høy (✅) for tekniske detaljer, Middels (⚠️) for pricing og production-readiness av Realtime API (public preview).

View file

@ -1,6 +1,6 @@
# Role-Playing and Persona-Based Prompting
**Last updated:** 2026-02
**Last updated:** 2026-04 | Verified: MCP 2026-04
**Status:** GA
**Category:** Prompt Engineering & LLM Optimization
@ -359,17 +359,22 @@ You are a friendly technical support specialist for [Product].
5. ✅ Give **a way out** "If unable, respond with 'not found'"
6. ✅ Test and refine Iterer basert på faktisk bruk
**Prompt Node for Dynamic Personas:**
**Prompt Node for Dynamic Personas (nlu-prompt-node, Verified 2026-04):**
Bruk prompt nodes i topics for å endre persona mid-flow:
Bruk prompt nodes i topics for å endre persona mid-flow. Legges til via "Add a tool" → "New prompt" i topic:
```yaml
Node Type: Prompt
Persona Override:
"For this specific question, act as a billing specialist.
Provide detailed information about payment terms and invoice procedures."
Node Type: Prompt (Add a tool > New prompt)
Best practices:
- Be specific: Klare instruksjoner gir forutsigbare svar
- Use examples: Illustrer forventet oppførsel
- Keep it brief: Lange instruksjoner → latency og timeouts
- Give a way out: "respond with not found if answer isn't present"
- Temperature: Kontroller kreativitet/determinisme per prompt
```
Prompts kan også legges til på agent-nivå (Tools tab) eller som node i agent flows (AI capabilities → Run a prompt).
### Microsoft 365 Copilot (Enterprise)
**Grounding prompts:**
@ -670,8 +675,8 @@ If uncertain, explain limitations.
3. [Prompt engineering techniques - Azure OpenAI](https://learn.microsoft.com/en-us/azure/ai-foundry/openai/concepts/prompt-engineering)
*Bredere prompt-veiledning inkludert few-shot og token efficiency*
4. [Use prompts in Copilot Studio](https://learn.microsoft.com/en-us/microsoft-copilot-studio/nlu-prompt-node)
*Best practices for Copilot Studio prompt instructions*
4. [Use prompts in Copilot Studio](https://learn.microsoft.com/en-us/microsoft-copilot-studio/nlu-prompt-node) (Re-verified MCP 2026-04)
*Prompt editor features: natural language creation, template library, model selection (Azure OpenAI/Foundry), temperature, knowledge retrieval, code interpreter. Prompt-nivå: agent-tool, topic-node, agent flow-node.*
5. [Azure OpenAI On Your Data - Best practices](https://learn.microsoft.com/en-us/azure/ai-foundry/openai/concepts/use-your-data)
*System message bruk i RAG-scenarier*
@ -689,5 +694,5 @@ If uncertain, explain limitations.
- ✅ **Documented best practices:** Authoring techniques tabeller
- ⚠️ **Implementation-dependent:** Nøyaktig token cost varierer med model version
**Siste oppdatering:** 2026-02-04
**Neste review:** 2026-05 (når nye prompt engineering features annonseres på Build 2026)
**Siste oppdatering:** 2026-04-10
**Neste review:** 2026-07