Logging & Analytics for AI Traffic in APIM
Last updated: 2026-02 | Status: GA | Category: API Management & AI Gateway
Introduction
Observability is fundamental to operating AI applications in production. Azure API Management offers extensive logging and analytics capabilities tailored to AI traffic, including token tracking, prompt/completion logging and built-in dashboards for LLM usage. These tools let organizations track costs, monitor performance, ensure compliance and troubleshoot problems with AI APIs.
For the Norwegian public sector, logging and analytics matter for several reasons: Riksrevisjonen (the Office of the Auditor General) and Datatilsynet (the Norwegian Data Protection Authority) require traceability, offentlighetsloven (the Freedom of Information Act) requires documentation of automated decisions, and budget control requires precise cost reports for AI consumption. The APIM AI gateway provides the tools needed to meet these requirements without building custom solutions.
APIM offers two main channels for AI logging: Application Insights integration for real-time metrics, and Azure Monitor diagnostic settings for long-term storage and analysis in Log Analytics. Both channels support AI-specific data points such as token consumption, model name and, optionally, prompt/completion content.
Application Insights integration
Setting up the Application Insights logger
- Create or connect to an Application Insights resource
- Configure the logger in APIM
- Enable diagnostics for specific APIs or for all APIs
Configuring the logger with Bicep
// `appInsightsName`, `apiManagement` (the APIM service) and `aiApi` (the AI API) are assumed to be declared elsewhere in the template.
resource appInsights 'Microsoft.Insights/components@2020-02-02' existing = {
  name: appInsightsName
}
resource apimLogger 'Microsoft.ApiManagement/service/loggers@2023-09-01-preview' = {
  parent: apiManagement
  name: 'ai-gateway-logger'
  properties: {
    loggerType: 'applicationInsights'
    credentials: {
      connectionString: appInsights.properties.ConnectionString
    }
    resourceId: appInsights.id
  }
}
resource apiDiagnostic 'Microsoft.ApiManagement/service/apis/diagnostics@2023-09-01-preview' = {
  parent: aiApi
  name: 'applicationinsights'
  properties: {
    loggerId: apimLogger.id
    alwaysLog: 'allErrors'
    logClientIp: true
    sampling: {
      samplingType: 'fixed'
      percentage: 100
    }
    frontend: {
      request: {
        headers: [ 'x-request-id', 'x-correlation-id', 'x-tenant-id' ]
        body: { bytes: 8192 }
      }
      response: {
        headers: [ 'x-model-used', 'x-cache-hit' ]
        body: { bytes: 8192 }
      }
    }
    backend: {
      request: {
        headers: [] // 'Authorization' is left out so auth tokens are not logged
        body: { bytes: 0 } // Do not log the backend request body
      }
      response: {
        body: { bytes: 8192 }
      }
    }
  }
}
Custom metrics with token tracking
Emit Token Metrics Policy
APIM provides dedicated policies for sending token metrics to Application Insights:
<policies>
  <inbound>
    <base />
    <!-- Emit token metrics for Azure OpenAI APIs (the policy is configured in the inbound section) -->
    <azure-openai-emit-token-metric namespace="ai-gateway-metrics">
      <dimension name="Subscription ID" value="@(context.Subscription.Id)" />
      <dimension name="API ID" value="@(context.Api.Id)" />
      <dimension name="Client IP" value="@(context.Request.IpAddress)" />
    </azure-openai-emit-token-metric>
  </inbound>
</policies>
For other LLM APIs (not Azure OpenAI):
<policies>
  <inbound>
    <base />
    <!-- Emit token metrics for generic LLM APIs (the policy is configured in the inbound section) -->
    <llm-emit-token-metric namespace="llm-metrics">
      <dimension name="Client IP" value="@(context.Request.IpAddress)" />
      <dimension name="API ID" value="@(context.Api.Id)" />
      <dimension name="User ID"
                 value="@(context.Request.Headers.GetValueOrDefault("x-user-id", "N/A"))" />
      <dimension name="Department"
                 value="@(context.Request.Headers.GetValueOrDefault("x-department", "unknown"))" />
      <dimension name="Application"
                 value="@(context.Request.Headers.GetValueOrDefault("x-app-id", "unknown"))" />
    </llm-emit-token-metric>
  </inbound>
</policies>
Custom metrics with emit-metric
For general metrics beyond token tracking:
<policies>
  <inbound>
    <base />
    <!-- Record the start time that the latency metric below reads -->
    <set-variable name="backendStartTime" value="@(DateTime.UtcNow)" />
  </inbound>
  <outbound>
    <base />
    <!-- Emit custom request metrics -->
    <emit-metric name="ai-request-processed" value="1" namespace="ai-gateway">
      <dimension name="Model" value="@{
        var body = context.Response.Body.As<JObject>(preserveContent: true);
        return body?["model"]?.ToString() ?? "unknown";
      }" />
      <dimension name="StatusCode" value="@(context.Response.StatusCode.ToString())" />
      <dimension name="CacheHit" value="@(context.Response.Headers.GetValueOrDefault("x-cache-hit", "false"))" />
      <dimension name="Subscription" value="@(context.Subscription?.Name ?? "unknown")" />
    </emit-metric>
    <!-- Emit latency metric (time elapsed since the inbound section ran) -->
    <emit-metric name="ai-backend-latency-ms" namespace="ai-gateway"
                 value="@{
                   var start = (DateTime)context.Variables["backendStartTime"];
                   return ((DateTime.UtcNow - start).TotalMilliseconds).ToString();
                 }">
      <dimension name="Model" value="@{
        var body = context.Response.Body.As<JObject>(preserveContent: true);
        return body?["model"]?.ToString() ?? "unknown";
      }" />
    </emit-metric>
  </outbound>
</policies>
Limits for custom metrics
| Limit | Value |
|---|---|
| Max dimensions per metric | 10 (5 default + 5 custom) |
| Active time series per region | 50,000 (within a 12-hour period) |
| Default dimensions (use 5) | Region, Service ID, Service Name, Service Type, + 1 reserved |
| Available for custom | 5 dimensions |
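To see how quickly dimension choices consume the time-series budget, the worst-case number of series for one metric can be estimated as the product of the distinct values per dimension. The following is a minimal Python sketch of that arithmetic. The dimension names and cardinalities are hypothetical; only the 50,000 limit and the five custom dimensions come from the table.

```python
from math import prod

MAX_ACTIVE_SERIES = 50_000   # limit from the table
MAX_CUSTOM_DIMENSIONS = 5    # custom dimensions available per metric


def estimate_series(cardinalities: dict[str, int]) -> int:
    """Worst-case number of time series: product of distinct values per dimension."""
    if len(cardinalities) > MAX_CUSTOM_DIMENSIONS:
        raise ValueError("more than 5 custom dimensions")
    return prod(cardinalities.values())


# Hypothetical cardinalities for the dimensions of one metric.
dims = {"Model": 4, "StatusCode": 6, "CacheHit": 2, "Subscription": 300}
series = estimate_series(dims)
print(series, series <= MAX_ACTIVE_SERIES)
```

High-cardinality dimensions such as a per-user ID dominate the product, so they are usually the first candidates to move from metrics into logs.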
Token Tracking
Diagnostics Setting for LLM Logs
Enable specialized LLM logging through Azure Monitor diagnostic settings:
- Go to the APIM instance in the Azure portal
- Monitoring > Diagnostic settings > + Add diagnostic setting
- Select Logs related to generative AI gateway
- Under Destination: Send to Log Analytics workspace
Enabling prompt/completion logging per API
- Select the API > Settings > Diagnostic Logs > Azure Monitor
- Log LLM messages: Enabled
- Log prompts: select and set the maximum size (e.g. 32768 bytes)
- Log completions: select and set the maximum size (e.g. 32768 bytes)
Important: Messages up to 32 KB are logged in a single entry. Larger messages are split into 32 KB chunks with sequence numbers. The maximum is 2 MB per request/response.
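The splitting rule means that a consumer of exported logs has to reassemble large messages itself. Below is a minimal Python sketch of that reassembly, assuming each exported row carries a correlation ID, a sequence number and a text fragment. The field names `correlation_id`, `sequence_number` and `content` are hypothetical placeholders, not the table's column names.

```python
from collections import defaultdict


def reassemble(rows: list[dict]) -> dict[str, str]:
    """Join message fragments per correlation ID in sequence-number order."""
    grouped: dict[str, list[tuple[int, str]]] = defaultdict(list)
    for row in rows:
        grouped[row["correlation_id"]].append((row["sequence_number"], row["content"]))
    return {
        cid: "".join(text for _, text in sorted(parts, key=lambda p: p[0]))
        for cid, parts in grouped.items()
    }


# Hypothetical rows, deliberately out of order.
rows = [
    {"correlation_id": "a1", "sequence_number": 1, "content": "world"},
    {"correlation_id": "a1", "sequence_number": 0, "content": "hello "},
    {"correlation_id": "b2", "sequence_number": 0, "content": "single entry"},
]
messages = reassemble(rows)
print(messages["a1"])
```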
KQL query: join request and response
ApiManagementGatewayLlmLog
| extend RequestArray = parse_json(RequestMessages)
| extend ResponseArray = parse_json(ResponseMessages)
| mv-expand RequestArray
| mv-expand ResponseArray
| project
    TimeGenerated,
    CorrelationId,
    OperationName,
    ModelDeploymentName,
    PromptTokens,
    CompletionTokens,
    TotalTokens,
    RequestContent = tostring(RequestArray.content),
    ResponseContent = tostring(ResponseArray.content)
// Group by CorrelationId rather than TimeGenerated so that entries logged at different times (for example 32 KB chunks) are joined
| summarize
    TimeGenerated = min(TimeGenerated),
    Input = strcat_array(make_list(RequestContent), " . "),
    Output = strcat_array(make_list(ResponseContent), " . "),
    PromptTokens = max(PromptTokens),
    CompletionTokens = max(CompletionTokens),
    TotalTokens = max(TotalTokens)
    by CorrelationId, OperationName, ModelDeploymentName
| where isnotempty(Input) and isnotempty(Output)
KQL: Token consumption per application per day
ApiManagementGatewayLlmLog
| where TimeGenerated > ago(30d)
| summarize
    TotalPromptTokens = sum(PromptTokens),
    TotalCompletionTokens = sum(CompletionTokens),
    TotalTokens = sum(TotalTokens),
    RequestCount = count()
    by bin(TimeGenerated, 1d), SubscriptionName = tostring(split(OperationName, "/")[0])
| order by TimeGenerated desc
KQL: Model usage and cost
ApiManagementGatewayLlmLog
| where TimeGenerated > ago(7d)
| summarize
    PromptTokens = sum(PromptTokens),
    CompletionTokens = sum(CompletionTokens),
    Requests = count()
    by ModelDeploymentName
// Prices are USD per 1M tokens. Test "gpt-4o-mini" first, because "gpt-4o" and "gpt-4" also match it.
| extend EstimatedCostUSD =
    case(
        ModelDeploymentName contains "gpt-4o-mini",
            (PromptTokens / 1000000.0 * 0.15) + (CompletionTokens / 1000000.0 * 0.60),
        ModelDeploymentName contains "gpt-4o",
            (PromptTokens / 1000000.0 * 2.5) + (CompletionTokens / 1000000.0 * 10.0),
        ModelDeploymentName contains "gpt-4",
            (PromptTokens / 1000000.0 * 30.0) + (CompletionTokens / 1000000.0 * 60.0),
        0.0
    )
// Assumed exchange rate: 11 NOK per USD
| extend EstimatedCostNOK = EstimatedCostUSD * 11.0
| order by EstimatedCostNOK desc
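The same estimate can be reproduced outside Log Analytics, for example to sanity-check a report. The Python sketch below reuses the prices and the exchange rate from the query, which are illustrative values rather than current list prices. Substring matching is order-sensitive: a deployment named after gpt-4o-mini also contains "gpt-4o" and "gpt-4", so the most specific pattern has to be tested first. The deployment name in the usage line is hypothetical.

```python
# Prices in USD per 1M tokens (prompt, completion) and the NOK rate are the
# illustrative values from the query, not current list prices.
PRICES = [
    ("gpt-4o-mini", 0.15, 0.60),  # most specific pattern first
    ("gpt-4o", 2.5, 10.0),
    ("gpt-4", 30.0, 60.0),
]
USD_TO_NOK = 11.0


def estimate_cost_usd(deployment: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Return the estimated cost for the first matching price pattern, else 0.0."""
    name = deployment.lower()
    for pattern, prompt_price, completion_price in PRICES:
        if pattern in name:
            return (prompt_tokens / 1_000_000 * prompt_price
                    + completion_tokens / 1_000_000 * completion_price)
    return 0.0


# Hypothetical deployment name and token counts.
usd = estimate_cost_usd("prod-gpt-4o-mini", 2_000_000, 1_000_000)
print(round(usd, 2), round(usd * USD_TO_NOK, 2))
```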
Latency monitoring
Measuring end-to-end latency
<policies>
  <inbound>
    <base />
    <set-variable name="requestStartTime" value="@(DateTime.UtcNow)" />
  </inbound>
  <backend>
    <base />
  </backend>
  <outbound>
    <base />
    <!-- Calculate and expose latency -->
    <set-header name="x-total-latency-ms" exists-action="override">
      <value>@{
        var start = (DateTime)context.Variables["requestStartTime"];
        return ((DateTime.UtcNow - start).TotalMilliseconds).ToString("F0");
      }</value>
    </set-header>
    <!-- Emit latency as custom metric -->
    <emit-metric name="ai-total-latency" namespace="ai-gateway"
                 value="@{
                   var start = (DateTime)context.Variables["requestStartTime"];
                   return ((DateTime.UtcNow - start).TotalMilliseconds).ToString();
                 }">
      <dimension name="API" value="@(context.Api.Name)" />
      <dimension name="StatusCode" value="@(context.Response.StatusCode.ToString())" />
    </emit-metric>
  </outbound>
</policies>
Latency threshold alert
// Alert: AI API latency exceeds 5 seconds
ApiManagementGatewayLogs
| where TimeGenerated > ago(15m)
| where ApiId contains "ai-gateway"
// TotalTime is the total gateway request time in milliseconds
| where TotalTime > 5000
| summarize
    Count = count(),
    AvgLatency = avg(TotalTime),
    P95Latency = percentile(TotalTime, 95)
    by bin(TimeGenerated, 5m), ApiId
| where Count > 10
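The alert condition can be checked offline against sampled latencies before wiring it into an alert rule. The Python sketch below mirrors the query's logic: keep requests slower than 5000 ms and fire when more than 10 of them fall in a window. The sample window is hypothetical, and the nearest-rank percentile is a stand-in that may differ slightly from the approximate percentile KQL computes.

```python
def p95(values: list[float]) -> float:
    """Nearest-rank 95th percentile of a non-empty list."""
    if not values:
        raise ValueError("p95 requires at least one value")
    ordered = sorted(values)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


def should_alert(latencies_ms: list[float], threshold_ms: float = 5000,
                 min_count: int = 10) -> bool:
    """Fire when more than min_count requests exceed the threshold."""
    slow = [v for v in latencies_ms if v > threshold_ms]
    return len(slow) > min_count


# Hypothetical 5-minute window: 50 fast requests and 12 slow ones.
window = [1200.0] * 50 + [6000.0 + 100 * i for i in range(12)]
slow_only = [v for v in window if v > 5000]
print(should_alert(window), p95(slow_only))
```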
User behavior analysis
Analytics dashboard in APIM
APIM provides a built-in Azure Monitor-based dashboard under Monitoring > Analytics > Language models with:
- Token consumption over time
- Distribution per model
- Request volume and error rate
- Average response time
KQL: Top users by token consumption
ApiManagementGatewayLlmLog
| where TimeGenerated > ago(7d)
| summarize
    TotalTokens = sum(TotalTokens),
    Requests = count(),
    AvgTokensPerRequest = avg(TotalTokens)
    by SubscriptionId
| order by TotalTokens desc
| take 20
KQL: Popular topics (based on prompts)
ApiManagementGatewayLlmLog
| where TimeGenerated > ago(7d)
| extend RequestArray = parse_json(RequestMessages)
| mv-expand RequestArray
| where tostring(RequestArray.role) == "user"
| extend UserMessage = tostring(RequestArray.content)
| where strlen(UserMessage) > 10
| extend Topic = case(
    UserMessage contains "azure" or UserMessage contains "cloud", "Azure/Cloud",
    UserMessage contains "kode" or UserMessage contains "code", "Programmering",
    UserMessage contains "sikkerhet" or UserMessage contains "security", "Sikkerhet",
    UserMessage contains "data" or UserMessage contains "database", "Data",
    "Annet"
  )
| summarize Count = count() by Topic
| order by Count desc
Export to Microsoft Foundry for model evaluation
LLM logs can be exported as datasets for model evaluation in Microsoft Foundry:
- Join request and response with KQL (see above)
- Export to CSV format
- Upload in the Microsoft Foundry portal
- Run the evaluation with built-in or custom metrics
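The CSV export step can be sketched in Python as follows, assuming the joined rows have already been fetched as dictionaries with the `Input` and `Output` fields produced by the join query. The output column names `query` and `response` are assumptions chosen for illustration; check which column names the evaluation you run expects.

```python
import csv
import io


def to_eval_csv(pairs: list[dict]) -> str:
    """Serialize joined prompt/completion pairs to CSV text."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=["query", "response"])
    writer.writeheader()
    for pair in pairs:
        writer.writerow({"query": pair["Input"], "response": pair["Output"]})
    return buffer.getvalue()


# Hypothetical joined rows; commas and quotes are escaped by the csv module.
pairs = [
    {"Input": "What is APIM?", "Output": "An API gateway, among other things."},
    {"Input": "Is it GA?", "Output": 'Yes, "GA".'},
]
csv_text = to_eval_csv(pairs)
print(csv_text)
```

Using the `csv` module rather than manual string joining keeps prompts that contain commas, quotes or line breaks intact.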
Privacy and compliance
Logging policies for the Norwegian public sector
| Requirement | Measure in APIM |
|---|---|
| GDPR Art. 5 (data minimization) | Log only the necessary fields, anonymize PII |
| Offentlighetsloven (Freedom of Information Act) | Ensure traceability for automated decisions |
| Datatilsynet's guidelines | Do not log personal data in prompts without a legal basis for processing |
| Arkivloven (Archives Act) | Long-term storage in Log Analytics with a retention policy |
PII filtering in logging
<policies>
  <outbound>
    <base />
    <!-- Sanitize prompts before logging -->
    <set-variable name="sanitizedRequest" value="@{
      var body = context.Request.Body.As<string>(preserveContent: true);
      // Remove Norwegian national ID (11 digits)
      body = System.Text.RegularExpressions.Regex.Replace(
        body, @"\b\d{11}\b", "[FODSELSNUMMER]");
      // Remove email addresses
      body = System.Text.RegularExpressions.Regex.Replace(
        body, @"\b[\w.-]+@[\w.-]+\.\w+\b", "[EMAIL]");
      return body;
    }" />
    <trace source="ai-gateway" severity="information">
      <message>@((string)context.Variables["sanitizedRequest"])</message>
    </trace>
  </outbound>
</policies>
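The two patterns can be tried outside the gateway before they are deployed. The Python sketch below applies the same regular expressions and placeholders as the policy; the sample text is invented, and `sanitize` is a hypothetical helper name.

```python
import re

# Same patterns as in the policy expression.
NATIONAL_ID = re.compile(r"\b\d{11}\b")
EMAIL = re.compile(r"\b[\w.-]+@[\w.-]+\.\w+\b")


def sanitize(text: str) -> str:
    """Replace 11-digit numbers and email addresses with placeholders."""
    text = NATIONAL_ID.sub("[FODSELSNUMMER]", text)
    return EMAIL.sub("[EMAIL]", text)


sample = "Citizen 01019912345 wrote from ola.nordmann@example.no about case 123456."
print(sanitize(sample))
# → Citizen [FODSELSNUMMER] wrote from [EMAIL] about case 123456.
```

The ID pattern matches on length only, so any standalone 11-digit number (an order number, for instance) is redacted as well, while IDs written with spaces or separators are missed. Treat it as a first filter rather than complete PII detection.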
References
- Log token usage, prompts, and completions for LLM APIs: main guide for LLM logging
- AI gateway capabilities - Observability: overview of observability
- How to integrate Azure API Management with Application Insights: Application Insights integration
- llm-emit-token-metric policy: token metric policy
- emit-metric policy: general metric policy
- Monitor API Management: overall monitoring
- ApiManagementGatewayLlmLog table: Log Analytics table reference
- Monitor AI agents with Application Insights: AI agent monitoring
For Cosmo
- Use this reference when the customer needs to set up logging, dashboards or cost reporting for their AI APIs, or when they must meet compliance requirements for traceability of AI usage.
- Always recommend enabling both Application Insights (real-time metrics) and diagnostic settings (Log Analytics for long-term analysis); they complement each other.
- For cost monitoring, use llm-emit-token-metric with dimensions for application, department and subscription. This gives granular cost allocation without manual calculation.
- Be mindful of privacy: prompt logging can capture sensitive information. Recommend PII filtering in policies for the Norwegian public sector, and make sure the retention period in Log Analytics matches the organization's guidelines.
- The KQL queries in this reference can be used directly in Azure Monitor Workbooks to build custom dashboards for management and specialist departments.