| name | description |
|---|---|
| ms-ai-security | Security assessment, cost estimation, OWASP LLM Top 10 mitigations, performance optimization for AI on Microsoft stack. Deterministic 6x5 security scoring, P10/P50/P90 cost confidence intervals, FinOps practices. Triggers on: "security assessment for AI", "AI threat modeling", "cost estimation for Azure AI", "FinOps for AI workloads", "OWASP LLM", "kostnadsestimat for AI-løsning". |
INSTRUCTION: This skill covers quantitative assessment activities with deterministic scoring models. Apply the framework systematically: do not skip dimensions or assume scores. Every assessment must produce concrete, verifiable results with numeric values.
Security and cost assessment for Microsoft AI
Structured methods for three assessment activities:
- Security assessment: Deterministic 6x5 security scoring with OWASP LLM Top 10 mapping
- Cost estimation: TCO calculation with P10/P50/P90 confidence intervals and FinOps practices
- Performance review: Latency optimization, scaling, and benchmarking
Primary agents: security-assessment-agent, cost-estimation-agent
1. Security framework
6-dimension security model
To assess security, score each of the six dimensions independently on a 1-5 scale:
| Dimension | Covers |
|---|---|
| Identity & Access Control | Entra ID, Managed Identities, RBAC, API key rotation, JIT access |
| Network Security | Private Endpoints, VNet, NSG, Azure Firewall, DNS, outbound traffic |
| Data Protection | Encryption (at rest/in transit), Key Vault, data residency, PII masking, backup |
| Content Safety & AI Security | Content Safety filters, prompt injection defense, jailbreak, output validation, STRIDE-AI |
| Compliance & Governance | AI Act classification, GDPR/Schrems II, Purview, Digdir/NSM, DPIA |
| Monitoring & Incident Response | Azure Monitor, token usage, anomaly detection, audit logging, alerting |
Scoring model (1-5)
| Score | Level | Criterion |
|---|---|---|
| 1 | Critical | No controls. Immediate risk. |
| 2 | Insufficient | Basic controls with significant gaps. PoC/sandbox only. |
| 3 | Acceptable | Core controls in place. Minimum for low-risk production. |
| 4 | Good | Robust, automated controls with monitoring. OK for sensitive data. |
| 5 | Excellent | State of the art. Zero Trust. Defense in depth. OK for high-risk AI Act systems. |
Weighted scoring
Apply weights based on workload type, then calculate: Overall score = Sum(dimension_score x weight)
| Dimension | Standard | Externally exposed | Personal-data-intensive |
|---|---|---|---|
| Identity & Access Control | 20% | 25% | 20% |
| Network Security | 15% | 20% | 15% |
| Data Protection | 20% | 15% | 25% |
| Content Safety & AI Security | 20% | 25% | 15% |
| Compliance & Governance | 15% | 10% | 20% |
| Monitoring & Incident Response | 10% | 5% | 5% |
Risk classification
| Overall score | Classification | Recommendation |
|---|---|---|
| 1.0 - 2.0 | Critical risk | Stop rollout. Immediate remediation. |
| 2.1 - 3.0 | High risk | Restricted access. Remediation plan within 30 days. |
| 3.1 - 3.5 | Moderate risk | Production with restrictions. Remediation plan within 90 days. |
| 3.6 - 4.5 | Low risk | Approved for production. Continuous improvement. |
| 4.6 - 5.0 | Minimal risk | Approved for production. Benchmark for other solutions. |
For complete rubrics with examples per dimension and score, see references/ai-security-engineering/security-scoring-rubrics-6x5.md and references/ai-security-engineering/ai-security-scoring-framework.md.
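The weighted score and its risk band can be sketched as follows. This is a minimal illustration, not the rubric implementation from the reference files: the dimension keys, profile names, and function names are hypothetical, and rounding the result half-up to one decimal is an assumption made so that every score lands inside one of the bands in the classification table.

```python
# Weight profiles in percent, copied from the weighting table.
# Dimension keys and profile names are hypothetical identifiers.
WEIGHTS = {
    "standard": {
        "identity_access": 20, "network": 15, "data_protection": 20,
        "content_safety": 20, "compliance": 15, "monitoring": 10,
    },
    "externally_exposed": {
        "identity_access": 25, "network": 20, "data_protection": 15,
        "content_safety": 25, "compliance": 10, "monitoring": 5,
    },
    "personal_data_intensive": {
        "identity_access": 20, "network": 15, "data_protection": 25,
        "content_safety": 15, "compliance": 20, "monitoring": 5,
    },
}

# Upper bound of each band from the risk classification table.
RISK_BANDS = [
    (2.0, "Critical risk"),
    (3.0, "High risk"),
    (3.5, "Moderate risk"),
    (4.5, "Low risk"),
    (5.0, "Minimal risk"),
]


def weighted_score(scores, profile="standard"):
    """Return Sum(dimension_score x weight), rounded half-up to one decimal."""
    weights = WEIGHTS[profile]
    if set(scores) != set(weights):
        raise ValueError("score all six dimensions, no more and no fewer")
    for name, value in scores.items():
        if value not in (1, 2, 3, 4, 5):
            raise ValueError(f"{name}: score must be an integer from 1 to 5")
    # Integer arithmetic keeps the result deterministic: hundredths = score x 100.
    hundredths = sum(scores[name] * weights[name] for name in weights)
    # Assumption: round half-up to one decimal to match the table bands.
    return ((hundredths + 5) // 10) / 10


def classify(score):
    """Map a rounded overall score to its risk classification."""
    for upper, label in RISK_BANDS:
        if score <= upper:
            return label
    raise ValueError("score must be between 1.0 and 5.0")


example = {
    "identity_access": 4, "network": 3, "data_protection": 4,
    "content_safety": 3, "compliance": 3, "monitoring": 2,
}
print(weighted_score(example), classify(weighted_score(example)))
# prints: 3.3 Moderate risk
```

Running the same scores through the other two profiles shows how much the workload type alone moves the result.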
OWASP LLM Top 10 (2025)
Map each threat to the solution under assessment. Use the reference files for detailed mitigation patterns.
| ID | Threat | Key Microsoft Mitigation | Reference |
|---|---|---|---|
| LLM01 | Prompt Injection | Content Safety Prompt Shields, system message hardening, Groundedness Detection | prompt-injection-defense-patterns.md |
| LLM02 | Sensitive Information Disclosure | PII filters, Purview DLP, output filtering | data-leakage-prevention-ai.md, pii-detection-norwegian-context.md |
| LLM03 | Supply Chain Vulnerabilities | AI Foundry curated models, signed models, DLP for connectors | supply-chain-security-ai-models.md |
| LLM04 | Data and Model Poisoning | Azure ML data lineage, isolated fine-tuning, Purview validation | — |
| LLM05 | Improper Output Handling | Groundedness Detection API, Content Safety output filters, Structured Outputs | output-validation-grounding-verification.md |
| LLM06 | Excessive Agency | Copilot Studio scoped tools, RBAC per project, human-in-the-loop, budget caps | — |
| LLM07 | System Prompt Leakage | Metaprompt patterns, Prompt Shields, output monitoring | jailbreak-prevention-production.md |
| LLM08 | Vector and Embedding Weaknesses | AI Search managed identities, index-level security filters, Private Endpoints | — |
| LLM09 | Misinformation | RAG grounding, Groundedness Detection, citation patterns, confidence scoring | — |
| LLM10 | Unbounded Consumption | Rate limits, token budgets, PTU for capacity, Cost Management alerts | — |
All reference files are in references/ai-security-engineering/.
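To make sure no threat is skipped during the mapping, a small completeness check can help. The sketch below is only an illustration: the `assessment` structure (threat ID mapped to a free-text finding) and the function name are hypothetical, and only the threat IDs and names come from the table above.

```python
# Threat IDs and names from the OWASP LLM Top 10 (2025) table.
OWASP_LLM_2025 = {
    "LLM01": "Prompt Injection",
    "LLM02": "Sensitive Information Disclosure",
    "LLM03": "Supply Chain Vulnerabilities",
    "LLM04": "Data and Model Poisoning",
    "LLM05": "Improper Output Handling",
    "LLM06": "Excessive Agency",
    "LLM07": "System Prompt Leakage",
    "LLM08": "Vector and Embedding Weaknesses",
    "LLM09": "Misinformation",
    "LLM10": "Unbounded Consumption",
}


def unmapped_threats(assessment):
    """Return the threat IDs that have no finding recorded yet.

    `assessment` is a hypothetical dict: threat ID -> free-text finding.
    An empty or whitespace-only finding counts as unmapped.
    """
    return [tid for tid in OWASP_LLM_2025 if not assessment.get(tid, "").strip()]


# Example: only two threats have been assessed so far.
draft = {
    "LLM01": "Prompt Shields enabled; system message hardened.",
    "LLM10": "Token budgets and Cost Management alerts configured.",
}
for tid in unmapped_threats(draft):
    print(f"{tid} {OWASP_LLM_2025[tid]}: not yet mapped")
```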
Azure AI-specific security controls
For detailed per-service security controls tables, see references/ai-security-engineering/secure-model-deployment-hardening.md and references/ai-security-engineering/zero-trust-ai-services.md. Key services covered:
- Azure OpenAI Service — Content Filtering, Abuse Monitoring, VNet/Private Endpoints, Managed Identity, CMK
- Azure AI Search — Managed Identities, index-level security filters, encryption, Private Endpoints
- Copilot Studio — Entra ID auth, Power Platform DLP, generative AI guardrails, environment isolation
- Azure AI Foundry — Project isolation, granular RBAC, Private Endpoints, curated model catalog, tracing
2. Cost estimation
P10/P50/P90 confidence intervals
Provide all estimates with three scenarios. Verify current prices via microsoft_docs_search before calculating.
| Scenario | Percentile | Description | Multiplier |
|---|---|---|---|
| P10 (Optimistic) | 10th | Low volume, ideal conditions | Base x 0.6 |
| P50 (Expected) | 50th | Normal usage, empirical figures | Base x 1.0 |
| P90 (Conservative) | 90th | High volume, buffer for the unforeseen | Base x 1.8 |
Adjust multipliers based on historical volatility. Always present both USD and NOK (add 3-5% currency buffer for NOK).
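The three scenarios and the NOK buffer can be sketched as below. The function name, the 4% default buffer (the midpoint of the 3-5% range), and the exchange rate in the example are illustrative assumptions. Fetch the real rate and prices before calculating.

```python
# Default multipliers from the scenario table; adjust for historical volatility.
MULTIPLIERS = {"P10": 0.6, "P50": 1.0, "P90": 1.8}


def scenario_estimates(base_usd, usd_to_nok, nok_buffer=0.04, multipliers=MULTIPLIERS):
    """Return USD and NOK amounts for each scenario.

    base_usd:    the P50 (expected) cost in USD
    usd_to_nok:  exchange rate supplied by the caller (not looked up here)
    nok_buffer:  currency buffer added to NOK amounts; 0.04 is an assumed default
    """
    estimates = {}
    for name, factor in multipliers.items():
        usd = base_usd * factor
        nok = usd * usd_to_nok * (1 + nok_buffer)
        estimates[name] = {"usd": round(usd, 2), "nok": round(nok, 2)}
    return estimates


# Example with a hypothetical base cost of 1000 USD and a hypothetical rate of 10 NOK/USD.
for name, amounts in scenario_estimates(1000, usd_to_nok=10.0).items():
    print(name, amounts)
```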
TCO components
Calculate for 1, 12, and 36 months. Present Budget/Recommended/Premium alternatives.
| Component | Includes | Examples |
|---|---|---|
| Licenses | Software per user/org | M365 Copilot, Copilot Studio, Power Platform |
| Compute | AI inference, hosting | Azure OpenAI tokens, App Service, Functions |
| Storage | Data storage | AI Search indexes, Blob Storage, Cosmos DB |
| Networking | Data transfer | Egress, Private Link, Application Gateway |
| Support | Microsoft Support | Unified/Premier Support |
| Operations | Internal personnel | Developers, MLOps, security team |
See references/cost-optimization/deterministic-cost-calculation-model.md and references/cost-optimization/budget-forecasting-ai-projects.md for full calculation methodology.
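As a rough illustration of the 1/12/36-month roll-up, the sketch below sums the six components into a monthly total and scales it per horizon. It assumes flat monthly costs with no one-time costs, growth, or discounts, which the full methodology in the reference files may well handle differently. The component keys and all amounts are hypothetical.

```python
HORIZONS_MONTHS = (1, 12, 36)

# Component keys mirror the TCO components table; the key names are hypothetical.
COMPONENTS = ("licenses", "compute", "storage", "networking", "support", "operations")


def tco(monthly_costs):
    """Return {months: total cost} for each horizon.

    Assumption: costs are flat per month, with no one-time costs or growth.
    """
    missing = [name for name in COMPONENTS if name not in monthly_costs]
    if missing:
        raise ValueError(f"missing components: {', '.join(missing)}")
    monthly_total = sum(monthly_costs[name] for name in COMPONENTS)
    return {months: monthly_total * months for months in HORIZONS_MONTHS}


# Hypothetical monthly amounts in USD, for illustration only.
recommended = {
    "licenses": 300, "compute": 1200, "storage": 150,
    "networking": 50, "support": 100, "operations": 2000,
}
print(tco(recommended))
# prints: {1: 3800, 12: 45600, 36: 136800}
```

Calling `tco` once per alternative (Budget, Recommended, Premium) gives the comparison to present.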
FinOps for AI
Apply these optimization strategies and refer to detailed guidance in references:
- Token optimization: Shorter prompts, context window management, model tiering (GPT-4o mini vs GPT-4o), prompt caching. See references/cost-optimization/token-counting-optimization.md.
- PTU vs Pay-As-You-Go: PTU for stable workloads (break-even ~60-70% utilization), PAYG for variable. See references/cost-optimization/ptu-vs-paygo-economics.md.
- Caching: Semantic caching, prompt caching, RAG result caching. See references/cost-optimization/semantic-caching-patterns.md.
- Right-sizing: Start with lowest SKU, monitor 2-4 weeks, consider SLMs for specialized tasks. See references/cost-optimization/model-selection-price-performance.md.
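The PTU break-even point can be approximated with a simple ratio. This is a simplified sketch under a stated assumption: PTU is a fixed monthly cost for a given capacity, and PAYG cost scales linearly with how much of that capacity is used. The function names and all prices are hypothetical, so verify current pricing before relying on the result.

```python
def break_even_utilization(ptu_monthly_cost, payg_cost_at_full_capacity):
    """Utilization (0-1) above which PTU is cheaper than PAYG.

    payg_cost_at_full_capacity: what PAYG would cost for the same month if the
    workload consumed 100% of the PTU capacity (linear scaling assumed).
    """
    if payg_cost_at_full_capacity <= 0:
        raise ValueError("PAYG cost at full capacity must be positive")
    return ptu_monthly_cost / payg_cost_at_full_capacity


def cheaper_option(expected_utilization, ptu_monthly_cost, payg_cost_at_full_capacity):
    """Return 'PTU' or 'PAYG' for the expected average utilization (0-1)."""
    threshold = break_even_utilization(ptu_monthly_cost, payg_cost_at_full_capacity)
    return "PTU" if expected_utilization > threshold else "PAYG"


# Hypothetical prices: PTU costs 6500 per month, and PAYG would cost 10000 at full capacity.
print(break_even_utilization(6500, 10000))  # 0.65
print(cheaper_option(0.80, 6500, 10000))    # PTU
print(cheaper_option(0.40, 6500, 10000))    # PAYG
```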
3. Performance and scalability
Optimize latency, throughput, and scalability for AI workloads. Key strategies:
- Regional deployment in Norway East / West Europe reduces latency by 20-50 ms
- Streaming responses reduce perceived latency 5-10x for interactive use
- Prompt caching gives up to 50% cost reduction and 80% latency reduction for repeated prefixes (>1024 tokens)
- Batch API provides 50% price reduction for non-interactive workloads (24h SLA)
- Auto-scaling patterns: Horizontal scaling (App Service/AKS), load balancing (APIM/Traffic Manager), queue-based buffering (Service Bus+Functions), PTU+PAYG hybrid
- Rate limit management: TPM/RPM quotas, exponential backoff with jitter, multi-deployment, APIM for centralized throttling
- Load testing: Establish baseline, simulate peak traffic, identify breaking points, long-running soak tests
For detailed implementation guidance, see specific files in references/performance-scalability/:
- latency-optimization-azure-openai.md: Latency tuning
- auto-scaling-ai-infrastructure.md: Scaling patterns
- rate-limit-management.md: TPM/RPM quota management
- load-testing-ai-services.md: Load testing methodology
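Exponential backoff with jitter, mentioned under rate limit management, can be sketched without any SDK. The example below uses the "full jitter" variant (a random delay between zero and the exponential cap), which is one common choice and not necessarily the one described in rate-limit-management.md. `RateLimitError` and `call_with_backoff` are hypothetical names, and a real client would raise its own exception type for HTTP 429.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the 429 error a real client library raises."""


def call_with_backoff(func, max_retries=5, base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call func(), retrying on RateLimitError with exponential backoff and full jitter.

    The delay before retry n is drawn uniformly from
    [0, min(max_delay, base_delay * 2**n)]. The last error is re-raised
    once max_retries retries have been used up.
    """
    for attempt in range(max_retries + 1):
        try:
            return func()
        except RateLimitError:
            if attempt == max_retries:
                raise
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))
```

Wrap the actual model call in a zero-argument function (for example a lambda) and pass it as `func`. The `sleep` parameter exists so tests can substitute a stub instead of waiting in real time.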
4. Reference catalog
Owned references
| Directory | Files | Contents |
|---|---|---|
| references/ai-security-engineering/ | 17 | Defense, testing, scoring, incident response, Zero Trust, STRIDE-AI, prompt injection, content safety |
| references/cost-optimization/ | 21 | Cost modeling, FinOps, token optimization, PTU/PAYG, caching, right-sizing, SLM economics |
| references/performance-scalability/ | 18 | Latency, scaling, streaming, batch API, rate limits, benchmarking, GPU sizing |
Cross-references
- Compliance/governance: skills/ms-ai-governance/references/responsible-ai/ (AI Act, bias, ethics) and references/norwegian-public-sector-governance/ (Digdir, NSM, Schrems II, DPIA)
- Architecture: skills/ms-ai-advisor/references/architecture/ (security zones, architecture patterns, public sector checklist)
5. MCP tools
| Need | Tool | Use |
|---|---|---|
| Security documentation | microsoft_docs_search | Verify controls, check for updates |
| Complete guidance | microsoft_docs_fetch | Security baselines, configuration guides |
| Code samples | microsoft_code_sample_search | SDKs for Content Safety, RBAC, Key Vault |
Never trust the knowledge base blindly for prices and feature availability — verify via MCP tools.
6. Workflow
Security assessment
- Map the solution's AI components and data flows
- Score each of the 6 dimensions using rubrics from references
- Calculate weighted risk score with appropriate weight profile
- Map OWASP LLM Top 10 threats to the solution
- Document findings with concrete remediation recommendations, prioritized by risk and cost
Cost estimation
- Identify all Azure services in the solution
- Estimate consumption per service (tokens, storage, traffic)
- Fetch current prices via MCP tools
- Calculate P10/P50/P90 per component, sum to TCO for 1/12/36 months
- Present Budget/Recommended/Premium alternatives with FinOps opportunities
Performance review
- Define performance requirements (latency, throughput, availability)
- Identify bottlenecks and recommend optimizations from reference catalog
- Estimate performance impact and propose monitoring/benchmarking setup