Initial addition of ms-ai-architect plugin to the open-source marketplace. Private content excluded: orchestrator/ (Linear tooling), docs/utredning/ (client investigation), generated test reports and PDF export script. skill-gen tooling moved from orchestrator/ to scripts/skill-gen/. Security scan: WARNING (risk 20/100) — no secrets, no injection found. False positive fixed: added gitleaks:allow to Python variable reference in output-validation-grounding-verification.md line 109. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
| name | description |
|---|---|
| ms-ai-security | This skill should be used when the user needs a security assessment for an AI solution, wants cost estimation for Azure AI workloads, asks about OWASP LLM Top 10 mitigations, or needs performance optimization guidance. Provides deterministic 6x5 security scoring, P10/P50/P90 cost confidence intervals, and FinOps practices for AI. Triggers on: "security assessment for AI", "AI threat modeling", "cost estimation for Azure AI", "FinOps for AI workloads", "prompt injection defense", "kostnadsestimat for AI-løsning", "sikkerhetsscoring for AI", "OWASP LLM", "6x5 scoring", "PTU vs pay-as-you-go". |
INSTRUCTION: This skill covers quantitative assessment activities with deterministic scoring models. Apply the framework systematically; do not skip dimensions or assume scores. Every assessment must produce concrete, verifiable results with numeric values.
Security and cost assessment for Microsoft AI
Structured methods for three assessment activities:
- Security assessment: deterministic 6x5 security scoring with OWASP LLM Top 10 mapping
- Cost estimation: TCO calculation with P10/P50/P90 confidence intervals and FinOps practices
- Performance review: latency optimization, scaling, and benchmarking
Primary agents: security-assessment-agent, cost-estimation-agent
1. Security framework
Six-dimension security model
To assess security, score each of the six dimensions independently on a 1-5 scale:
| Dimension | Covers |
|---|---|
| Identity & Access Control | Entra ID, Managed Identities, RBAC, API key rotation, JIT access |
| Network Security | Private Endpoints, VNet, NSG, Azure Firewall, DNS, outbound traffic |
| Data Protection | Encryption (at rest/in transit), Key Vault, data residency, PII masking, backup |
| Content Safety & AI Security | Content Safety filters, prompt injection defense, jailbreak, output validation, STRIDE-AI |
| Compliance & Governance | AI Act classification, GDPR/Schrems II, Purview, Digdir/NSM, DPIA |
| Monitoring & Incident Response | Azure Monitor, token usage, anomaly detection, audit logging, alerting |
Scoring model (1-5)
| Score | Level | Criterion |
|---|---|---|
| 1 | Critical | No controls. Immediate risk. |
| 2 | Insufficient | Basic controls with significant gaps. PoC/sandbox only. |
| 3 | Acceptable | Core controls in place. Minimum for low-risk production. |
| 4 | Good | Robust, automated controls with monitoring. Suitable for sensitive data. |
| 5 | Excellent | State of the art. Zero Trust. Defense in depth. Suitable for high-risk AI Act systems. |
Weighted scoring
Apply the weight profile that matches the workload type, then calculate: overall score = Sum(dimension_score x weight). The weights in each profile sum to 100%.
| Dimension | Standard | Externally exposed | Personal-data-intensive |
|---|---|---|---|
| Identity & Access Control | 20% | 25% | 20% |
| Network Security | 15% | 20% | 15% |
| Data Protection | 20% | 15% | 25% |
| Content Safety & AI Security | 20% | 25% | 15% |
| Compliance & Governance | 15% | 10% | 20% |
| Monitoring & Incident Response | 10% | 5% | 5% |
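The weighted calculation can be sketched in Python as follows. This is a minimal illustration that assumes the weight profiles above are stored as fractions; the dimension keys, the profile names, and the function name `weighted_score` are hypothetical and not part of the skill.

```python
# Weight profiles from the table above, expressed as fractions.
# Key names are hypothetical shorthand for the six dimensions.
WEIGHTS = {
    "standard": {"identity": 0.20, "network": 0.15, "data": 0.20,
                 "content_safety": 0.20, "compliance": 0.15, "monitoring": 0.10},
    "external": {"identity": 0.25, "network": 0.20, "data": 0.15,
                 "content_safety": 0.25, "compliance": 0.10, "monitoring": 0.05},
    "personal_data": {"identity": 0.20, "network": 0.15, "data": 0.25,
                      "content_safety": 0.15, "compliance": 0.20, "monitoring": 0.05},
}


def weighted_score(scores: dict[str, int], profile: str = "standard") -> float:
    """Return the sum of dimension_score x weight, rounded to two decimals."""
    weights = WEIGHTS[profile]
    if set(scores) != set(weights):
        raise ValueError("every dimension must be scored exactly once")
    for name, value in scores.items():
        if value not in (1, 2, 3, 4, 5):
            raise ValueError(f"{name}: score must be an integer from 1 to 5")
    return round(sum(scores[name] * weight for name, weight in weights.items()), 2)


# Illustrative scores, not taken from a real assessment.
example = {"identity": 4, "network": 3, "data": 4,
           "content_safety": 3, "compliance": 3, "monitoring": 2}
print(weighted_score(example, "standard"))  # 3.3
print(weighted_score(example, "external"))  # 3.35
```

The same dimension scores give different totals under different profiles, which is why the profile should be chosen before scoring is interpreted.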
Risk classification
| Overall score | Classification | Recommendation |
|---|---|---|
| 1.0 - 2.0 | Critical risk | Stop rollout. Remediate immediately. |
| 2.1 - 3.0 | High risk | Restricted access. Remediation plan within 30 days. |
| 3.1 - 3.5 | Moderate risk | Production with restrictions. Remediation plan within 90 days. |
| 3.6 - 4.5 | Low risk | Approved for production. Continuous improvement. |
| 4.6 - 5.0 | Minimal risk | Approved for production. Benchmark for other solutions. |
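A small Python sketch of the band lookup follows. The table lists bands to one decimal, so this sketch assumes the overall score is rounded to one decimal before classification; that rounding rule, the English labels, and the function name `classify` are assumptions, not something the skill specifies.

```python
# Upper bound of each band from the classification table, in ascending order.
RISK_BANDS = [
    (2.0, "Critical risk"),
    (3.0, "High risk"),
    (3.5, "Moderate risk"),
    (4.5, "Low risk"),
    (5.0, "Minimal risk"),
]


def classify(overall_score: float) -> str:
    """Map a weighted overall score (1.0 to 5.0) to its risk classification."""
    score = round(overall_score, 1)  # assumption: bands apply to one-decimal scores
    if not 1.0 <= score <= 5.0:
        raise ValueError("overall score must be between 1.0 and 5.0")
    for upper_bound, label in RISK_BANDS:
        if score <= upper_bound:
            return label
    raise AssertionError("unreachable: the last band ends at 5.0")


print(classify(3.3))  # Moderate risk
print(classify(4.6))  # Minimal risk
```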
For complete rubrics with examples per dimension and score, see references/ai-security-engineering/security-scoring-rubrics-6x5.md and references/ai-security-engineering/ai-security-scoring-framework.md.
OWASP LLM Top 10 (2025)
Map each threat to the solution under assessment. Use the reference files for detailed mitigation patterns.
| ID | Threat | Key Microsoft Mitigation | Reference |
|---|---|---|---|
| LLM01 | Prompt Injection | Content Safety Prompt Shields, system message hardening, Groundedness Detection | prompt-injection-defense-patterns.md |
| LLM02 | Sensitive Information Disclosure | PII filters, Purview DLP, output filtering | data-leakage-prevention-ai.md, pii-detection-norwegian-context.md |
| LLM03 | Supply Chain Vulnerabilities | AI Foundry curated models, signed models, DLP for connectors | supply-chain-security-ai-models.md |
| LLM04 | Data and Model Poisoning | Azure ML data lineage, isolated fine-tuning, Purview validation | — |
| LLM05 | Improper Output Handling | Groundedness Detection API, Content Safety output filters, Structured Outputs | output-validation-grounding-verification.md |
| LLM06 | Excessive Agency | Copilot Studio scoped tools, RBAC per project, human-in-the-loop, budget caps | — |
| LLM07 | System Prompt Leakage | Metaprompt patterns, Prompt Shields, output monitoring | jailbreak-prevention-production.md |
| LLM08 | Vector and Embedding Weaknesses | AI Search managed identities, index-level security filters, Private Endpoints | — |
| LLM09 | Misinformation | RAG grounding, Groundedness Detection, citation patterns, confidence scoring | — |
| LLM10 | Unbounded Consumption | Rate limits, token budgets, PTU for capacity, Cost Management alerts | — |
All reference files are in references/ai-security-engineering/.
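One way to make sure no threat is skipped during mapping is a simple coverage check. The sketch below is illustrative only: the `findings` structure (threat ID mapped to a free-text note) and the function name `unmapped_threats` are assumptions, and the threat names are copied from the table above.

```python
# Threat IDs and names from the OWASP LLM Top 10 (2025) table above.
OWASP_LLM_TOP_10 = {
    "LLM01": "Prompt Injection",
    "LLM02": "Sensitive Information Disclosure",
    "LLM03": "Supply Chain Vulnerabilities",
    "LLM04": "Data and Model Poisoning",
    "LLM05": "Improper Output Handling",
    "LLM06": "Excessive Agency",
    "LLM07": "System Prompt Leakage",
    "LLM08": "Vector and Embedding Weaknesses",
    "LLM09": "Misinformation",
    "LLM10": "Unbounded Consumption",
}


def unmapped_threats(findings: dict[str, str]) -> list[str]:
    """Return the threat IDs that have no non-empty finding yet."""
    unknown = set(findings) - set(OWASP_LLM_TOP_10)
    if unknown:
        raise ValueError(f"unknown threat IDs: {sorted(unknown)}")
    return [tid for tid in OWASP_LLM_TOP_10 if not findings.get(tid, "").strip()]


# Hypothetical partial assessment: only two threats mapped so far.
findings = {
    "LLM01": "Prompt Shields enabled on all deployments",
    "LLM10": "Token budgets enforced per client",
}
for threat_id in unmapped_threats(findings):
    print(threat_id, OWASP_LLM_TOP_10[threat_id])
```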
Azure AI-specific security controls
For detailed per-service security controls tables, see references/ai-security-engineering/secure-model-deployment-hardening.md and references/ai-security-engineering/zero-trust-ai-services.md. Key services covered:
- Azure OpenAI Service — Content Filtering, Abuse Monitoring, VNet/Private Endpoints, Managed Identity, CMK
- Azure AI Search — Managed Identities, index-level security filters, encryption, Private Endpoints
- Copilot Studio — Entra ID auth, Power Platform DLP, generative AI guardrails, environment isolation
- Azure AI Foundry — Project isolation, granular RBAC, Private Endpoints, curated model catalog, tracing
2. Cost estimation
P10/P50/P90 confidence intervals
Provide all estimates with three scenarios. Verify current prices via microsoft_docs_search before calculating.
| Scenario | Percentile | Description | Multiplier |
|---|---|---|---|
| P10 (Optimistic) | 10th | Low volume, ideal conditions | Base x 0.6 |
| P50 (Expected) | 50th | Normal usage, empirical figures | Base x 1.0 |
| P90 (Conservative) | 90th | High volume, buffer for the unforeseen | Base x 1.8 |
Adjust multipliers based on historical volatility. Always present both USD and NOK (add 3-5% currency buffer for NOK).
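As a rough Python sketch of the three-scenario calculation: the multipliers come from the table above, while the exchange rate, the 4% buffer (chosen from the 3-5% range), and the function name `cost_scenarios` are placeholder assumptions. Real prices and rates should be looked up before use.

```python
# Default multipliers from the scenario table above.
MULTIPLIERS = {"P10": 0.6, "P50": 1.0, "P90": 1.8}


def cost_scenarios(base_usd: float, usd_to_nok: float,
                   currency_buffer: float = 0.04) -> dict[str, dict[str, float]]:
    """Return USD and buffered NOK amounts for the P10/P50/P90 scenarios."""
    if base_usd < 0 or usd_to_nok <= 0:
        raise ValueError("base cost must be >= 0 and exchange rate must be > 0")
    scenarios = {}
    for name, multiplier in MULTIPLIERS.items():
        usd = base_usd * multiplier
        nok = usd * usd_to_nok * (1 + currency_buffer)  # buffer applies to NOK only
        scenarios[name] = {"usd": round(usd, 2), "nok": round(nok, 2)}
    return scenarios


# Placeholder inputs: 1000 USD/month base cost, 10.0 NOK per USD.
for name, amounts in cost_scenarios(1000.0, usd_to_nok=10.0).items():
    print(name, amounts)
# P10 {'usd': 600.0, 'nok': 6240.0}
# P50 {'usd': 1000.0, 'nok': 10400.0}
# P90 {'usd': 1800.0, 'nok': 18720.0}
```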
TCO components
Calculate for 1, 12, and 36 months. Present Budget/Recommended/Premium alternatives.
| Component | Includes | Examples |
|---|---|---|
| Licenses | Software per user/org | M365 Copilot, Copilot Studio, Power Platform |
| Compute | AI inference, hosting | Azure OpenAI tokens, App Service, Functions |
| Storage | Data storage | AI Search indexes, Blob Storage, Cosmos DB |
| Networking | Data transfer | Egress, Private Link, Application Gateway |
| Support | Microsoft Support | Unified/Premier Support |
| Operations | Internal staff | Developers, MLOps, security team |
See references/cost-optimization/deterministic-cost-calculation-model.md and references/cost-optimization/budget-forecasting-ai-projects.md for full calculation methodology.
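The roll-up to 1, 12, and 36 months can be sketched as below. This is a deliberately simple model that assumes flat monthly costs (no growth, discounts, or one-time fees) and the default P10/P50/P90 multipliers; the component figures and the function name `tco_by_horizon` are hypothetical, and the referenced methodology files remain the authoritative description.

```python
HORIZONS_MONTHS = (1, 12, 36)
SCENARIO_MULTIPLIERS = {"P10": 0.6, "P50": 1.0, "P90": 1.8}


def tco_by_horizon(monthly_costs_usd: dict[str, float]) -> dict[int, dict[str, float]]:
    """Sum the monthly component costs and scale them per horizon and scenario."""
    if any(cost < 0 for cost in monthly_costs_usd.values()):
        raise ValueError("component costs must be non-negative")
    monthly_total = sum(monthly_costs_usd.values())
    return {
        months: {
            scenario: round(monthly_total * months * multiplier, 2)
            for scenario, multiplier in SCENARIO_MULTIPLIERS.items()
        }
        for months in HORIZONS_MONTHS
    }


# Hypothetical monthly figures in USD, one entry per TCO component.
monthly = {"licenses": 300.0, "compute": 500.0, "storage": 100.0,
           "networking": 50.0, "support": 0.0, "operations": 1050.0}
print(tco_by_horizon(monthly)[12])
# {'P10': 14400.0, 'P50': 24000.0, 'P90': 43200.0}
```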
FinOps for AI
Apply these optimization strategies and refer to detailed guidance in references:
- Token optimization: Shorter prompts, context window management, model tiering (GPT-4o mini vs GPT-4o), prompt caching. See references/cost-optimization/token-counting-optimization.md.
- PTU vs Pay-As-You-Go: PTU for stable workloads (break-even ~60-70% utilization), PAYG for variable. See references/cost-optimization/ptu-vs-paygo-economics.md.
- Caching: Semantic caching, prompt caching, RAG result caching. See references/cost-optimization/semantic-caching-patterns.md.
- Right-sizing: Start with lowest SKU, monitor 2-4 weeks, consider SLMs for specialized tasks. See references/cost-optimization/model-selection-price-performance.md.
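The PTU break-even point mentioned above can be illustrated with a simplified model. The sketch assumes PTU is a flat monthly fee and that PAYG cost scales linearly with utilization of the same capacity; both prices are invented placeholders, and the function names are hypothetical.

```python
def break_even_utilization(ptu_monthly_cost: float,
                           payg_cost_at_full_capacity: float) -> float:
    """Utilization (0 to 1) at which PTU and PAYG cost the same per month."""
    if ptu_monthly_cost <= 0 or payg_cost_at_full_capacity <= 0:
        raise ValueError("costs must be positive")
    return ptu_monthly_cost / payg_cost_at_full_capacity


def cheaper_option(expected_utilization: float, ptu_monthly_cost: float,
                   payg_cost_at_full_capacity: float) -> str:
    """Return 'PTU' when expected utilization reaches break-even, else 'PAYG'."""
    threshold = break_even_utilization(ptu_monthly_cost, payg_cost_at_full_capacity)
    return "PTU" if expected_utilization >= threshold else "PAYG"


# Placeholder prices: 6500 USD/month for PTU, and 10000 USD/month if the
# same throughput were consumed entirely through PAYG.
print(f"{break_even_utilization(6500.0, 10000.0):.0%}")  # 65%
print(cheaper_option(0.80, 6500.0, 10000.0))  # PTU
print(cheaper_option(0.40, 6500.0, 10000.0))  # PAYG
```

If the computed threshold is above 1.0, PTU never pays off under this model and PAYG is returned for any realistic utilization.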
3. Performance and scalability
Optimize latency, throughput, and scalability for AI workloads. Key strategies:
- Regional deployment in Norway East / West Europe reduces latency by 20-50 ms
- Streaming responses reduce perceived latency 5-10x for interactive use
- Prompt caching gives up to 50% cost reduction and 80% latency reduction for repeated prefixes (>1024 tokens)
- Batch API provides a 50% price reduction for non-interactive workloads (24-hour target turnaround)
- Auto-scaling patterns: Horizontal scaling (App Service/AKS), load balancing (APIM/Traffic Manager), queue-based buffering (Service Bus+Functions), PTU+PAYG hybrid
- Rate limit management: TPM/RPM quotas, exponential backoff with jitter, multi-deployment, APIM for centralized throttling
- Load testing: Establish baseline, simulate peak traffic, identify breaking points, long-running soak tests
For detailed implementation guidance, see specific files in references/performance-scalability/:
- latency-optimization-azure-openai.md: Latency tuning
- auto-scaling-ai-infrastructure.md: Scaling patterns
- rate-limit-management.md: TPM/RPM quota management
- load-testing-ai-services.md: Load testing methodology
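The "exponential backoff with jitter" strategy listed under rate limit management can be sketched generically in Python. This uses the full-jitter variant and a hypothetical `RateLimitError` as a stand-in for whatever exception a real client raises on HTTP 429; the default base, cap, and attempt count are illustrative.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for a client library's HTTP 429 error."""


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Full jitter: a random delay between 0 and min(cap, base * 2**attempt)."""
    return random.uniform(0.0, min(cap, base * 2 ** attempt))


def call_with_backoff(operation, max_attempts: int = 5, sleep=time.sleep):
    """Call operation(), retrying on RateLimitError with jittered backoff."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # retries exhausted, surface the error to the caller
            sleep(backoff_delay(attempt))


# Usage: an operation that is throttled twice before succeeding.
# Delays are recorded instead of slept so the example runs instantly.
calls = {"count": 0}


def flaky_operation() -> str:
    calls["count"] += 1
    if calls["count"] <= 2:
        raise RateLimitError("429 Too Many Requests")
    return "ok"


delays: list[float] = []
result = call_with_backoff(flaky_operation, sleep=delays.append)
print(result, len(delays))  # ok 2
```

Randomizing the delay spreads retries from many clients over time, so they do not hit the quota again in lockstep.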
4. Reference catalog
Owned references
| Directory | Files | Contents |
|---|---|---|
| references/ai-security-engineering/ | 17 | Defense, testing, scoring, incident response, Zero Trust, STRIDE-AI, prompt injection, content safety |
| references/cost-optimization/ | 21 | Cost modeling, FinOps, token optimization, PTU/PAYG, caching, right-sizing, SLM economics |
| references/performance-scalability/ | 18 | Latency, scaling, streaming, batch API, rate limits, benchmarking, GPU sizing |
Cross-references
- Compliance/governance: skills/ms-ai-governance/references/responsible-ai/ (AI Act, bias, ethics) and references/norwegian-public-sector-governance/ (Digdir, NSM, Schrems II, DPIA)
- Architecture: skills/ms-ai-advisor/references/architecture/ (security zones, architecture patterns, public-sector checklist)
5. MCP tools
| Need | Tool | Use |
|---|---|---|
| Security documentation | microsoft_docs_search | Verify controls, check for updates |
| Full guidance | microsoft_docs_fetch | Security baselines, configuration guides |
| Code samples | microsoft_code_sample_search | SDKs for Content Safety, RBAC, Key Vault |
Never trust the knowledge base blindly for prices and feature availability — verify via MCP tools.
6. Workflow
Security assessment
- Map the solution's AI components and data flows
- Score each of the 6 dimensions using rubrics from references
- Calculate weighted risk score with appropriate weight profile
- Map OWASP LLM Top 10 threats to the solution
- Document findings with concrete remediation recommendations, prioritized by risk and cost
Cost estimation
- Identify all Azure services in the solution
- Estimate consumption per service (tokens, storage, traffic)
- Fetch current prices via MCP tools
- Calculate P10/P50/P90 per component, sum to TCO for 1/12/36 months
- Present Budget/Recommended/Premium alternatives with FinOps opportunities
Performance review
- Define performance requirements (latency, throughput, availability)
- Identify bottlenecks and recommend optimizations from reference catalog
- Estimate performance impact and propose monitoring/benchmarking setup