Kjell Tore Guttormsen 6a7632146e feat(ms-ai-architect): add plugin to open marketplace (v1.5.0 baseline)

Initial addition of ms-ai-architect plugin to the open-source marketplace.
Private content excluded: orchestrator/ (Linear tooling), docs/utredning/
(client investigation), generated test reports and PDF export script.
skill-gen tooling moved from orchestrator/ to scripts/skill-gen/.

Security scan: WARNING (risk 20/100) — no secrets, no injection found.
False positive fixed: added gitleaks:allow to Python variable reference
in output-validation-grounding-verification.md line 109.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-04-07 17:17:17 +02:00


name	description
ms-ai-security	This skill should be used when the user needs a security assessment for an AI solution, wants cost estimation for Azure AI workloads, asks about OWASP LLM Top 10 mitigations, or needs performance optimization guidance. Provides deterministic 6x5 security scoring, P10/P50/P90 cost confidence intervals, and FinOps practices for AI. Triggers on: "security assessment for AI", "AI threat modeling", "cost estimation for Azure AI", "FinOps for AI workloads", "prompt injection defense", "kostnadsestimat for AI-løsning", "sikkerhetsscoring for AI", "OWASP LLM", "6x5 scoring", "PTU vs pay-as-you-go".


INSTRUCTION: This skill covers quantitative assessment activities with deterministic scoring models. Apply the framework systematically: do not skip dimensions or assume scores. Every assessment must produce concrete, verifiable results with numeric values.

Security and cost assessment for Microsoft AI

Structured methods for three assessment activities:

Security assessment: deterministic 6x5 security scoring with OWASP LLM Top 10 mapping
Cost estimation: TCO calculation with P10/P50/P90 confidence intervals and FinOps practices
Performance review: latency optimization, scaling, and benchmarking

Primary agents: security-assessment-agent, cost-estimation-agent

1. Security framework

6-dimension security model

To assess security, score each of the six dimensions independently on a 1-5 scale:

Dimension	Covers
Identity & Access Control	Entra ID, Managed Identities, RBAC, API key rotation, JIT access
Network Security	Private Endpoints, VNet, NSG, Azure Firewall, DNS, outbound traffic
Data Protection	Encryption (at rest/in transit), Key Vault, data residency, PII masking, backup
Content Safety & AI Security	Content Safety filters, prompt injection defense, jailbreak, output validation, STRIDE-AI
Compliance & Governance	AI Act classification, GDPR/Schrems II, Purview, Digdir/NSM, DPIA
Monitoring & Incident Response	Azure Monitor, token usage, anomaly detection, audit logging, alerting

Scoring model (1-5)

Score	Level	Criterion
1	Critical	No controls. Immediate risk.
2	Insufficient	Basic controls with significant gaps. PoC/sandbox only.
3	Acceptable	Core controls in place. Minimum for low-risk production.
4	Good	Robust, automated controls with monitoring. Suitable for sensitive data.
5	Excellent	State of the art. Zero Trust. Defense in depth. Suitable for high-risk AI Act workloads.

Weighted scoring

Apply the weight profile that matches the workload type, then calculate: Total score = Sum(dimension_score x weight)

Dimension	Standard	Externally exposed	Personal-data-intensive
Identity & Access Control	20%	25%	20%
Network Security	15%	20%	15%
Data Protection	20%	15%	25%
Content Safety & AI Security	20%	25%	15%
Compliance & Governance	15%	10%	20%
Monitoring & Incident Response	10%	5%	5%
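The weighted sum above can be sketched in a few lines of Python. This is a minimal illustration, not part of the skill's tooling: the short dimension keys, the profile names, and the `weighted_score` function are hypothetical, and only the weights themselves come from the table.

```python
# Weight profiles copied from the table above. The short dimension keys and
# profile names are illustrative assumptions, not identifiers from the skill.
WEIGHTS = {
    "standard": {
        "identity": 0.20, "network": 0.15, "data": 0.20,
        "content_safety": 0.20, "compliance": 0.15, "monitoring": 0.10,
    },
    "externally_exposed": {
        "identity": 0.25, "network": 0.20, "data": 0.15,
        "content_safety": 0.25, "compliance": 0.10, "monitoring": 0.05,
    },
    "personal_data_intensive": {
        "identity": 0.20, "network": 0.15, "data": 0.25,
        "content_safety": 0.15, "compliance": 0.20, "monitoring": 0.05,
    },
}


def weighted_score(scores: dict, profile: str = "standard") -> float:
    """Return Sum(dimension_score x weight) for the chosen weight profile."""
    weights = WEIGHTS[profile]
    if set(scores) != set(weights):
        raise ValueError("every dimension must be scored exactly once")
    for name, value in scores.items():
        if value not in (1, 2, 3, 4, 5):
            raise ValueError(f"{name}: score must be an integer from 1 to 5")
    return round(sum(scores[dim] * w for dim, w in weights.items()), 2)


example = {
    "identity": 4, "network": 3, "data": 4,
    "content_safety": 3, "compliance": 3, "monitoring": 2,
}
print(weighted_score(example, "standard"))            # 3.3
print(weighted_score(example, "externally_exposed"))  # 3.35
```

The same dimension scores give different totals under different profiles, which is why the weight profile should be chosen before scoring is interpreted.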

Risk classification

Total score	Classification	Recommendation
1.0 - 2.0	Critical risk	Stop rollout. Immediate remediation.
2.1 - 3.0	High risk	Restricted access. Remediation plan within 30 days.
3.1 - 3.5	Moderate risk	Production with restrictions. Remediation plan within 90 days.
3.6 - 4.5	Low risk	Approved for production. Continuous improvement.
4.6 - 5.0	Minimal risk	Approved for production. Benchmark for other solutions.
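A lookup against these bands could look like the sketch below. The bands are stated to one decimal, so the sketch assumes the total score is rounded to one decimal before classification; that rounding rule is an assumption, not something the table specifies. The function name `classify` and the English labels are illustrative.

```python
# Upper bound of each band from the table above, with an English label.
RISK_BANDS = [
    (2.0, "Critical risk"),
    (3.0, "High risk"),
    (3.5, "Moderate risk"),
    (4.5, "Low risk"),
    (5.0, "Minimal risk"),
]


def classify(total_score: float) -> str:
    """Map a weighted total score (1.0 to 5.0) to its risk classification."""
    # Assumption: round to one decimal so values such as 3.04 fall in a band.
    score = round(total_score, 1)
    if not 1.0 <= score <= 5.0:
        raise ValueError("total score must be between 1.0 and 5.0")
    for upper_bound, label in RISK_BANDS:
        if score <= upper_bound:
            return label
    raise AssertionError("unreachable: 5.0 is the last upper bound")


print(classify(3.3))  # Moderate risk
print(classify(4.6))  # Minimal risk
```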

For complete rubrics with examples per dimension and score, see references/ai-security-engineering/security-scoring-rubrics-6x5.md and references/ai-security-engineering/ai-security-scoring-framework.md.

OWASP LLM Top 10 (2025)

Map each threat to the solution under assessment. Use the reference files for detailed mitigation patterns.

ID	Threat	Key Microsoft Mitigation	Reference
LLM01	Prompt Injection	Content Safety Prompt Shields, system message hardening, Groundedness Detection	`prompt-injection-defense-patterns.md`
LLM02	Sensitive Information Disclosure	PII filters, Purview DLP, output filtering	`data-leakage-prevention-ai.md`, `pii-detection-norwegian-context.md`
LLM03	Supply Chain Vulnerabilities	AI Foundry curated models, signed models, DLP for connectors	`supply-chain-security-ai-models.md`
LLM04	Data and Model Poisoning	Azure ML data lineage, isolated fine-tuning, Purview validation	—
LLM05	Improper Output Handling	Groundedness Detection API, Content Safety output filters, Structured Outputs	`output-validation-grounding-verification.md`
LLM06	Excessive Agency	Copilot Studio scoped tools, RBAC per project, human-in-the-loop, budget caps	—
LLM07	System Prompt Leakage	Metaprompt patterns, Prompt Shields, output monitoring	`jailbreak-prevention-production.md`
LLM08	Vector and Embedding Weaknesses	AI Search managed identities, index-level security filters, Private Endpoints	—
LLM09	Misinformation	RAG grounding, Groundedness Detection, citation patterns, confidence scoring	—
LLM10	Unbounded Consumption	Rate limits, token budgets, PTU for capacity, Cost Management alerts	—

All reference files are in references/ai-security-engineering/.
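One way to make sure no threat is skipped during the mapping is a simple coverage check. The sketch below is an illustration only: the threat IDs and names come from the table above, while the `unmapped_threats` helper and the shape of the `assessment` dictionary are hypothetical.

```python
# Threat IDs and names from the OWASP LLM Top 10 (2025) table above.
OWASP_LLM_2025 = {
    "LLM01": "Prompt Injection",
    "LLM02": "Sensitive Information Disclosure",
    "LLM03": "Supply Chain Vulnerabilities",
    "LLM04": "Data and Model Poisoning",
    "LLM05": "Improper Output Handling",
    "LLM06": "Excessive Agency",
    "LLM07": "System Prompt Leakage",
    "LLM08": "Vector and Embedding Weaknesses",
    "LLM09": "Misinformation",
    "LLM10": "Unbounded Consumption",
}


def unmapped_threats(assessment: dict) -> list:
    """Return the threat IDs that have no documented mitigation yet.

    `assessment` maps a threat ID to the list of mitigations found in the
    solution under review (hypothetical structure for this sketch).
    """
    unknown = set(assessment) - set(OWASP_LLM_2025)
    if unknown:
        raise ValueError(f"unknown threat IDs: {sorted(unknown)}")
    return [tid for tid in OWASP_LLM_2025 if not assessment.get(tid)]


assessment = {
    "LLM01": ["Prompt Shields", "system message hardening"],
    "LLM10": [],  # listed, but no mitigation documented yet
}
for threat_id in unmapped_threats(assessment):
    print(threat_id, OWASP_LLM_2025[threat_id])
```

An empty mitigation list counts as unmapped, so a threat cannot be ticked off just by being mentioned.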

Azure AI-specific security controls

For detailed per-service security controls tables, see references/ai-security-engineering/secure-model-deployment-hardening.md and references/ai-security-engineering/zero-trust-ai-services.md. Key services covered:

Azure OpenAI Service — Content Filtering, Abuse Monitoring, VNet/Private Endpoints, Managed Identity, CMK
Azure AI Search — Managed Identities, index-level security filters, encryption, Private Endpoints
Copilot Studio — Entra ID auth, Power Platform DLP, generative AI guardrails, environment isolation
Azure AI Foundry — Project isolation, granular RBAC, Private Endpoints, curated model catalog, tracing

2. Cost estimation

P10/P50/P90 confidence intervals

Provide all estimates with three scenarios. Verify current prices via microsoft_docs_search before calculating.

Scenario	Percentile	Description	Multiplier
P10 (Optimistic)	10th	Low volume, ideal conditions	Base x 0.6
P50 (Expected)	50th	Normal usage, based on historical figures	Base x 1.0
P90 (Conservative)	90th	High volume, buffer for the unforeseen	Base x 1.8

Adjust multipliers based on historical volatility. Always present both USD and NOK (add 3-5% currency buffer for NOK).
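The three scenarios and the NOK buffer can be sketched as follows. The multipliers come from the table above; the function name, the 4% default buffer (chosen from the 3-5% range), and the exchange rate in the example are assumptions for illustration. Real estimates should use a current rate and verified prices.

```python
# Default multipliers from the table above; adjust for historical volatility.
MULTIPLIERS = {"P10": 0.6, "P50": 1.0, "P90": 1.8}


def scenario_estimates(base_usd: float, usd_to_nok: float,
                       nok_buffer: float = 0.04) -> dict:
    """Return USD and NOK figures for the P10/P50/P90 scenarios.

    base_usd   -- the P50 (expected) cost in USD
    usd_to_nok -- exchange rate supplied by the caller (not looked up here)
    nok_buffer -- currency buffer applied to NOK only, typically 0.03 to 0.05
    """
    estimates = {}
    for scenario, multiplier in MULTIPLIERS.items():
        usd = base_usd * multiplier
        nok = usd * usd_to_nok * (1 + nok_buffer)
        estimates[scenario] = {"usd": round(usd, 2), "nok": round(nok, 2)}
    return estimates


# Hypothetical inputs: 10,000 USD per month at an assumed rate of 10 NOK/USD.
for scenario, amounts in scenario_estimates(10_000, usd_to_nok=10.0).items():
    print(scenario, amounts)
```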

TCO components

Calculate for 1, 12, and 36 months. Present Budget/Recommended/Premium alternatives.

Component	Includes	Examples
Licenses	Software per user/org	M365 Copilot, Copilot Studio, Power Platform
Compute	AI inference, hosting	Azure OpenAI tokens, App Service, Functions
Storage	Data storage	AI Search indexes, Blob Storage, Cosmos DB
Networking	Data transfer	Egress, Private Link, Application Gateway
Support	Microsoft Support	Unified/Premier Support
Operations	Internal staff	Developers, MLOps, security team

See references/cost-optimization/deterministic-cost-calculation-model.md and references/cost-optimization/budget-forecasting-ai-projects.md for full calculation methodology.
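As a rough illustration of the 1/12/36-month view across alternatives, the sketch below sums monthly component costs per alternative. It assumes flat monthly costs with no one-time costs, growth, or discounts, which is a simplification and not the methodology in the referenced files. All amounts and the `tco_by_horizon` name are hypothetical.

```python
HORIZONS_MONTHS = (1, 12, 36)


def tco_by_horizon(alternatives: dict) -> dict:
    """Sum monthly component costs and project them over each horizon.

    `alternatives` maps an alternative name (for example "Budget") to a
    dictionary of component name -> monthly cost in a single currency.
    Assumption: costs stay flat from month to month.
    """
    result = {}
    for name, components in alternatives.items():
        monthly_total = sum(components.values())
        result[name] = {
            months: round(monthly_total * months, 2)
            for months in HORIZONS_MONTHS
        }
    return result


# Hypothetical monthly figures, for illustration only.
alternatives = {
    "Budget": {"licenses": 500, "compute": 1000},
    "Recommended": {"licenses": 500, "compute": 2500, "storage": 300},
}
for name, totals in tco_by_horizon(alternatives).items():
    print(name, totals)
```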

FinOps for AI

Apply these optimization strategies and refer to detailed guidance in references:

Token optimization: Shorter prompts, context window management, model tiering (GPT-4o mini vs GPT-4o), prompt caching. See references/cost-optimization/token-counting-optimization.md.
PTU vs Pay-As-You-Go: PTU for stable workloads (break-even ~60-70% utilization), PAYG for variable. See references/cost-optimization/ptu-vs-paygo-economics.md.
Caching: Semantic caching, prompt caching, RAG result caching. See references/cost-optimization/semantic-caching-patterns.md.
Right-sizing: Start with lowest SKU, monitor 2-4 weeks, consider SLMs for specialized tasks. See references/cost-optimization/model-selection-price-performance.md.
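The PTU break-even point mentioned above can be reasoned about with a small calculation. This is a simplified sketch: it assumes PTU is a fixed monthly cost and that pay-as-you-go cost scales linearly with utilization of the same capacity. The function names and all prices are hypothetical, chosen so the break-even lands in the stated 60-70% range.

```python
def break_even_utilization(ptu_monthly_cost: float,
                           payg_cost_at_full_capacity: float) -> float:
    """Utilization (0 to 1) above which PTU is cheaper than pay-as-you-go."""
    if payg_cost_at_full_capacity <= 0:
        raise ValueError("PAYG cost at full capacity must be positive")
    return ptu_monthly_cost / payg_cost_at_full_capacity


def cheaper_option(expected_utilization: float, ptu_monthly_cost: float,
                   payg_cost_at_full_capacity: float) -> str:
    """Compare the fixed PTU cost with the PAYG cost at the expected load."""
    payg_cost = expected_utilization * payg_cost_at_full_capacity
    return "PTU" if ptu_monthly_cost < payg_cost else "PAYG"


# Hypothetical prices: PTU at 6,500 per month, and 10,000 per month if the
# same throughput were bought entirely on pay-as-you-go.
print(break_even_utilization(6_500, 10_000))  # 0.65
print(cheaper_option(0.80, 6_500, 10_000))    # PTU
print(cheaper_option(0.40, 6_500, 10_000))    # PAYG
```

Workloads with variable load may sit on either side of the break-even in different months, which is one reason a PTU plus PAYG hybrid can be worth evaluating.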

3. Performance and scalability

Optimize latency, throughput, and scalability for AI workloads. Key strategies:

Regional deployment in Norway East / West Europe reduces latency by 20-50 ms
Streaming responses reduce perceived latency 5-10x for interactive use
Prompt caching gives up to 50% cost reduction and 80% latency reduction for repeated prefixes (>1024 tokens)
Batch API provides 50% price reduction for non-interactive workloads (24h SLA)
Auto-scaling patterns: Horizontal scaling (App Service/AKS), load balancing (APIM/Traffic Manager), queue-based buffering (Service Bus+Functions), PTU+PAYG hybrid
Rate limit management: TPM/RPM quotas, exponential backoff with jitter, multi-deployment, APIM for centralized throttling
Load testing: Establish baseline, simulate peak traffic, identify breaking points, long-running soak tests
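The "exponential backoff with jitter" item above can be sketched as a small retry wrapper. This is a generic illustration using the full-jitter variant; `RateLimitError` is a hypothetical stand-in for whatever exception a client library raises on HTTP 429, and the default delays are assumptions rather than recommended values.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for a client library's HTTP 429 error."""


def call_with_backoff(fn, max_retries=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep, rng=random.random):
    """Call fn(), retrying on RateLimitError with exponential backoff.

    The wait before retry k is drawn uniformly from
    [0, min(max_delay, base_delay * 2**k)), the "full jitter" variant.
    `sleep` and `rng` are injectable so the behavior can be tested.
    """
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries:
                raise  # retries exhausted, surface the error to the caller
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(rng() * cap)
```

Randomizing the wait spreads retries from many clients over time, so they do not all hit the quota again at the same moment. If the service returns a retry-after hint, honoring it would normally take precedence over the computed delay.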

For detailed implementation guidance, see specific files in references/performance-scalability/:

latency-optimization-azure-openai.md — Latency tuning
auto-scaling-ai-infrastructure.md — Scaling patterns
rate-limit-management.md — TPM/RPM quota management
load-testing-ai-services.md — Load testing methodology

4. Reference catalog

Owned references

Directory	Files	Contents
`references/ai-security-engineering/`	17	Defense, testing, scoring, incident response, Zero Trust, STRIDE-AI, prompt injection, content safety
`references/cost-optimization/`	21	Cost modeling, FinOps, token optimization, PTU/PAYG, caching, right-sizing, SLM economics
`references/performance-scalability/`	18	Latency, scaling, streaming, batch API, rate limits, benchmarking, GPU sizing

Cross-references

Compliance/governance: skills/ms-ai-governance/references/responsible-ai/ (AI Act, bias, ethics) and references/norwegian-public-sector-governance/ (Digdir, NSM, Schrems II, DPIA)
Architecture: skills/ms-ai-advisor/references/architecture/ (security zones, architecture patterns, public-sector checklist)

5. MCP tools

Need	Tool	Use
Security documentation	`microsoft_docs_search`	Verify controls, check for updates
Complete guidance	`microsoft_docs_fetch`	Security baselines, configuration guides
Code samples	`microsoft_code_sample_search`	SDK for Content Safety, RBAC, Key Vault

Never trust the knowledge base blindly for prices and feature availability — verify via MCP tools.

6. Workflow

Security assessment

Map the solution's AI components and data flows
Score each of the 6 dimensions using rubrics from references
Calculate weighted risk score with appropriate weight profile
Map OWASP LLM Top 10 threats to the solution
Document findings with concrete remediation recommendations, prioritized by risk and cost

Cost estimation

Identify all Azure services in the solution
Estimate consumption per service (tokens, storage, traffic)
Fetch current prices via MCP tools
Calculate P10/P50/P90 per component, sum to TCO for 1/12/36 months
Present Budget/Recommended/Premium alternatives with FinOps opportunities

Performance review

Define performance requirements (latency, throughput, availability)
Identify bottlenecks and recommend optimizations from reference catalog
Estimate performance impact and propose monitoring/benchmarking setup
