Secure Model Deployment and Runtime Hardening
Category: AI Security Engineering
Date: 2026-02-05
Audience: Architects securing AI models in production environments
Introduction
Secure model deployment and runtime hardening protect AI models against threats across the whole deployment lifecycle, from container build to runtime execution. This document covers five critical security layers: container image scanning, runtime memory protection, resource exhaustion defense, model integrity verification, and secrets management in deployment.
Without systematic hardening, AI deployments are exposed to supply-chain attacks, model tampering, resource exhaustion, and leakage of sensitive keys. Microsoft Azure offers a broad set of services for securing AI deployments: Azure Machine Learning, Azure Container Registry, Microsoft Defender, and Azure Key Vault.
Container Image Scanning
Why container scanning is critical
AI models are typically deployed as Docker containers. These containers can contain vulnerabilities in:
- Base OS images (Ubuntu, Alpine)
- Python packages and dependencies
- ML frameworks (PyTorch, TensorFlow, ONNX Runtime)
- System libraries and binaries
Microsoft cloud security benchmark (MCSB v2): AI-1.1 requires that all models pass formal approval with automated security validation, including hash verification and scanning for embedded backdoors.
Azure implementation
1. Microsoft Defender for Container Registry
Automatic scanning:
# Azure Policy configuration for container scanning
{
"properties": {
"displayName": "Container images should be scanned for vulnerabilities",
"policyType": "BuiltIn",
"mode": "All",
"description": "Enables Microsoft Defender vulnerability scanning for Azure Container Registry",
"parameters": {
"effect": {
"allowedValues": ["AuditIfNotExists", "Disabled"],
"defaultValue": "AuditIfNotExists"
}
}
}
}
Capabilities:
- Automatic scanning of all images pushed to Azure Container Registry
- Identifies CVEs in OS packages and application dependencies
- Generates vulnerability assessment reports, available through Microsoft Defender for Cloud (formerly Azure Security Center)
- Continuous re-scanning of existing images as new CVEs are discovered
2. Azure Machine Learning Image Management
Microsoft-managed base images:
- Azure Machine Learning releases updated base images every 14 days
- Commitment: no vulnerabilities older than 30 days in the :latest tag
- Immutable tags for each version (mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu22.04:20260115)
Image update strategy:
from azure.ai.ml.entities import Environment
# Use the latest tag for automatic security patches
env = Environment(
name="secure-training-env",
image="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu22.04:latest",
conda_file="conda-deps.yaml"
)
# OR: pin to a specific version for reproducibility
env_pinned = Environment(
name="reproducible-env",
image="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu22.04:20260115",
conda_file="conda-deps.yaml"
)
Trade-off:
- :latest → Maximum security, reduced reproducibility
- Pinned version → Reproducibility, but requires manual updates
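The trade-off can be enforced mechanically in a pipeline. The sketch below classifies an image reference as pinned by digest, pinned by tag, or floating. The helper name `pinning_level` is hypothetical and not part of the Azure ML SDK; it only inspects the reference string and does not contact a registry.

```python
import re

# Hypothetical helper (not part of any Azure SDK): classify how strictly an
# image reference is pinned so a pipeline can reject floating tags.
_DIGEST = re.compile(r"@sha256:[0-9a-f]{64}$")


def pinning_level(image_ref: str) -> str:
    """Return 'digest', 'tag', or 'floating' for a container image reference."""
    if _DIGEST.search(image_ref):
        return "digest"
    # Look for a tag only in the last path segment, so a registry port such
    # as localhost:5000/img is not mistaken for a tag.
    last_segment = image_ref.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return "floating"  # no tag means an implicit :latest
    tag = last_segment.rsplit(":", 1)[1]
    return "floating" if tag == "latest" else "tag"
```

A production policy might accept `tag` for training environments and require `digest` for inference images; that split is a design choice, not something the source prescribes.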
3. Custom Image Scanning Workflow
Pre-deployment validation:
# Trivy scanning in the CI/CD pipeline
az acr login --name myregistry
# Build and push image
docker build -t myregistry.azurecr.io/mymodel:v1.0 .
docker push myregistry.azurecr.io/mymodel:v1.0
# Scan with Trivy (open-source vulnerability scanner)
trivy image myregistry.azurecr.io/mymodel:v1.0 \
--severity HIGH,CRITICAL \
--exit-code 1 # Fail the pipeline if vulnerabilities are found
Azure DevOps integration:
# azure-pipelines.yml
- task: AzureCLI@2
displayName: 'Scan container image'
inputs:
azureSubscription: 'MyAzureSubscription'
scriptType: 'bash'
scriptLocation: 'inlineScript'
inlineScript: |
# Install Trivy
wget -qO - https://aquasecurity.github.io/trivy-repo/deb/public.key | sudo apt-key add -
echo "deb https://aquasecurity.github.io/trivy-repo/deb $(lsb_release -sc) main" | sudo tee -a /etc/apt/sources.list.d/trivy.list
sudo apt-get update
sudo apt-get install -y trivy
# Scan image
trivy image $(containerRegistry)/$(imageName):$(imageTag) \
--format json \
--output trivy-results.json \
--severity CRITICAL,HIGH
# Publish results
cat trivy-results.json
- task: PublishBuildArtifacts@1
inputs:
pathToPublish: 'trivy-results.json'
artifactName: 'vulnerability-scan'
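The pipeline above publishes `trivy-results.json` but does not act on it. A minimal sketch of a gate that reads the report and decides whether to block is shown here. It assumes the Trivy JSON layout with a top-level `Results` list whose entries carry an optional `Vulnerabilities` list; the function names are hypothetical.

```python
from collections import Counter

# Severities that should stop the pipeline (same set as the scan above).
BLOCKING = {"HIGH", "CRITICAL"}


def count_findings(report: dict) -> Counter:
    """Count vulnerabilities per severity in a parsed Trivy JSON report."""
    counts = Counter()
    # "Results" and "Vulnerabilities" may be missing or null when nothing
    # was found, so fall back to empty lists.
    for result in report.get("Results") or []:
        for vuln in result.get("Vulnerabilities") or []:
            counts[vuln.get("Severity", "UNKNOWN")] += 1
    return counts


def should_block(report: dict) -> bool:
    """True when the report contains at least one blocking finding."""
    counts = count_findings(report)
    return any(counts[severity] > 0 for severity in BLOCKING)
```

In a pipeline step, the report would be loaded with `json.load(open("trivy-results.json"))` and the step would exit non-zero when `should_block` returns `True`.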
4. Approved Model Registry Enforcement
Azure Policy for model approval:
{
"policyDefinitionId": "/providers/Microsoft.Authorization/policyDefinitions/model-approval",
"parameters": {
"effect": { "value": "Deny" },
"allowedPublishers": {
"value": ["Microsoft", "MyOrganization"]
},
"approvedAssetIds": {
"value": [
"azureml://registries/myorg/models/bert-base/versions/1",
"azureml://registries/myorg/models/gpt-neo/versions/2"
]
}
},
"scope": "/subscriptions/{subscription-id}/resourceGroups/{rg-name}"
}
This blocks deployment of models that are not pre-approved in the centralized model registry.
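The same allow-list can also be checked early in a deployment script, before the request ever reaches Azure Policy. The following is only a sketch: `APPROVED_ASSET_IDS` mirrors the example IDs above, and `is_approved` is a hypothetical helper. Matching is exact, so a new version of an approved model is rejected until it is added to the list.

```python
# Example allow-list copied from the policy parameters above.
APPROVED_ASSET_IDS = frozenset({
    "azureml://registries/myorg/models/bert-base/versions/1",
    "azureml://registries/myorg/models/gpt-neo/versions/2",
})


def is_approved(asset_id: str, approved=APPROVED_ASSET_IDS) -> bool:
    """Exact-match check: the name AND the version must be on the list."""
    return asset_id.strip() in approved
```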
Scanning frequency
| Compute Type | Scan Timing | Update Frequency |
|---|---|---|
| Compute Instance | At provisioning | Manual re-create (monthly) |
| Compute Cluster | At scale-up from 0 nodes | Automatic when min_nodes=0 |
| Managed Online Endpoint | At deployment | Automatic (monthly) |
| Kubernetes (AKS) | At amlarc extension upgrade | Manual or auto-upgrade |
Runtime Memory Protection
Threat landscape
Runtime attacks against AI models include:
- Model extraction: Reverse engineering of model weights via the inference API
- Data poisoning attacks: Injection of malicious data at runtime
- Side-channel attacks: Leakage of sensitive information via timing or memory access patterns
Azure Confidential Computing
1. Confidential Containers on ACI
Hardware-based Trusted Execution Environments (TEE):
from azure.ai.ml.entities import ManagedOnlineDeployment
# Deploy the model in a confidential container (model and env are assumed to be defined earlier)
deployment = ManagedOnlineDeployment(
name="confidential-inference",
endpoint_name="secure-endpoint",
model=model,
environment=env,
instance_type="Standard_DC4s_v3", # Confidential VM size
instance_count=1,
# Confidential computing enforcement policy
environment_variables={
"CONFIDENTIAL_COMPUTING": "enabled",
"ATTESTATION_ENDPOINT": "https://myattestation.attest.azure.net"
}
)
Key capabilities:
- Memory encryption: All model data and inference data is encrypted in memory (AMD SEV-SNP or Intel TDX)
- Remote attestation: Verifies that the code runs in a legitimate TEE before secrets are released
- Data clean rooms: Multi-party ML training without any party seeing the others' raw data
2. Confidential Computing Enforcement (CCE) Policies
Azure CLI confcom extension:
# Generate CCE policy from ARM template
az confcom acipolicygen \
--input arm-template.json \
--output-type base64 \
--print-policy
# Output: Base64-encoded policy that enforces which containers may run
CCE policy example:
{
"version": "1.0",
"containers": {
"allow": [
{
"image": "myregistry.azurecr.io/mymodel:v1.0@sha256:abc123...",
"command": ["python", "score.py"],
"env_rules": [
{ "name": "MODEL_PATH", "pattern": "^/models/.*$" }
]
}
]
},
"enforcement": "block"
}
This ensures that ONLY approved containers with specific SHA256 hashes can run, and blocks runtime code injection.
3. Secure Key Release Sidecar
Attestation-based secrets access:
# Container group with secure key release
apiVersion: '2021-09-01'
location: westeurope
properties:
containers:
- name: inference-container
properties:
image: myregistry.azurecr.io/mymodel:v1.0
resources:
requests:
cpu: 2
memoryInGB: 4
volumeMounts:
- name: model-volume
mountPath: /models
readOnly: true
- name: skr-sidecar
properties:
image: mcr.microsoft.com/aci/skr:latest
environmentVariables:
- name: AKV_ENDPOINT
value: https://myvault.vault.azure.net
- name: KEY_NAME
value: model-encryption-key
- name: ATTESTATION_ENDPOINT
value: https://myattestation.attest.azure.net
confidentialComputeProperties:
ccePolicy: <base64-policy>
volumes:
- name: model-volume
azureFile:
shareName: encrypted-models
storageAccountName: mystorageaccount
Flow:
- The SKR sidecar generates a hardware attestation report
- It sends the report to the Azure Attestation service
- It receives an attestation token if the environment is trusted
- It uses the token to release the encryption key from Azure Key Vault
- It decrypts the model files in memory (never written to disk)
Memory Isolation Techniques
Trusted Launch VMs for Azure ML Compute:
from azure.ai.ml.entities import AmlCompute
compute = AmlCompute(
name="secure-cluster",
size="Standard_DC4s_v3", # Confidential VM
min_instances=0,
max_instances=4,
# Trusted Launch features
security_profile={
"secure_boot": True,
"vtpm": True,
"encryption_at_host": True
}
)
ml_client.compute.begin_create_or_update(compute)
Benefits:
- Secure Boot: Verifies that only trusted boot components are loaded
- vTPM (Virtual Trusted Platform Module): Measures boot integrity
- Encryption at host: Temp disks and OS cache are encrypted
Resource Exhaustion Defense
Attack scenarios
- Model DoS: Adversarial inputs designed to trigger extreme compute costs
- Token flooding: Overwhelming the inference endpoint with massive request volumes
- Memory bombs: Inputs that cause OOM (Out of Memory) crashes
Azure implementation
1. API Management Rate Limiting
Token-level quota enforcement:
<!-- Azure APIM policy -->
<policies>
<inbound>
<base />
<!-- Rate limit per subscription -->
<rate-limit-by-key calls="100" renewal-period="60"
counter-key="@(context.Subscription.Id)" />
<!-- Token quota for generative AI -->
<quota-by-key calls="1000000"
renewal-period="86400"
counter-key="@(context.Subscription.Id)"
increment-count="@{
var tokens = context.Variables.GetValueOrDefault<int>("response-tokens", 0);
return tokens;
}" />
<!-- Request timeout -->
<timeout timeout-ms="30000" />
</inbound>
<outbound>
<base />
<!-- Extract token count from response -->
<set-variable name="response-tokens"
value="@(context.Response.Body.As<JObject>()?["usage"]?["total_tokens"]?.Value<int>() ?? 0)" />
</outbound>
</policies>
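To make the `calls` / `renewal-period` / `counter-key` semantics concrete, here is a small in-process counter with the same three parameters. It is a simplified fixed-window sketch for illustration and is not the algorithm API Management itself uses; the class name and the injectable `clock` argument are assumptions made for testability.

```python
import time


class FixedWindowLimiter:
    """Allow at most `calls` requests per `renewal_period` seconds per key."""

    def __init__(self, calls: int, renewal_period: float, clock=time.monotonic):
        self.calls = calls
        self.renewal_period = renewal_period
        self._clock = clock
        self._windows = {}  # key -> (window_start, count)

    def allow(self, key: str) -> bool:
        now = self._clock()
        start, count = self._windows.get(key, (now, 0))
        # Start a fresh window once the renewal period has elapsed.
        if now - start >= self.renewal_period:
            start, count = now, 0
        if count >= self.calls:
            self._windows[key] = (start, count)
            return False  # caller should answer with HTTP 429
        self._windows[key] = (start, count + 1)
        return True
```

`FixedWindowLimiter(calls=100, renewal_period=60)` corresponds to the per-subscription rate limit in the policy above, with the subscription ID passed as `key`.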
2. Azure Machine Learning Endpoint Quotas
Instance auto-scaling with caps:
from azure.ai.ml.entities import ManagedOnlineDeployment, OnlineRequestSettings
deployment = ManagedOnlineDeployment(
name="blue",
endpoint_name="my-endpoint",
model=model,
instance_type="Standard_DS3_v2",
instance_count=1,
# Request settings
request_settings=OnlineRequestSettings(
request_timeout_ms=30000, # 30s timeout
max_concurrent_requests_per_instance=10,
max_queue_wait_ms=5000
),
# Auto-scaling
scale_settings={
"scale_type": "target_utilization",
"min_instances": 1,
"max_instances": 10,
"target_utilization_percentage": 70
}
)
Resource limits per instance:
# Kubernetes deployment with resource limits
apiVersion: apps/v1
kind: Deployment
metadata:
name: model-inference
spec:
replicas: 3
template:
spec:
containers:
- name: inference
image: myregistry.azurecr.io/mymodel:v1.0
resources:
requests:
cpu: "2"
memory: "4Gi"
limits:
cpu: "4"
memory: "8Gi"
# Readiness probe to prevent traffic during startup
readinessProbe:
httpGet:
path: /health
port: 8080
initialDelaySeconds: 30
periodSeconds: 10
3. Input Validation and Size Limits
Pre-inference validation:
# score.py in an Azure ML deployment
import logging
import json
def init():
global model
global MAX_INPUT_SIZE
MAX_INPUT_SIZE = 1024 * 1024 # 1 MB limit
model = load_model()
def run(raw_data):
try:
# Size validation
if len(raw_data) > MAX_INPUT_SIZE:
return json.dumps({
"error": "Input exceeds maximum size limit",
"max_size_bytes": MAX_INPUT_SIZE
}), 413 # Payload Too Large
data = json.loads(raw_data)
# Input shape validation
if "input" not in data:
return json.dumps({"error": "Missing 'input' field"}), 400
input_data = data["input"]
if not isinstance(input_data, list):
return json.dumps({"error": "Input must be a list"}), 400
if len(input_data) > 1000: # Max batch size
return json.dumps({
"error": "Batch size exceeds limit",
"max_batch_size": 1000
}), 400
# Inference
result = model.predict(input_data)
return json.dumps({"predictions": result.tolist()})
except json.JSONDecodeError:
return json.dumps({"error": "Invalid JSON"}), 400
except Exception as e:
logging.error(f"Inference error: {str(e)}")
return json.dumps({"error": "Internal server error"}), 500
4. Circuit Breaker Pattern
Polly implementation (C#) or tenacity (Python):
import logging

import requests
from tenacity import retry, retry_if_not_exception_type, stop_after_attempt, wait_exponential


class CircuitOpenError(Exception):
    """Raised when the circuit breaker is open and calls are rejected."""


class ModelClient:
    def __init__(self, endpoint_url, api_key):
        self.endpoint_url = endpoint_url
        self.api_key = api_key
        self.failure_count = 0
        self.circuit_open = False

    @retry(
        # Do not retry when the circuit is open; that would only add load
        retry=retry_if_not_exception_type(CircuitOpenError),
        stop=stop_after_attempt(3),
        wait=wait_exponential(multiplier=1, min=2, max=10),
        reraise=True,
    )
    def predict(self, data):
        if self.circuit_open:
            raise CircuitOpenError("Circuit breaker is open")
        try:
            response = requests.post(
                self.endpoint_url,
                headers={"Authorization": f"Bearer {self.api_key}"},
                json=data,
                timeout=30,
            )
            response.raise_for_status()
            # Reset failure count on success
            self.failure_count = 0
            return response.json()
        except Exception:
            self.failure_count += 1
            # Open circuit after 5 failures
            if self.failure_count >= 5:
                self.circuit_open = True
                logging.error("Circuit breaker opened due to repeated failures")
            raise
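The client above opens the circuit but never closes it again, so it stays open until the process restarts. A common extension is a cool-down after which one trial call is let through (the half-open state). The sketch below shows that state machine with the standard library only; the class, its parameters, and the injectable `clock` are hypothetical names chosen for illustration.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised while the breaker is open and calls are rejected."""


class CircuitBreaker:
    def __init__(self, failure_threshold=5, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    @property
    def is_open(self) -> bool:
        return self._opened_at is not None

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit is open; failing fast")
            # Cool-down elapsed: half-open, let this one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and clears the failure history.
        self._failures = 0
        self._opened_at = None
        return result
```

Usage would wrap the HTTP call, for example `breaker.call(requests.post, url, json=data, timeout=30)`.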
Model Integrity Verification
Digital Signatures and Hash Verification
Azure ML Model Registry with provenance tracking:
from azure.ai.ml.entities import Model
from azure.ai.ml import MLClient
import hashlib
def register_model_with_hash(ml_client: MLClient, model_path: str, model_name: str):
# Calculate SHA256 hash
sha256_hash = hashlib.sha256()
with open(model_path, "rb") as f:
for byte_block in iter(lambda: f.read(4096), b""):
sha256_hash.update(byte_block)
file_hash = sha256_hash.hexdigest()
# Register with metadata
model = Model(
path=model_path,
name=model_name,
description="Production model with integrity verification",
tags={
"sha256": file_hash,
"signed_by": "security-team@example.com",
"approval_date": "2026-02-05",
"training_run_id": "run-123456"
},
properties={
"framework": "pytorch",
"framework_version": "2.1.0",
"training_dataset": "secure-dataset-v1"
}
)
registered_model = ml_client.models.create_or_update(model)
print(f"Model registered with hash: {file_hash}")
return registered_model
import os

def verify_model_integrity(ml_client: MLClient, model_name: str, model_version: str, artifact_name: str):
    # Fetch model metadata
    model = ml_client.models.get(name=model_name, version=model_version)
    expected_hash = model.tags.get("sha256")
    if not expected_hash:
        raise ValueError("Model does not have integrity hash in metadata")
    # Download and verify. download() writes under download_path and does not
    # return the file path. The layout ./temp/<model_name>/<artifact_name> used
    # here is an assumption; check it against your SDK version.
    ml_client.models.download(name=model_name, version=model_version, download_path="./temp")
    model_path = os.path.join("./temp", model_name, artifact_name)
    sha256_hash = hashlib.sha256()
    with open(model_path, "rb") as f:
        for byte_block in iter(lambda: f.read(4096), b""):
            sha256_hash.update(byte_block)
    actual_hash = sha256_hash.hexdigest()
    if actual_hash != expected_hash:
        raise ValueError(f"Model integrity check failed! Expected {expected_hash}, got {actual_hash}")
    print(f"✓ Model integrity verified: {actual_hash}")
    return True
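The functions above hash a single file, while a downloaded model is often a folder with weights, config, and tokenizer files. One way to cover that case is a manifest of per-file hashes. The sketch below uses only the standard library; `build_manifest` and `verify_manifest` are hypothetical helpers, and where the manifest is stored (for example as a signed file or in model tags) is left open.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_file(path, chunk_size=1024 * 1024) -> str:
    """Stream a file through SHA-256 so large weights never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root) -> dict:
    """Map every file under root (relative POSIX path) to its SHA-256 hex digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(root, expected: dict) -> list:
    """Return a list of problems; an empty list means the folder matches."""
    actual = build_manifest(root)
    problems = []
    for name in sorted(set(expected) | set(actual)):
        exp, act = expected.get(name), actual.get(name)
        if exp is None:
            problems.append(f"unexpected file: {name}")
        elif act is None:
            problems.append(f"missing file: {name}")
        elif not hmac.compare_digest(exp, act):
            problems.append(f"hash mismatch: {name}")
    return problems
```

Reporting unexpected files matters as much as mismatches: an extra Python module dropped into the model folder would otherwise pass a per-file hash check.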
Model Signing with Azure Key Vault
Sign model artifacts:
# Generate signing key in Azure Key Vault
az keyvault key create \
--vault-name myvault \
--name model-signing-key \
--kty RSA \
--size 4096 \
--ops sign verify
# Sign model file
az keyvault key sign \
--vault-name myvault \
--name model-signing-key \
--algorithm RS256 \
--value $(cat model.pkl | base64 -w 0) \
--output json > model.pkl.sig
Verify signature at deployment:
from azure.keyvault.keys.crypto import CryptographyClient, SignatureAlgorithm
from azure.identity import DefaultAzureCredential
import base64
import hashlib

def verify_model_signature(model_path: str, signature_path: str, key_vault_url: str, key_name: str):
    credential = DefaultAzureCredential()
    # Hash the model file: RS256 verifies a SHA-256 digest, not the raw file,
    # so the signing step must have signed this same digest
    with open(model_path, "rb") as f:
        model_digest = hashlib.sha256(f.read()).digest()
    # Read signature (assumes the file holds only the base64-encoded signature)
    with open(signature_path, "r") as f:
        signature_b64 = f.read().strip()
    signature = base64.b64decode(signature_b64)
    # Verify with Key Vault
    crypto_client = CryptographyClient(
        key=f"{key_vault_url}/keys/{key_name}",
        credential=credential
    )
    result = crypto_client.verify(
        algorithm=SignatureAlgorithm.rs256,
        digest=model_digest,
        signature=signature
    )
    if result.is_valid:
        print("✓ Model signature verified")
        return True
    else:
        raise ValueError("Model signature verification failed!")
Model Drift Monitoring (Indirect Integrity Check)
Azure Monitor custom metrics:
import logging
from azure.monitor.opentelemetry import configure_azure_monitor
from opentelemetry import metrics
import numpy as np
configure_azure_monitor(
connection_string="InstrumentationKey=xxx;IngestionEndpoint=https://xxx.in.applicationinsights.azure.com/"
)
meter = metrics.get_meter_provider().get_meter("model-monitoring")
accuracy_gauge = meter.create_gauge(
name="model.accuracy",
description="Model prediction accuracy",
unit="percent"
)
def monitor_inference(predictions, ground_truth):
# Calculate accuracy
accuracy = np.mean(predictions == ground_truth) * 100
# Record metric
accuracy_gauge.set(accuracy, {"model": "prod-model-v1"})
# Anomaly detection: alert if accuracy drops > 10%
if accuracy < 85.0: # Baseline accuracy = 95%
logging.warning(f"Model accuracy degraded to {accuracy}%")
# Trigger alert via Azure Monitor
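Accuracy computed on a single batch can be noisy, so a single bad batch may trip the 85% threshold. A rolling window smooths this out. The sketch below is one possible approach with assumed defaults (baseline 95%, allowed drop of 10 points, matching the comments above); the class name and the `min_samples` guard are hypothetical.

```python
from collections import deque
from typing import Optional


class RollingAccuracy:
    """Track accuracy over the most recent `window` labelled predictions."""

    def __init__(self, window=500, baseline=95.0, max_drop=10.0, min_samples=50):
        self._hits = deque(maxlen=window)  # 1 for correct, 0 for wrong
        self.baseline = baseline
        self.max_drop = max_drop
        self.min_samples = min_samples

    def update(self, prediction, truth) -> None:
        self._hits.append(1 if prediction == truth else 0)

    @property
    def accuracy(self) -> Optional[float]:
        # Report nothing until there is enough data to be meaningful.
        if len(self._hits) < self.min_samples:
            return None
        return 100.0 * sum(self._hits) / len(self._hits)

    def degraded(self) -> bool:
        acc = self.accuracy
        return acc is not None and acc < self.baseline - self.max_drop
```

The value returned by `accuracy` can be passed to the gauge shown above in place of the per-batch figure.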
Azure Monitor alert rule:
{
"name": "ModelDriftAlert",
"properties": {
"description": "Alert when model accuracy drops significantly",
"severity": 2,
"enabled": true,
"scopes": ["/subscriptions/{sub-id}/resourceGroups/{rg}/providers/Microsoft.Insights/components/{app-insights}"],
"criteria": {
"allOf": [
{
"metricName": "model.accuracy",
"operator": "LessThan",
"threshold": 85,
"timeAggregation": "Average"
}
]
},
"actions": [
{
"actionGroupId": "/subscriptions/{sub-id}/resourceGroups/{rg}/providers/Microsoft.Insights/actionGroups/security-team"
}
]
}
}
Secrets Management in Deployment
Problem Statement
AI deployments require access to:
- Model artifacts: Encrypted model files
- Data sources: Database connection strings, API keys
- External services: Azure Storage, Azure Cognitive Services
- Inference credentials: OAuth tokens, service principals
Anti-pattern: Hardcoded secrets in Docker images or environment variables.
Azure Key Vault Integration
1. Managed Identity for Deployments
System-assigned managed identity:
from azure.ai.ml.entities import ManagedOnlineEndpoint, IdentityConfiguration, ManagedIdentityConfiguration
# Create endpoint with system-assigned identity
endpoint = ManagedOnlineEndpoint(
name="secure-endpoint",
auth_mode="key",
identity=IdentityConfiguration(
type="system_assigned"
)
)
ml_client.online_endpoints.begin_create_or_update(endpoint).result()
# Grant Key Vault access
# (done via the Azure Portal or CLI)
# az keyvault set-policy \
# --name myvault \
# --object-id <endpoint-identity-object-id> \
# --secret-permissions get list
User-assigned managed identity:
# Create the user-assigned identity first
from azure.mgmt.msi import ManagedServiceIdentityClient
msi_client = ManagedServiceIdentityClient(credential, subscription_id)
identity = msi_client.user_assigned_identities.create_or_update(
resource_group_name="my-rg",
resource_name="ml-deployment-identity",
parameters={
"location": "westeurope"
}
)
# Use it in the endpoint
endpoint = ManagedOnlineEndpoint(
name="secure-endpoint",
auth_mode="key",
identity=IdentityConfiguration(
type="user_assigned",
user_assigned_identities=[
ManagedIdentityConfiguration(
resource_id=identity.id
)
]
)
)
2. Key Vault References in Scoring Script
score.py with Key Vault integration:
from azure.identity import DefaultAzureCredential, ManagedIdentityCredential
from azure.keyvault.secrets import SecretClient
import os
def init():
global model
global db_connection_string
# Use managed identity to access Key Vault
key_vault_name = os.environ["KEY_VAULT_NAME"]
key_vault_url = f"https://{key_vault_name}.vault.azure.net"
# DefaultAzureCredential automatically uses the managed identity when running in Azure
credential = DefaultAzureCredential()
secret_client = SecretClient(vault_url=key_vault_url, credential=credential)
# Retrieve secrets
db_connection_string = secret_client.get_secret("db-connection-string").value
storage_key = secret_client.get_secret("storage-account-key").value
# Load model from encrypted storage
from azure.storage.blob import BlobServiceClient
blob_client = BlobServiceClient(
account_url=f"https://{os.environ['STORAGE_ACCOUNT']}.blob.core.windows.net",
credential=storage_key
)
blob = blob_client.get_blob_client(container="models", blob="production-model.pkl")
model_bytes = blob.download_blob().readall()
import pickle
model = pickle.loads(model_bytes)
print("Model loaded successfully with secure secrets")
def run(raw_data):
import json
data = json.loads(raw_data)
# Use db_connection_string for feature lookup (example)
# predictions = model.predict(data)
return json.dumps({"status": "ok"})
3. Key Vault Secret Rotation
Automatic rotation with Azure Functions:
import azure.functions as func
from azure.keyvault.secrets import SecretClient
from azure.identity import DefaultAzureCredential
import datetime
import secrets
import string

def main(mytimer: func.TimerRequest) -> None:
    key_vault_url = "https://myvault.vault.azure.net"
    credential = DefaultAzureCredential()
    secret_client = SecretClient(vault_url=key_vault_url, credential=credential)
    # Generate new API key with a cryptographically secure generator
    # (the random module is not suitable for secrets)
    alphabet = string.ascii_letters + string.digits
    new_api_key = ''.join(secrets.choice(alphabet) for _ in range(32))
    # Store as a new secret version (the old version is retained)
    secret_client.set_secret("inference-api-key", new_api_key)
    # Trigger deployment restart to pick up the new secret
    # (implemented via Azure ML SDK or REST API)
    rotated_at = datetime.datetime.now(datetime.timezone.utc).isoformat()
    print(f"Secret rotated successfully at {rotated_at} (past due: {mytimer.past_due})")
Function app timer trigger:
{
"bindings": [
{
"name": "mytimer",
"type": "timerTrigger",
"direction": "in",
"schedule": "0 0 0 1 * *"
}
]
}
This rotates secrets at midnight on the 1st day of every month.
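The schedule string uses the six-field NCRONTAB form, where the first field is seconds. The small helper below labels each field of such an expression so that a schedule can be reviewed at a glance. It is a hypothetical illustration that only splits and names the fields; it does not validate ranges or compute run times.

```python
# Field order of the six-field NCRONTAB expression used in the binding above.
NCRONTAB_FIELDS = ("second", "minute", "hour", "day", "month", "day_of_week")


def describe_schedule(expression: str) -> dict:
    """Label each field of a six-field NCRONTAB expression."""
    parts = expression.split()
    if len(parts) != len(NCRONTAB_FIELDS):
        raise ValueError(f"expected {len(NCRONTAB_FIELDS)} fields, got {len(parts)}")
    return dict(zip(NCRONTAB_FIELDS, parts))
```

For `"0 0 0 1 * *"` this yields second 0, minute 0, hour 0, day 1, and wildcards for month and day of week.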
4. Azure App Configuration for Non-Secret Settings
Separate configuration from secrets:
from azure.appconfiguration import AzureAppConfigurationClient
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient
# Configuration (non-sensitive)
config_client = AzureAppConfigurationClient(
base_url="https://myappconfig.azconfig.io",
credential=DefaultAzureCredential()
)
model_version = config_client.get_configuration_setting(key="model.version").value
batch_size = int(config_client.get_configuration_setting(key="inference.batch_size").value)
# Secrets (sensitive)
secret_client = SecretClient(
vault_url="https://myvault.vault.azure.net",
credential=DefaultAzureCredential()
)
api_key = secret_client.get_secret("external-api-key").value
Benefits:
- Configuration can be cached and shared openly
- Secrets stay in Key Vault under strict access control
- Feature flags and A/B testing without secrets exposure
Security Checklist for Deployment
| Control | Description | Azure Service |
|---|---|---|
| Container Scanning | All images scanned for CVE vulnerabilities | Microsoft Defender for Container Registry |
| Image Approval | Only approved images can be deployed | Azure Policy + ML Model Registry |
| Runtime Isolation | Models run in isolated memory spaces | Azure Confidential Computing (TEE) |
| Resource Limits | CPU/memory caps + request timeouts | Azure ML Request Settings |
| Rate Limiting | Token quotas og request throttling | Azure API Management |
| Model Integrity | SHA256 hashes + digital signatures | Azure Key Vault + ML Model Registry |
| Secrets Management | Zero hardcoded secrets, managed identities | Azure Key Vault + Managed Identity |
| Monitoring | Model drift + resource exhaustion alerts | Azure Monitor + Application Insights |
| Network Isolation | Private endpoints + VNet integration | Azure Virtual Network + Private Link |
| Access Control | RBAC + MFA for deployment pipelines | Microsoft Entra ID |
Best Practices: Deployment Hardening Workflow
graph TD
A[Model Training Complete] --> B[Container Build]
B --> C{Trivy Scan Pass?}
C -->|No| D[Fix Vulnerabilities]
D --> B
C -->|Yes| E[Push to ACR]
E --> F[Microsoft Defender Scan]
F --> G{Vulnerabilities Found?}
G -->|Yes| H[Security Review]
H --> I{Approved?}
I -->|No| D
I -->|Yes| J[Register Model]
G -->|No| J
J --> K[Calculate SHA256 Hash]
K --> L[Sign with Key Vault]
L --> M[Deploy to Staging]
M --> N[Load Test + Resource Monitoring]
N --> O{Performance OK?}
O -->|No| P[Tune Resource Limits]
P --> M
O -->|Yes| Q[Production Deployment]
Q --> R[Enable Monitoring Alerts]
R --> S[Continuous Drift Detection]
For Cosmo
When discussing secure model deployment with customers:
1. Start with risk mapping:
   - "Which models are production-critical?"
   - "Do you handle sensitive data (personal data, health information)?"
   - "What is the consequence of model downtime or data leakage?"
2. Prioritize based on threat profile:
   - High risk: Confidential computing + full scanning + signed models
   - Medium risk: Standard scanning + Key Vault + monitoring
   - Low risk: Basic security controls + automated updates
3. Implement in phases:
   - Phase 1: Container scanning + Key Vault migration (quick wins)
   - Phase 2: Resource limits + rate limiting + monitoring
   - Phase 3: Model signing + integrity verification
   - Phase 4: Confidential computing for sensitive workloads
4. Specific to the Norwegian public sector:
   - GDPR Art. 32: "Appropriate technical measures" → Container scanning + encryption
   - NSM Grunnprinsipper: Defense in depth → Layered security (scanning + runtime + secrets)
   - Sikkerhetsloven § 3-1: Risk assessment → Mandatory threat modeling before deployment
5. Cost-benefit balance:
   - Confidential computing costs 30-50% more than standard VMs
   - But: it substantially reduces the risk of memory-based model extraction
   - Recommendation: Use it only for models with high IP value or PII data
6. Automation is key:
   - Manual security checks do not scale
   - CI/CD integration with automated scanning = continuous security
   - Azure DevOps pipelines with security gates = enforced compliance
Red flags to look for:
- "We hardcode API keys in Docker images" → CRITICAL, fix ASAP
- "We use the latest tag without pinning" → Medium risk, weigh the trade-offs
- "We have never scanned our containers" → Start with Trivy today
- "We run production without resource limits" → Vulnerable to DoS, set caps now
Useful questions:
- "How do you verify that the model in prod is the one that was approved?"
- "What happens if someone injects malicious code into the inference container?"
- "Where are API keys for external services stored?"
- "How quickly can you detect a model extraction attack?"
Success metrics:
- Zero hardcoded secrets in repositories
- 100% of images scanned before deployment
- Model integrity verification in all environments
- Resource exhaustion alerts configured
- Mean time to detect (MTTD) security incidents < 5 minutes
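The MTTD target is only useful if it is measured the same way each time. A minimal sketch of the calculation, assuming each incident is recorded as a pair of timestamps (when it occurred, when it was detected); the function name is hypothetical and any sample data fed to it would be illustrative, not real measurements.

```python
from datetime import datetime, timedelta


def mean_time_to_detect(incidents) -> timedelta:
    """Average of (detected - occurred) over a list of (occurred, detected) pairs."""
    if not incidents:
        raise ValueError("at least one incident is required")
    total = sum((detected - occurred for occurred, detected in incidents), timedelta())
    return total / len(incidents)
```

Comparing the result against `timedelta(minutes=5)` gives a pass or fail signal for the last success metric.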