Supply Chain Security for AI Models and Dependencies
Category: AI Security Engineering
Date: 2026-02-05
Related platforms: Azure AI Foundry, Azure Machine Learning, Azure DevOps, Microsoft Defender for Cloud
Overview
Supply chain security for AI models is about securing the integrity and authenticity of AI components throughout the entire lifecycle, from training data and pre-trained models to dependencies and deployment artifacts. Unlike traditional software supply chain security, AI systems must also protect model weights, datasets, and ML-specific components against compromise.
Attacks on the AI supply chain can introduce backdoors into models, poison training data, or exfiltrate sensitive information via model inference. The Microsoft Azure Security Benchmark classifies this under AI-1: Ensure use of approved models as a "must have" control.
Unique challenges for the AI supply chain
- Model provenance: Models are downloaded from public repositories (HuggingFace, Model Zoo) without verification
- Data poisoning: Training data from untrusted sources can contain malicious content
- Transitive dependencies: Python packages (PyTorch, TensorFlow) have deep dependency trees
- Immutable artifacts: Compiled models (ONNX, MLflow) are difficult to inspect for backdoors
- Third-party MLaaS: Outsourcing training to third-party providers introduces trust risk
1. Model Provenance Tracking
What is model provenance?
Model provenance is the end-to-end traceability of a model's origin, training process, and modifications. It includes:
- Dataset lineage: Which data was used for training?
- Training job metadata: Hyperparameters, compute resources, timestamps
- Model registry history: Versioning, approvals, deployment records
- Audit trails: Who registered, approved, or deployed the model?
Implementation in Azure Machine Learning
The Azure Machine Learning Model Registry serves as the single source of truth:
from azure.ai.ml import MLClient
from azure.ai.ml.entities import Model
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
    workspace_name="<workspace-name>",
)

# Register the model with provenance metadata
model = Model(
    path="./model",
    name="fraud-detection-v2",
    version="2.0",
    description="Trained on 2025-Q4 dataset",
    tags={
        "training_job": "run_12345",
        "data_version": "v2.3",
        "approved_by": "security-team",
        "scan_status": "passed",
    },
    properties={
        "training_dataset_id": "azureml:fraud-data:2",
        "validation_accuracy": "0.94",
    },
)
ml_client.models.create_or_update(model)
Best practices
- Hash verification: Store a SHA-256 hash of the model weights at registration
- Immutable tags: Use tags that cannot be overwritten (created_date, git_commit)
- Signed models: Implement code signing for model artifacts
- Centralized registry: Use Azure ML registries across subscriptions/workspaces
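The hash and tag practices can be sketched as follows. This is a minimal illustration, assuming the model is a local directory of files; `hash_model_dir` and `provenance_tags` are hypothetical helper names, and the resulting dictionary would be passed as `tags=` when registering the model.

```python
import hashlib
from pathlib import Path


def hash_model_dir(model_dir: str, chunk_size: int = 1 << 20) -> str:
    """Return one SHA-256 digest covering every file under model_dir."""
    digest = hashlib.sha256()
    root = Path(model_dir)
    # Sort paths so the digest does not depend on filesystem ordering.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        # Mix in the relative path so renamed or moved files change the digest.
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        with path.open("rb") as handle:
            # Read in chunks so large weight files are not held in memory at once.
            for chunk in iter(lambda: handle.read(chunk_size), b""):
                digest.update(chunk)
    return digest.hexdigest()


def provenance_tags(model_dir: str, git_commit: str, created_date: str) -> dict:
    """Build the tag dictionary to attach when the model is registered."""
    return {
        "sha256": hash_model_dir(model_dir),
        "git_commit": git_commit,
        "created_date": created_date,
    }
```

Recomputing the digest at load time and comparing it with the stored tag detects any change to the weights or to the file layout.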
2. Dependency Vulnerability Scanning
Threat landscape
AI models depend on deep Python dependency trees (example: PyTorch → NumPy → BLAS). Vulnerabilities in these components can be exploited for:
- Remote code execution: Via malicious pickle files in model formats
- Data exfiltration: Compromised packages that send training data to an external endpoint
- Supply chain attacks: Typosquatting (pytorch vs. py-torch), package hijacking
MITRE ATT&CK classifies this as T1195: Supply Chain Compromise.
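To make the pickle risk concrete, the sketch below walks a pickle stream with the standard library's `pickletools` and lists opcodes that can import or call objects during loading, without ever unpickling the data. The opcode set and the function name `suspicious_pickle_ops` are illustrative assumptions, not a complete scanner.

```python
import io
import pickle
import pickletools

# Opcodes that can import or call arbitrary objects during unpickling.
SUSPICIOUS_OPCODES = {
    "GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX",
}


def suspicious_pickle_ops(data: bytes) -> list:
    """List (opcode, argument) pairs for risky opcodes without executing the pickle."""
    found = []
    for opcode, arg, _pos in pickletools.genops(io.BytesIO(data)):
        if opcode.name in SUSPICIOUS_OPCODES:
            found.append((opcode.name, arg))
    return found


# A pickle holding only plain data needs none of these opcodes.
print(suspicious_pickle_ops(pickle.dumps({"weights": [0.1, 0.2]})))  # prints []
```

Legitimate model checkpoints also use some of these opcodes to rebuild their objects, so a hit is a signal for review against an allow-list of expected imports rather than proof of tampering.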
Azure tools for scanning
1. Azure DevOps Dependency Scanning
Enabled via GitHub Advanced Security for Azure DevOps:
# azure-pipelines.yml
trigger:
  branches:
    include:
      - main

pool:
  vmImage: 'ubuntu-latest'

steps:
  - task: AdvancedSecurity-Dependency-Scanning@1
    displayName: 'Scan Python dependencies'
    inputs:
      # Input names as given in the source; verify them against the task reference
      scanMode: 'all'  # Scan both direct and transitive dependencies
      ecosystem: 'pip'
Dependency scanning generates alerts for:
- Direct vulnerabilities: Packages in requirements.txt
- Transitive vulnerabilities: Packages used by direct dependencies
- CVE severity mapping: Critical (CVSS 9.0-10.0), High (7.0-8.9), Medium (4.0-6.9), Low (0.1-3.9)
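The severity mapping can be expressed as a small function. This sketch uses the standard CVSS v3.x qualitative bands; `cvss_severity` and `blocking_findings` are hypothetical names, and the default blocking threshold of 7.0 is an assumption to adjust per policy.

```python
def cvss_severity(score: float) -> str:
    """Map a CVSS v3.x base score to its qualitative severity rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS score out of range: {score}")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


def blocking_findings(findings: dict, threshold: float = 7.0) -> list:
    """Return the identifiers whose score is at or above the blocking threshold."""
    return sorted(name for name, score in findings.items() if score >= threshold)
```

A pipeline step could fail the build whenever `blocking_findings` returns a non-empty list.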
2. Microsoft Defender for Containers
Scans container images (including Azure ML environments) for vulnerabilities:
from azure.ai.ml.entities import Environment

# Create an environment whose base image is scanned
# (ml_client is the MLClient created in the model registration example)
env = Environment(
    name="secure-training-env",
    image="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04",
    conda_file="conda_dependencies.yml",
    description="Environment with vulnerability scanning",
)
ml_client.environments.create_or_update(env)
Defender for Containers:
- Generates vulnerability assessments automatically when an image is pushed to Azure Container Registry
- Blocks deployment of images with critical vulnerabilities (configurable via Azure Policy)
- Integrates with Azure Monitor for alerting
3. Quarantine Pattern for Package Management
Implement self-serve package management with a security layer:
Data Scientist → Safe-listed repos (Microsoft Artifact Registry, PyPI, Conda)
↓
Automated testing (vulnerability scan)
↓
Pass → Container Registry
Fail → Deployment blocked, container removed
Process flow:
- Data scientists work in an Azure ML workspace with network restrictions
- They self-serve from curated package repositories
- Azure ML builds Docker containers during deployment
- Microsoft Defender for Containers scans the containers for vulnerabilities
- On failure: exit the deployment gracefully and remove the container
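The gate logic in this flow can be sketched in two parts: a check of requested package names against a curated allow-list (flagging near-misses as possible typosquats) and a pass/fail decision on scan findings. The allow-list contents, the function names, and the CVSS cut-off are all illustrative assumptions.

```python
import difflib

# Hypothetical allow-list; a real one would come from the curated repository.
ALLOWED_PACKAGES = ["numpy", "pytorch", "scikit-learn"]


def check_package(name: str, allowed=ALLOWED_PACKAGES) -> str:
    """Classify a requested package as allowed, a likely typosquat, or unknown."""
    normalized = name.strip().lower()
    if normalized in allowed:
        return "allowed"
    # A near-miss of an approved name is treated as a possible typosquat.
    close = difflib.get_close_matches(normalized, allowed, n=1, cutoff=0.8)
    return f"possible_typosquat:{close[0]}" if close else "unknown"


def gate_image(image: str, findings: list, max_cvss: float = 8.9) -> dict:
    """Promote an image only if no finding scores above max_cvss."""
    blocking = [cve for cve, score in findings if score > max_cvss]
    action = "block_and_remove" if blocking else "promote"
    return {"image": image, "action": action, "blocking": blocking}
```

With these definitions, `check_package("py-torch")` is flagged as a near-miss of `pytorch`, which matches the typosquatting example given earlier.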
3. Vendor Security Assessment
Evaluating third-party vendors
When you use pre-trained models or MLaaS providers:
| Assessment criterion | Question |
|---|---|
| Model provenance | Can the vendor document the training data and process? |
| Security practices | Do they hold SOC 2 Type II / ISO 27001 certification? |
| Data retention | Is your data used to train their models? |
| Compromise notification | Do they have an incident response plan and a disclosure policy? |
| Access controls | Can you revoke access quickly if compromise is suspected? |
| Contractual safeguards | Do they guarantee against the use of copyrighted material? |
Azure-specific providers
Microsoft offers verified models through:
- Azure Machine Learning Model Catalog: Curated models with security attestation
- HuggingFace Registry in Azure: Integrated with Azure ML, with provenance tracking
# Deploy a verified model from the Azure ML registry
from azure.ai.ml.entities import ManagedOnlineDeployment

registry_name = "azureml"
model_name = "gpt-35-turbo"
model_version = "0301"
model_id = f"azureml://registries/{registry_name}/models/{model_name}/versions/{model_version}"

deployment = ManagedOnlineDeployment(
    name="verified-deployment",
    endpoint_name="secure-endpoint",
    model=model_id,
    instance_type="Standard_DS3_v2",
    instance_count=1,
)
Red flags in vendor assessment
- ❌ Evasive about data sources ("proprietary dataset")
- ❌ No documentation of security scanning
- ❌ Missing API rate limiting (increases the risk of model stealing)
- ❌ Requires upload of sensitive training data without encryption guarantees
4. Model Poisoning Prevention
Attack vectors
Backdoor ML (MITRE ATLAS: AML.T0018):
- A malicious MLaaS provider trojans the model with a trigger that activates at deployment
- Example: The model classifies a virus as "benign" when a specific file name is included
Compromise Model Supply Chain (MITRE ATLAS: AML.T0010):
- An adversary uploads poisoned models to public marketplaces (HuggingFace Hub, Caffe Model Zoo)
- The models contain embedded logic that exfiltrates data or manipulates outputs
Data Poisoning (MITRE ATLAS: AML.T0020):
- Malicious data is injected during pre-training, fine-tuning, or embedding
- Example: SQL injection in a scraped dataset → the model learns to return false results
Azure controls for prevention
1. Centralized Model Approval Workflow
Implement multi-stage approval via Azure Policy:
{
  "policyDefinitionName": "[Preview]: Azure Machine Learning Deployments should only use approved Registry Models",
  "effect": "Deny",
  "parameters": {
    "allowedPublishers": ["Microsoft", "OpenAI", "Meta"],
    "approvedAssetIds": [
      "azureml://registries/azureml/models/gpt-35-turbo/versions/0301",
      "azureml://registries/azureml-meta/models/Llama-2-7b/versions/18"
    ]
  }
}
Workflow:
- The data scientist registers the model in the Azure ML workspace
- Automated security scanning: Hash verification, adversarial input testing
- Security team review: Validation of training data provenance
- Business owner approval: Sign-off before production deployment
- Azure Monitor logging: Comprehensive audit trail
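The workflow can be modelled as a small state machine in which each stage must pass before the next begins and every transition is recorded. This is a sketch under assumptions: the stage names and the `ModelApproval` class are hypothetical, and in practice the audit log would be sent to Azure Monitor rather than kept in memory.

```python
from enum import Enum


class Stage(Enum):
    REGISTERED = "registered"
    SCANNED = "scanned"
    SECURITY_APPROVED = "security_approved"
    BUSINESS_APPROVED = "business_approved"
    REJECTED = "rejected"


# Each stage has exactly one successor; terminal stages have none.
_NEXT = {
    Stage.REGISTERED: Stage.SCANNED,
    Stage.SCANNED: Stage.SECURITY_APPROVED,
    Stage.SECURITY_APPROVED: Stage.BUSINESS_APPROVED,
}


class ModelApproval:
    def __init__(self, model_id: str):
        self.model_id = model_id
        self.stage = Stage.REGISTERED
        self.audit_log = []  # (actor, from_stage, to_stage) tuples

    def advance(self, actor: str, passed: bool) -> Stage:
        """Move to the next stage on success, or to REJECTED on failure."""
        if self.stage not in _NEXT:
            raise ValueError(f"{self.model_id} is already in a terminal stage")
        previous = self.stage
        self.stage = _NEXT[previous] if passed else Stage.REJECTED
        self.audit_log.append((actor, previous.value, self.stage.value))
        return self.stage

    @property
    def deployable(self) -> bool:
        return self.stage is Stage.BUSINESS_APPROVED
```

Because a rejected model has no successor stage, it cannot be advanced again without a fresh registration, which keeps the audit trail linear.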
2. Anomaly Detection on Training Data
Deploy Azure AI Anomaly Detector to identify data poisoning:
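from azure.ai.anomalydetector import AnomalyDetectorClient
from azure.core.credentials import AzureKeyCredential


class DataPoisoningAlert(Exception):
    """Raised when training metrics look anomalous."""


anomaly_detector_client = AnomalyDetectorClient(
    endpoint="https://<resource-name>.cognitiveservices.azure.com",
    credential=AzureKeyCredential("<api-key>"),
)

# Analyze a time series of training data metrics.
# training_metrics is assumed to be a list of {"timestamp": ..., "value": ...}
# points (loss or accuracy over time) prepared by the caller.
# The method name can differ between SDK versions; check the installed release.
response = anomaly_detector_client.detect_entire_series(
    body={
        "series": training_metrics,
        "granularity": "daily",
        "sensitivity": 95,
    }
)

# is_anomaly holds one boolean per point, so test whether any point is flagged
if any(response.is_anomaly):
    # Alert the security team and quarantine the dataset
    raise DataPoisoningAlert("Anomalous training metrics detected")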
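As a local complement that needs no service call, the same idea can be approximated with a plain z-score check over the metric series. This is a swapped-in, much simpler technique, not the algorithm used by Anomaly Detector; `zscore_anomalies` and the threshold of 3.0 are assumptions for illustration.

```python
import statistics


def zscore_anomalies(values: list, threshold: float = 3.0) -> list:
    """Return indexes whose value lies more than `threshold` standard deviations from the mean."""
    if len(values) < 3:
        return []
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        # A constant series has no outliers by this definition.
        return []
    return [i for i, v in enumerate(values) if abs(v - mean) / stdev > threshold]
```

With a single outlier in a series of n points, the largest possible z-score is (n-1)/sqrt(n), so short series need a lower threshold before anything can be flagged.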
3. Model Integrity Validation
Implement static analysis and adversarial robustness testing:
# Hash verification at model load time
import hashlib


class SecurityException(Exception):
    """Raised when a model fails an integrity or robustness check."""


def verify_model_integrity(model_path, expected_hash):
    with open(model_path, 'rb') as f:
        file_hash = hashlib.sha256(f.read()).hexdigest()
    if file_hash != expected_hash:
        raise SecurityException("Model hash mismatch - possible tampering")


# Adversarial robustness testing (pre-approval)
# model, loss_fn, test_images and evaluate() are assumed to be supplied by the caller
from art.attacks.evasion import FastGradientMethod
from art.estimators.classification import PyTorchClassifier

classifier = PyTorchClassifier(model=model, loss=loss_fn, input_shape=(3, 224, 224), nb_classes=10)
attack = FastGradientMethod(estimator=classifier, eps=0.1)
adversarial_samples = attack.generate(x=test_images)
adversarial_accuracy = evaluate(model, adversarial_samples)
if adversarial_accuracy < 0.5:
    raise SecurityException("Model vulnerable to adversarial attacks")
5. Software Bill of Materials (SBOM) for AI
What is an AI SBOM?
Traditional SBOMs (Software Bills of Materials) do not cover:
- Model artifacts: Weights, biases, architecture
- Training datasets: Dataset versions, origin
- Experiment tracking: Hyperparameters, compute resources
An AI SBOM is an extended BOM that includes ML components.
Implementation in Azure ML
Azure ML provides partial SBOM functionality through:
- Model Registry Metadata:
  - Model name, version, tags, properties
  - Linked training job with full parameter logging
- Environment Registry:
  - Conda dependencies, pip packages, Docker base image
  - Cryptographic hash of the environment definition
- Dataset Versioning:
  - Azure ML Data Assets with versioning
  - Lineage tracking: Which jobs used which dataset
Manual SBOM generation
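import json
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

# from_config requires a credential and reads workspace details from config.json
ml_client = MLClient.from_config(credential=DefaultAzureCredential())


def generate_ai_sbom(model_name, model_version):
    model = ml_client.models.get(name=model_name, version=model_version)

    # Fetch training job metadata (the "training_job" tag is set at registration)
    job_id = model.tags.get("training_job")
    job = ml_client.jobs.get(name=job_id)

    # Fetch environment dependencies.
    # Assumes job.environment exposes name and version; if the SDK returns a
    # string reference instead, parse the name and version from that string.
    env_name = job.environment.name
    env_version = job.environment.version
    environment = ml_client.environments.get(name=env_name, version=env_version)

    # Assumes the conda specification is available as a dict on conda_file;
    # pip packages are nested in a {"pip": [...]} entry of "dependencies".
    conda_deps = (environment.conda_file or {}).get("dependencies", [])
    pip_packages = [p for d in conda_deps if isinstance(d, dict) for p in d.get("pip", [])]

    sbom = {
        "model": {
            "name": model.name,
            "version": model.version,
            "hash": model.properties.get("sha256"),
            "created_date": model.creation_context.created_at.isoformat(),
        },
        "training": {
            "job_id": job_id,
            "dataset": job.inputs.get("training_data"),
            "compute": job.compute,
            "hyperparameters": job.inputs,
        },
        "dependencies": {
            "base_image": environment.image,
            "conda_packages": [d for d in conda_deps if isinstance(d, str)],
            "pip_packages": pip_packages,
        },
    }

    with open(f"sbom_{model_name}_{model_version}.json", "w") as f:
        # default=str keeps non-JSON types (for example job input objects) serializable
        json.dump(sbom, f, indent=2, default=str)
    return sbom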
SBOM in the CI/CD Pipeline
Integrate SBOM generation into the deployment workflow:
# Azure DevOps Pipeline
- task: AzureCLI@2
  displayName: 'Generate AI SBOM'
  inputs:
    azureSubscription: 'service-connection'
    scriptType: 'bash'
    scriptLocation: 'inlineScript'
    inlineScript: |
      az ml model download --name fraud-detection --version 2.0 --download-path ./model
      python generate_sbom.py --model-name fraud-detection --version 2.0
- task: PublishBuildArtifacts@1
  inputs:
    PathtoPublish: 'sbom_fraud-detection_2.0.json'
    ArtifactName: 'ai-sbom'
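A pipeline can also refuse to publish an incomplete SBOM. The sketch below checks a generated SBOM dictionary for the fields produced by `generate_ai_sbom`; which fields count as required is an assumption here, and `missing_sbom_fields` is a hypothetical helper name.

```python
# Fields assumed to be mandatory before an SBOM is published (illustrative choice).
REQUIRED_FIELDS = {
    "model": ("name", "version", "hash"),
    "training": ("job_id", "dataset"),
    "dependencies": ("base_image",),
}


def missing_sbom_fields(sbom: dict) -> list:
    """Return dotted paths of required fields that are absent or empty."""
    missing = []
    for section, keys in REQUIRED_FIELDS.items():
        block = sbom.get(section) or {}
        for key in keys:
            if not block.get(key):
                missing.append(f"{section}.{key}")
    return missing
```

A build step could load the JSON file, call `missing_sbom_fields`, and exit with a non-zero status when the list is not empty. A missing `model.hash` is the most likely finding, since the hash is only present if it was stored at registration.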
6. Secure ML Supply Chain: Implementation Summary
Architecture: Defense in Depth
┌─────────────────────────────────────────────────────────────┐
│ Layer 1: Source Verification │
│ - Azure ML Model Catalog (curated models) │
│ - Package safe-listing (Microsoft Artifact Registry) │
│ - Code signing for custom models │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ Layer 2: Automated Security Validation │
│ - Dependency scanning (Azure DevOps Advanced Security) │
│ - Container image scanning (Defender for Containers) │
│ - Hash verification, adversarial robustness testing │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ Layer 3: Approval Workflow │
│ - Multi-stage review (security team, business owner) │
│ - Azure Policy enforcement (deny unapproved models) │
│ - RBAC via Microsoft Entra ID (separation of duties) │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ Layer 4: Monitoring & Response │
│ - Azure Monitor + Defender for AI (threat detection) │
│ - Anomaly detection on model outputs │
│ - Audit trails for compliance (Azure Log Analytics) │
└─────────────────────────────────────────────────────────────┘
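The layered structure can be sketched as an ordered list of checks where the first failing layer stops the artifact. The check functions and artifact fields below are hypothetical stand-ins for the real controls named in each layer.

```python
def run_layers(artifact: dict, layers: list) -> dict:
    """Run each (name, check) pair in order and stop at the first failure."""
    passed = []
    for name, check in layers:
        if not check(artifact):
            return {"allowed": False, "failed_layer": name, "passed": passed}
        passed.append(name)
    return {"allowed": True, "failed_layer": None, "passed": passed}


# Hypothetical checks, one per layer in the diagram above.
LAYERS = [
    ("source_verification", lambda a: a.get("source") == "curated_catalog"),
    ("automated_validation", lambda a: a.get("hash_verified") and not a.get("critical_cves")),
    ("approval_workflow", lambda a: a.get("approved_by_security") and a.get("approved_by_business")),
    ("monitoring", lambda a: a.get("alerts_configured")),
]
```

Reporting which layers passed before the failure makes it clear how far an artifact got, which is useful when reviewing a blocked deployment.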
Implementation steps
- Week 1-2: Foundation
  - Enable the Azure ML Model Registry for all workspaces
  - Configure Azure Policy: "[Preview]: Azure Machine Learning Deployments should only use approved Registry Models"
  - Create an approval workflow (Azure DevOps Boards, Linear, or ServiceNow)
- Week 3-4: Scanning Infrastructure
  - Enable GitHub Advanced Security for Azure DevOps (Dependency Scanning)
  - Deploy Microsoft Defender for Containers
  - Configure an automated testing pipeline (hash verification, adversarial tests)
- Week 5-6: SBOM & Provenance
  - Implement the AI SBOM generation script
  - Integrate the SBOM into the CI/CD pipeline
  - Establish dataset versioning practices (Azure ML Data Assets)
- Week 7-8: Monitoring & Response
  - Deploy Azure Monitor alerts for model registry events
  - Configure Microsoft Defender for AI threat detection
  - Establish an incident response playbook for supply chain compromise
For Cosmo: Guidance for Architecture Dialogue
When the client asks about AI supply chain security:
Diagnostic questions:
- "Do you use pre-trained models from public repositories (HuggingFace, GitHub)?"
- "Do you have an overview of all Python packages used in your ML environments?"
- "How do you verify that a model has not been tampered with before deployment?"
- "Have you ever had a dependency suddenly removed or compromised?"
Risk assessment:
- High risk: Public sector, healthcare, finance (PII/sensitive data in training data)
- Medium risk: General business applications without critical impact
- Low risk: Prototyping/experimentation without production deployment
Recommendations by maturity:
| Maturity level | Implementation |
|---|---|
| Starter | Azure ML Model Registry + Azure Policy for approved models |
| Intermediate | + Dependency scanning (Azure DevOps) + Defender for Containers |
| Advanced | + AI SBOM + Adversarial robustness testing + Anomaly detection |
| Expert | + Homomorphic encryption for training + Zero-trust model serving |
Red flags that require immediate attention:
- ⚠️ Models are loaded directly from GitHub/HuggingFace without verification
- ⚠️ No versioning of models or datasets
- ⚠️ Training data comes from unknown external sources
- ⚠️ The MLaaS provider has no SOC 2 / ISO 27001 certification
- ⚠️ No monitoring of model registry access events
Cost estimate:
| Component | Estimate (NOK/month) |
|---|---|
| Azure DevOps Advanced Security (Dependency Scanning) | 5 000 - 15 000 (per active committer) |
| Microsoft Defender for Containers | 20 - 50 per container image (1000 images = 20 000 - 50 000) |
| Azure ML Model Registry | Included in workspace cost (no additional cost) |
| Azure Monitor + Log Analytics | 10 000 - 50 000 (depends on log volume) |
| Total baseline | 35 000 - 115 000 NOK/month |
References and Further Reading
Microsoft Documentation
- AI-1: Ensure use of approved models (Azure Security Benchmark)
- Threat Modeling AI/ML Systems and Dependencies
- Vulnerability Management for Azure Machine Learning
- Security planning for LLM-based applications
MITRE ATT&CK Framework
Compliance Mappings
- NIST SP 800-53 Rev. 5: SA-3, SA-10, SA-15 (System and Services Acquisition)
- ISO 27001:2022: A.5.19 (Information security in supplier relationships), A.5.20 (Addressing information security within supplier agreements)
- NIST Cybersecurity Framework v2.0: ID.SC-04 (Suppliers and third-party partners are identified, prioritized, and assessed), GV.SC-06 (Planning and due diligence performed to reduce risks from suppliers)
Tools & Frameworks
- Microsoft Secure Supply Chain Consumption Framework (S2C2F)
- Azure Artifacts for package management
- OpenSSF Scorecard for .NET/NuGet
- AI Risk Database for public vulnerability tracking
Last updated: 2026-02-05 Next review: 2026-05-05 (or upon major changes to Azure ML supply chain features)