

Geo-Redundancy for Azure AI Search

Last updated: 2026-02
Status: GA
Category: Business Continuity & Disaster Recovery


Introduction

Azure AI Search is a regional service with no built-in geo-replication or automatic failover. If the region becomes unavailable, so does the search service. For AI solutions built on a RAG (Retrieval-Augmented Generation) architecture this is a critical weakness, because the search index is the cornerstone of knowledge retrieval.

To achieve geo-redundancy for Azure AI Search, organizations must build their own solution: identical search services in multiple regions, synchronized indexes, and load balancing with failover logic. This requires careful planning of indexing strategies, consistency guarantees, and traffic routing.

For the Norwegian public sector, with its strict availability requirements, multi-region AI Search is an important part of the BCDR strategy. A typical setup is a primary in Norway East with a secondary in Sweden Central, which keeps data within the EU/EEA while providing regional redundancy.

Index replication across regions

Architecture overview

Azure AI Search has no built-in mechanism for replicating indexes between regions. You must implement one of the following strategies:

Strategy 1: Dual Push Indexing
┌──────────────┐     ┌──────────────────┐     ┌──────────────────┐
│ Data source  │────▶│ Indexer Pipeline  │────▶│ Search Region A  │
│  (Blob/SQL)  │     │ (Azure Functions) │────▶│ Search Region B  │
└──────────────┘     └──────────────────┘     └──────────────────┘

Strategy 2: Pull from Replicated Source
┌──────────────┐     ┌──────────────────┐     ┌──────────────────┐
│ Source A     │◀───▶│    GRS / GZRS    │◀───▶│ Source B         │
│ (Region A)   │     │   Replication    │     │ (Region B)       │
└──────┬───────┘     └──────────────────┘     └──────┬───────────┘
       │                                              │
       ▼                                              ▼
┌──────────────┐                              ┌──────────────────┐
│ AI Search A  │                              │ AI Search B      │
│ (Indexer)    │                              │ (Indexer)        │
└──────────────┘                              └──────────────────┘

Dual Push Indexing with Azure Functions

# Azure Function: push-based dual-region indexing
import json
import logging

import azure.functions as func
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

# Configure clients for both regions
primary_client = SearchClient(
    endpoint="https://search-primary-norwayeast.search.windows.net",
    index_name="knowledge-base",
    credential=AzureKeyCredential("<primary-key>")
)

secondary_client = SearchClient(
    endpoint="https://search-secondary-swedencentral.search.windows.net",
    index_name="knowledge-base",
    credential=AzureKeyCredential("<secondary-key>")
)

def main(msg: func.QueueMessage) -> None:
    """Process a document and index it to both regions."""
    document = json.loads(msg.get_body().decode('utf-8'))

    # Index to the primary region
    try:
        primary_result = primary_client.upload_documents(documents=[document])
        logging.info(f"Primary indexed: {primary_result[0].key}")
    except Exception as e:
        logging.error(f"Primary indexing failed: {e}")
        # Re-raise so the queue trigger retries and eventually dead-letters the message
        raise

    # Index to the secondary region (a short lag is acceptable)
    try:
        secondary_result = secondary_client.upload_documents(documents=[document])
        logging.info(f"Secondary indexed: {secondary_result[0].key}")
    except Exception as e:
        logging.warning(f"Secondary indexing failed (will retry): {e}")
        # Put the document on a retry queue; the secondary is not on the critical path.
        # send_to_retry_queue is a helper that is not defined in this document.
        send_to_retry_queue(document)
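
The handler above calls `send_to_retry_queue`, which this document does not define. As a rough sketch of the idea only, the following shows an in-memory retry queue with exponential backoff. `RetryQueue`, `send`, `drain` and the `upload` callable are hypothetical names introduced for illustration; a real deployment would more likely rely on a durable queue.

```python
import time
from collections import deque
from typing import Callable, Deque, Dict, List, Optional, Tuple


class RetryQueue:
    """In-memory retry queue with exponential backoff (illustrative only)."""

    def __init__(self, base_delay: float = 1.0, max_attempts: int = 5):
        self.base_delay = base_delay
        self.max_attempts = max_attempts
        # Each entry: (document, failed attempts so far, earliest next attempt)
        self._items: Deque[Tuple[Dict, int, float]] = deque()
        # Documents that exhausted all attempts
        self.dead_letter: List[Dict] = []

    def send(self, document: Dict, now: Optional[float] = None) -> None:
        """Queue a document for an immediate retry."""
        now = time.time() if now is None else now
        self._items.append((document, 0, now))

    def drain(self, upload: Callable[[Dict], None], now: Optional[float] = None) -> int:
        """Try every due document once; return how many uploads succeeded."""
        now = time.time() if now is None else now
        succeeded = 0
        for _ in range(len(self._items)):
            document, attempts, due = self._items.popleft()
            if due > now:
                # Not due yet: put it back unchanged
                self._items.append((document, attempts, due))
                continue
            try:
                upload(document)
                succeeded += 1
            except Exception:
                attempts += 1
                if attempts >= self.max_attempts:
                    self.dead_letter.append(document)
                else:
                    # Delay doubles after each failure: base, 2*base, 4*base, ...
                    delay = self.base_delay * 2 ** (attempts - 1)
                    self._items.append((document, attempts, now + delay))
        return succeeded

    def __len__(self) -> int:
        return len(self._items)
```

In this sketch, `send_to_retry_queue(document)` would correspond to `queue.send(document)`, and a timer-triggered function could call `queue.drain(...)` with a callable that uploads to the secondary region.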

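Pull-based indexing with built-in indexers

# Create identical indexers in both regions through the Search REST API.
# (The az search command group manages the service itself and has no indexer commands.)
# The <...-admin-key> values are placeholders.

# Primary region: reads from the primary data source
curl -X PUT "https://search-primary-norwayeast.search.windows.net/indexers/blob-indexer?api-version=2024-07-01" \
  -H "Content-Type: application/json" \
  -H "api-key: <primary-admin-key>" \
  -d '{
    "name": "blob-indexer",
    "dataSourceName": "blob-source-primary",
    "targetIndexName": "knowledge-base",
    "schedule": {"interval": "PT5M"}
  }'

# Secondary region: reads from the replicated data source
# (reading the secondary endpoint of a GRS account requires RA-GRS or RA-GZRS)
curl -X PUT "https://search-secondary-swedencentral.search.windows.net/indexers/blob-indexer?api-version=2024-07-01" \
  -H "Content-Type: application/json" \
  -H "api-key: <secondary-admin-key>" \
  -d '{
    "name": "blob-indexer",
    "dataSourceName": "blob-source-secondary",
    "targetIndexName": "knowledge-base",
    "schedule": {"interval": "PT5M"}
  }'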
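
With pull-based indexing, the secondary index lags behind the source by roughly the storage replication delay plus the indexer schedule interval. The sketch below estimates that worst case from a schedule interval such as `PT5M`. The function names are hypothetical, the parser only handles time-only ISO 8601 durations, and the replication lag is an input you have to measure or assume; it is not a guaranteed figure.

```python
import re
from datetime import timedelta

# Time-only ISO 8601 durations: PT5M, PT1H, PT1H30M, PT45S, ...
_DURATION = re.compile(r"^PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?$")


def parse_schedule_interval(interval: str) -> timedelta:
    """Parse a time-only ISO 8601 duration such as 'PT5M' or 'PT1H30M'."""
    match = _DURATION.match(interval)
    if not match or not any(match.groups()):
        raise ValueError(f"Unsupported interval: {interval!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def worst_case_staleness(
    interval: str,
    replication_lag: timedelta,
    indexer_run_time: timedelta = timedelta(0),
) -> timedelta:
    """Upper bound on how far the secondary index can trail the source.

    A change lands just after an indexer run started, waits for storage
    replication, then waits one full schedule interval and one indexer run.
    """
    return replication_lag + parse_schedule_interval(interval) + indexer_run_time
```

For example, with an assumed replication lag of 15 minutes and the `PT5M` schedule, `worst_case_staleness("PT5M", timedelta(minutes=15))` gives 20 minutes, which is a useful starting point when discussing RPO for the search index.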

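Replica count and sizing for availability

Intra-region availability with replicas

Azure AI Search automatically distributes replicas across Availability Zones when you run two or more replicas in a region that supports AZs. The service SLA is 99.9%; it covers queries from two replicas and both queries and indexing from three.

| Replicas | SLA | Read queries | Write operations | Notes |
|----------|-----|--------------|------------------|-------|
| 1 | None | Yes | Yes | No AZ redundancy |
| 2 | 99.9% (queries only) | Yes | Yes | AZ-distributed automatically |
| 3+ | 99.9% (queries and indexing) | Yes | Yes | Recommended for production (read/write SLA) |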

Sizing guidance for multi-region

Per region (production):
├── Replicas: 3 (99.9% read/write SLA and AZ redundancy)
├── Partitions: based on index size
│   ├── < 25 GB → 1 partition
│   ├── 25-50 GB → 2 partitions
│   ├── 50-150 GB → 3-6 partitions
│   └── > 150 GB → 6-12 partitions
└── SKU: Standard or Standard S2/S3

Secondary region (DR):
├── Replicas: 2 (minimum for AZ, scale out at failover)
├── Partitions: identical to primary
└── SKU: identical to primary
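
The sizing guidance above can be expressed as a small helper that maps index size to a partition range and computes billable search units (replicas multiplied by partitions). This is only a sketch: the function names are hypothetical, the handling of the exact boundaries at 50 GB and 150 GB is an assumption, and the thresholds come from the guidance in this document rather than from service limits.

```python
from typing import Dict, Tuple


def partition_range(index_size_gb: float) -> Tuple[int, int]:
    """Return the (min, max) partition count suggested for an index size."""
    if index_size_gb < 0:
        raise ValueError("index_size_gb must be non-negative")
    if index_size_gb < 25:
        return (1, 1)
    if index_size_gb <= 50:   # boundary handling is an assumption
        return (2, 2)
    if index_size_gb <= 150:  # boundary handling is an assumption
        return (3, 6)
    return (6, 12)


def search_units(replicas: int, partitions: int) -> int:
    """Search units are the product of replicas and partitions."""
    return replicas * partitions


def dr_footprint(
    index_size_gb: float,
    primary_replicas: int = 3,
    secondary_replicas: int = 2,
) -> Dict[str, int]:
    """Search units per region, using the low end of the partition range.

    Partitions are kept identical in both regions; only replicas differ.
    """
    partitions, _ = partition_range(index_size_gb)
    return {
        "partitions": partitions,
        "primary_su": search_units(primary_replicas, partitions),
        "secondary_su": search_units(secondary_replicas, partitions),
    }
```

For a 100 GB index this sketch yields 3 partitions, 9 search units in the primary region and 6 in the secondary, which makes the cost effect of running the DR region with one replica fewer easy to see.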

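Cost optimization for the secondary region

# The secondary region starts with fewer replicas and is scaled out
# at failover through Azure Automation.

# Runbook body: scale the secondary AI Search service from 2 to 3 replicas.
# Assumes the Automation account has a managed identity with access to the service.
cat > scale-up-search-dr.ps1 <<'EOF'
Connect-AzAccount -Identity | Out-Null
Set-AzSearchService `
  -ResourceGroupName "rg-ai-dr" `
  -Name "search-secondary-swedencentral" `
  -ReplicaCount 3
Write-Output "Scaled to 3 replicas for DR"
EOF

# Create the runbook, upload the script, and publish it
# (requires the Azure CLI automation extension)
az automation runbook create \
  --automation-account-name "aa-ai-dr" \
  --resource-group "rg-ai-dr" \
  --name "scale-up-search-dr" \
  --type "PowerShell"

az automation runbook replace-content \
  --automation-account-name "aa-ai-dr" \
  --resource-group "rg-ai-dr" \
  --name "scale-up-search-dr" \
  --content @scale-up-search-dr.ps1

az automation runbook publish \
  --automation-account-name "aa-ai-dr" \
  --resource-group "rg-ai-dr" \
  --name "scale-up-search-dr"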

Failover and routing strategies

Azure Front Door for AI Search failover

// Bicep: Azure Front Door with failover for AI Search
resource frontDoor 'Microsoft.Cdn/profiles@2023-05-01' = {
  name: 'fd-ai-search'
  location: 'global'
  sku: {
    name: 'Premium_AzureFrontDoor'
  }
}

resource originGroup 'Microsoft.Cdn/profiles/originGroups@2023-05-01' = {
  parent: frontDoor
  name: 'search-origins'
  properties: {
    loadBalancingSettings: {
      sampleSize: 4
      successfulSamplesRequired: 3
    }
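    healthProbeSettings: {
      // Note: health probes are sent without an api-key header. Confirm that this
      // path returns 200 for such requests; otherwise the origin is marked unhealthy.
      probePath: '/indexes/knowledge-base/docs?api-version=2024-07-01&search=*&$top=1'
      probeRequestType: 'GET'
      probeProtocol: 'Https'
      probeIntervalInSeconds: 30
    }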
  }
}

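resource primaryOrigin 'Microsoft.Cdn/profiles/originGroups/origins@2023-05-01' = {
  parent: originGroup
  name: 'primary-norwayeast'
  properties: {
    hostName: 'search-primary-norwayeast.search.windows.net'
    // Send the search service's own host name as the Host header
    originHostHeader: 'search-primary-norwayeast.search.windows.net'
    httpsPort: 443
    priority: 1
    weight: 1000
  }
}

resource secondaryOrigin 'Microsoft.Cdn/profiles/originGroups/origins@2023-05-01' = {
  parent: originGroup
  name: 'secondary-swedencentral'
  properties: {
    hostName: 'search-secondary-swedencentral.search.windows.net'
    originHostHeader: 'search-secondary-swedencentral.search.windows.net'
    httpsPort: 443
    // Priority 2: only receives traffic when the priority 1 origin is unhealthy
    priority: 2
    weight: 1000
  }
}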
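
To make the `sampleSize: 4` / `successfulSamplesRequired: 3` and `priority` settings above more concrete, the following sketch models them in plain Python: an origin counts as healthy when at least 3 of its last 4 probes succeeded, and traffic goes to the healthy origin with the lowest priority number. This is a simplified model for reasoning about the configuration, not Front Door's actual implementation; `OriginHealth` and `pick_origin` are hypothetical names, and treating not-yet-collected samples as successes is an assumption.

```python
from collections import deque
from typing import Deque, Iterable, Optional


class OriginHealth:
    """Sliding window of probe results for one origin."""

    def __init__(self, name: str, priority: int,
                 sample_size: int = 4, successful_samples_required: int = 3):
        self.name = name
        self.priority = priority
        self.sample_size = sample_size
        self.required = successful_samples_required
        # Only the most recent `sample_size` probe results are kept
        self.samples: Deque[bool] = deque(maxlen=sample_size)

    def record_probe(self, ok: bool) -> None:
        self.samples.append(ok)

    @property
    def healthy(self) -> bool:
        # Assumption: samples not collected yet count as successes
        missing = self.sample_size - len(self.samples)
        return sum(self.samples) + missing >= self.required


def pick_origin(origins: Iterable[OriginHealth]) -> Optional[OriginHealth]:
    """Return the healthy origin with the lowest priority number, if any."""
    healthy = [origin for origin in origins if origin.healthy]
    return min(healthy, key=lambda origin: origin.priority) if healthy else None
```

With a 30-second probe interval, this model shows why failover is not instantaneous: two consecutive failed probes are needed before the primary drops below 3 successes out of 4, and three further successful probes are needed before it is preferred again.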

Application-level failover

# Python: application-level failover for Azure AI Search
import logging
import time

from azure.core.credentials import AzureKeyCredential
from azure.core.exceptions import (
    HttpResponseError,
    ServiceRequestError,
    ServiceResponseError,
)
from azure.search.documents import SearchClient

class ResilientSearchClient:
    """AI Search client with automatic failover."""

    def __init__(self, primary_endpoint, secondary_endpoint, index_name,
                 primary_key, secondary_key):
        # API keys are issued per search service, so each region has its own key
        self.primary = SearchClient(
            endpoint=primary_endpoint,
            index_name=index_name,
            credential=AzureKeyCredential(primary_key)
        )
        self.secondary = SearchClient(
            endpoint=secondary_endpoint,
            index_name=index_name,
            credential=AzureKeyCredential(secondary_key)
        )
        self.use_primary = True
        self.failover_time = None
        self.health_check_interval = 60  # seconds

    def search(self, search_text, **kwargs):
        """Search with automatic failover; returns the results as a list."""
        # While running on the secondary, periodically check whether we can fail back
        if not self.use_primary and self._should_check_primary():
            self._try_failback()
        client = self.primary if self.use_primary else self.secondary

        try:
            # search() is lazy: materialize the results so request errors surface here
            return list(client.search(search_text=search_text, **kwargs))

        except (ServiceRequestError, ServiceResponseError, HttpResponseError) as e:
            if not self.use_primary:
                raise  # Both regions are failing
            logging.warning("Primary search failed, failing over: %s", e)
            self.use_primary = False
            self.failover_time = time.time()
            return list(self.secondary.search(search_text=search_text, **kwargs))

    def _should_check_primary(self):
        """Check if enough time has passed to try primary again."""
        if self.failover_time is None:
            return False
        return time.time() - self.failover_time > self.health_check_interval

    def _try_failback(self):
        """Attempt to fail back to primary region."""
        try:
            list(self.primary.search(search_text="*", top=1))
            self.use_primary = True
            self.failover_time = None
            logging.info("Failback to primary successful")
        except Exception:
            # Primary is still down; wait another interval before checking again
            self.failover_time = time.time()

Keeping indexes in sync

Synchronization strategies

| Strategy | Delay | Complexity | Recommended for |
|----------|-------|------------|-----------------|
| Dual push (simultaneous) | ~0 | Medium | Real-time-critical data |
| Event-driven sync | Seconds | Medium | Generally recommended |
| Scheduled indexer | 5-60 min | Low | Batch-based updates |
| Full rebuild | Hours | Low | Infrequent changes |

Event-driven synchronization with Event Grid

# Set up Event Grid so that blob changes trigger dual indexing
az eventgrid event-subscription create \
  --name "blob-change-to-search-sync" \
  --source-resource-id "/subscriptions/{sub}/resourceGroups/rg-ai-prod/providers/Microsoft.Storage/storageAccounts/staiprod" \
  --included-event-types "Microsoft.Storage.BlobCreated" "Microsoft.Storage.BlobDeleted" \
  --endpoint-type "azurefunction" \
  --endpoint "/subscriptions/{sub}/resourceGroups/rg-ai-prod/providers/Microsoft.Web/sites/func-search-sync/functions/SyncToSecondary"

Index consistency validation

# Periodic validation of index consistency between regions
import requests

API_VERSION = "2024-07-01"


def get_document_count(endpoint, index_name, api_key):
    """Return the number of documents in an index."""
    response = requests.get(
        f"{endpoint}/indexes/{index_name}/docs/$count?api-version={API_VERSION}",
        headers={"api-key": api_key},
        timeout=10,
    )
    response.raise_for_status()
    # The count comes back as plain text and may carry a UTF-8 BOM
    return int(response.content.decode("utf-8-sig"))


def validate_index_consistency(primary_endpoint, secondary_endpoint, index_name,
                               primary_key, secondary_key):
    """Compare document counts between regions."""
    primary_count = get_document_count(primary_endpoint, index_name, primary_key)
    secondary_count = get_document_count(secondary_endpoint, index_name, secondary_key)

    drift = abs(primary_count - secondary_count)
    drift_pct = (drift / max(primary_count, 1)) * 100

    return {
        "primary_count": primary_count,
        "secondary_count": secondary_count,
        "drift": drift,
        "drift_percentage": round(drift_pct, 2),
        "in_sync": drift_pct < 1.0  # less than 1% drift is acceptable
    }
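
Matching document counts do not prove that the content matches. A sample-based content check can complement the count comparison; the sketch below hashes a random sample of documents from each region and reports keys that are missing or differ in the secondary. `fetch_primary` and `fetch_secondary` are hypothetical callables that return a document as a dict, or `None` when the key does not exist; how they look up documents is left open.

```python
import hashlib
import json
import random
from typing import Callable, Dict, List, Optional, Sequence

Fetcher = Callable[[str], Optional[Dict]]


def document_fingerprint(document: Dict) -> str:
    """Stable hash of a document, independent of field order."""
    canonical = json.dumps(document, sort_keys=True, ensure_ascii=False, default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def compare_samples(
    keys: Sequence[str],
    fetch_primary: Fetcher,
    fetch_secondary: Fetcher,
    sample_size: int = 100,
    seed: Optional[int] = None,
) -> Dict[str, List[str]]:
    """Compare a random sample of documents between two regions."""
    rng = random.Random(seed)
    sample = rng.sample(list(keys), min(sample_size, len(keys)))
    missing: List[str] = []
    mismatched: List[str] = []
    for key in sample:
        primary_doc = fetch_primary(key)
        if primary_doc is None:
            continue  # deleted from the primary since the key list was built
        secondary_doc = fetch_secondary(key)
        if secondary_doc is None:
            missing.append(key)
        elif document_fingerprint(primary_doc) != document_fingerprint(secondary_doc):
            mismatched.append(key)
    return {
        "checked": sorted(sample),
        "missing": sorted(missing),
        "mismatched": sorted(mismatched),
    }
```

Hashing a canonical JSON form keeps the comparison cheap and avoids false alarms from field ordering. Fields that legitimately differ between regions, if any, would need to be removed before fingerprinting.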

Query performance in multi-region setups

Latency optimization

| Strategy | Latency reduction | Note |
|----------|-------------------|------|
| Latency-based routing | 20-50 ms | Users are routed to the nearest region |
| Semantic caching | 80-95% | Cache frequent queries in APIM |
| Read replicas (intra-region) | 10-30 ms | Spread reads across replicas |
| Query optimization | Varies | Use $select and $top to reduce payload |

Azure API Management for caching and routing

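<!-- APIM policy: caching and failover for AI Search queries -->
<policies>
  <inbound>
    <base />
    <!-- Start on the primary region; the retry block switches on failure -->
    <set-variable name="usePrimary" value="@(true)" />
    <cache-lookup vary-by-developer="false" vary-by-developer-groups="false"
                  caching-type="internal">
      <vary-by-query-parameter>search</vary-by-query-parameter>
      <vary-by-query-parameter>$filter</vary-by-query-parameter>
      <vary-by-query-parameter>$top</vary-by-query-parameter>
      <vary-by-query-parameter>$skip</vary-by-query-parameter>
    </cache-lookup>
  </inbound>
  <backend>
    <retry condition="@(context.Response.StatusCode >= 500)"
           count="1" interval="0" first-fast-retry="true">
      <choose>
        <when condition="@(context.Variables.GetValueOrDefault<bool>("usePrimary", true))">
          <set-backend-service
            base-url="https://search-primary-norwayeast.search.windows.net" />
        </when>
        <otherwise>
          <set-backend-service
            base-url="https://search-secondary-swedencentral.search.windows.net" />
        </otherwise>
      </choose>
      <!-- The next attempt, if any, goes to the secondary region -->
      <set-variable name="usePrimary" value="@(false)" />
      <forward-request buffer-request-body="true" />
    </retry>
  </backend>
  <outbound>
    <cache-store duration="300" />
    <base />
  </outbound>
</policies>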
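
The caching part of the policy above keys cached responses on a chosen set of query parameters and keeps them for 300 seconds. The same idea can be sketched on the application side as a small TTL cache. `QueryCache` and `get_or_fetch` are hypothetical names, the default parameter list assumes the OData-style names of the Search REST API, and the sketch is not a description of how API Management implements its cache.

```python
import time
from typing import Any, Callable, Dict, Mapping, Sequence, Tuple


class QueryCache:
    """TTL cache keyed on a chosen subset of query parameters."""

    def __init__(
        self,
        ttl_seconds: float = 300.0,
        vary_by: Sequence[str] = ("search", "$filter", "$top", "$skip"),
        clock: Callable[[], float] = time.monotonic,
    ):
        self.ttl = ttl_seconds
        self.vary_by = tuple(vary_by)
        self.clock = clock
        # key -> (time stored, cached value)
        self._entries: Dict[Tuple, Tuple[float, Any]] = {}

    def _key(self, params: Mapping[str, Any]) -> Tuple:
        # Only the listed parameters take part in the key; all others are ignored
        return tuple((name, str(params.get(name))) for name in self.vary_by)

    def get_or_fetch(self, params: Mapping[str, Any],
                     fetch: Callable[[Mapping[str, Any]], Any]) -> Any:
        """Return a cached value if it is still fresh, otherwise call fetch."""
        key = self._key(params)
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]
        value = fetch(params)
        self._entries[key] = (now, value)
        return value
```

The choice of `vary_by` matters: any parameter that changes the result set but is left out of the key leads to stale or wrong answers being served, while including parameters that do not affect results lowers the hit rate.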

For Cosmo

  • Use this reference when the customer is building RAG solutions on Azure AI Search and needs geo-redundancy for the search indexes.
  • Azure AI Search has NO built-in geo-replication; assuming otherwise is a common misconception. The customer must implement dual indexing themselves.
  • Recommend at least 3 replicas per region for the 99.9% read/write SLA and AZ redundancy; with 2 replicas the SLA covers queries only.
  • For cost optimization: the secondary region can run with 2 replicas and scale out to 3 at failover via Azure Automation.
  • Index consistency should be validated automatically: set up a periodic check of document counts plus sample-based content validation.