“Where does our data live?” is the first question every European CTO, DPO, and procurement team asks when evaluating AI solutions. And it’s the right question – because for AI systems processing personal data of EU residents, data residency isn’t optional. It’s a legal requirement with real consequences.

This guide covers what data residency means for AI systems, what the law actually requires, and how to make practical infrastructure decisions that satisfy compliance without over-engineering.

What Data Residency Actually Means

Data residency refers to the physical and legal jurisdiction where data is stored and processed. For GDPR purposes, this means:

  • Storage at rest – Where are your databases, file systems, and backups physically located?
  • Processing – Where does computation happen? (This includes LLM inference, vector search, and data transformation)
  • Transit – What path does data take between components?
  • Backups and replicas – Where are your disaster recovery copies?

A common mistake: assuming that using an “EU region” on a cloud provider automatically guarantees full data residency. It doesn’t – you need to verify that every component in your pipeline, including logging, monitoring, and backups, stays within the jurisdiction.
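One way to make that verification routine is to keep an inventory of pipeline components and check it automatically. The sketch below assumes such an inventory exists; the `Component` type, the component names, and the `ALLOWED_REGIONS` set are all hypothetical and stand in for whatever your own infrastructure reports.

```python
from dataclasses import dataclass

# Assumption: the regions your compliance team has approved (illustrative).
ALLOWED_REGIONS = {"eu-central-1", "eu-west-1"}


@dataclass(frozen=True)
class Component:
    name: str    # e.g. "vector-db"
    role: str    # "storage", "processing", "transit", "backup", "logging"
    region: str  # region where the component actually runs


def find_residency_violations(components, allowed=ALLOWED_REGIONS):
    """Return the components whose region is outside the allowed set."""
    return [c for c in components if c.region not in allowed]


# Hypothetical inventory: the log shipper has slipped into a US region.
inventory = [
    Component("postgres-primary", "storage", "eu-central-1"),
    Component("llm-inference", "processing", "eu-central-1"),
    Component("nightly-backup", "backup", "eu-west-1"),
    Component("log-shipper", "logging", "us-east-1"),
]

for c in find_residency_violations(inventory):
    print(f"{c.name} ({c.role}) runs in {c.region}, outside the allowed regions")
```

A check like this is only as good as the inventory behind it, so logging, monitoring, and backup components need to be listed alongside the obvious databases and inference endpoints.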

GDPR Articles 44-49: International Transfers

GDPR Chapter V (Articles 44-49) governs the transfer of personal data to countries outside the EU/EEA. The key rules:

Adequacy decisions (Article 45): Some countries have been deemed to offer adequate data protection. Transfers to these countries are permitted without additional safeguards. Current adequacy countries include:

  • UK, Switzerland, Japan, South Korea, Canada (commercial organisations only), Israel, New Zealand, Argentina, Uruguay
  • USA (only for organisations certified under the EU-US Data Privacy Framework, in place since July 2023)

Standard Contractual Clauses (Article 46): For transfers to non-adequate countries, you can use EU-approved contractual clauses. However, since the Schrems II ruling, these may require additional technical measures.

Binding Corporate Rules (Article 47): For intra-group transfers within multinational companies. Complex to set up but provides ongoing transfer authorisation.

The Schrems II Ruling

The 2020 Schrems II ruling invalidated the EU-US Privacy Shield and raised the bar for international data transfers. Key implications for AI systems:

  1. Transfer Impact Assessments (TIAs) are required – You must assess whether the destination country’s laws provide adequate protection
  2. Technical measures may be needed – Encryption, pseudonymisation, and contractual guarantees
  3. US government surveillance laws (FISA 702, EO 12333) were specifically flagged as problematic

The EU-US Data Privacy Framework (2023) addressed some Schrems II concerns for US transfers, but its long-term stability is uncertain. Engineering teams should design for the possibility that the framework could be challenged again.
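The transfer rules above can be read as a small decision procedure. The following is a simplified sketch, not legal advice: the `transfer_basis` function and its return strings are hypothetical, and the adequacy set contains only the countries named in this guide, so treat it as a snapshot that needs checking against the European Commission's current list.

```python
# EU member states plus Iceland, Liechtenstein, Norway (ISO 3166-1 alpha-2).
EEA = {
    "AT", "BE", "BG", "HR", "CY", "CZ", "DK", "EE", "FI", "FR", "DE", "GR",
    "HU", "IE", "IT", "LV", "LT", "LU", "MT", "NL", "PL", "PT", "RO", "SK",
    "SI", "ES", "SE", "IS", "LI", "NO",
}

# Adequacy destinations named in this guide (snapshot, not the full list).
# Canada's decision covers commercial organisations only.
ADEQUATE = {"GB", "CH", "JP", "KR", "CA", "IL", "NZ", "AR", "UY"}


def transfer_basis(country: str, dpf_certified: bool = False) -> str:
    """Suggest which Chapter V mechanism applies to a destination country."""
    code = country.upper()
    if code in EEA:
        return "no transfer (inside EU/EEA)"
    if code in ADEQUATE:
        return "adequacy decision (Article 45)"
    if code == "US" and dpf_certified:
        return "adequacy decision (Article 45, EU-US Data Privacy Framework)"
    return "Article 46 safeguards (e.g. SCCs) plus transfer impact assessment"


for destination in ("DE", "JP", "US", "IN"):
    print(destination, "->", transfer_basis(destination))
```

Keeping the US case behind an explicit `dpf_certified` flag makes it easy to see which flows would need a fallback if the framework were invalidated.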

Practical Implication

The safest approach for AI systems processing EU personal data is to keep everything within the EU. This eliminates the need for transfer impact assessments, additional safeguards, and ongoing monitoring of adequacy decisions.

EU Cloud Regions for AI Workloads

AWS

Region | Location | AI Services Available
eu-central-1 | Frankfurt, Germany | Bedrock, SageMaker, Comprehend, Textract, Lambda
eu-west-1 | Ireland | Full service availability
eu-west-2 | London | Most services (note: UK is post-Brexit, separate adequacy)
eu-west-3 | Paris | Limited AI services
eu-north-1 | Stockholm | Growing AI service availability
eu-south-1 | Milan | Limited AI services

Recommendation: eu-central-1 (Frankfurt) for AI workloads – best service availability and German jurisdiction signals strong compliance intent to regulators.

For a detailed technical guide on building AI pipelines on AWS Frankfurt, see our AWS GDPR-compliant AI pipeline guide.

Microsoft Azure

Region | Location | AI Services Available
West Europe | Netherlands | Azure OpenAI, Cognitive Services, ML
North Europe | Ireland | Azure OpenAI, Cognitive Services, ML
Germany West Central | Frankfurt | Azure OpenAI (limited), ML
France Central | Paris | Cognitive Services, ML
Sweden Central | Gävle | Azure OpenAI, ML

Recommendation: West Europe (Netherlands) or Sweden Central for Azure OpenAI workloads.

Google Cloud

Region | Location | AI Services Available
europe-west1 | Belgium | Vertex AI, Cloud AI APIs
europe-west2 | London | Vertex AI (limited)
europe-west3 | Frankfurt | Vertex AI, Cloud AI APIs
europe-west4 | Netherlands | Vertex AI, Cloud AI APIs
europe-north1 | Finland | Limited AI services

Recommendation: europe-west1 (Belgium) or europe-west3 (Frankfurt) for Vertex AI workloads.

Architecture Decisions for Data Residency

Decision 1: External LLM APIs vs. Self-Hosted Models

Factor | External APIs (Bedrock, Azure OpenAI) | Self-Hosted (SageMaker, GKE)
Data residency | EU region available, but data processed by cloud provider | Full control – data never leaves your infrastructure
DPA required | Yes – with the cloud provider | Only with cloud infra provider
Model quality | Frontier models available | Open-source models (Llama, Mistral)
Cost | Pay per token | Fixed infra cost (GPU instances)
Compliance documentation | Provider supplies DPA, security docs | You document everything yourself

For most enterprises: Start with managed EU-hosted APIs (Bedrock in eu-central-1, Azure OpenAI in West Europe). The data residency guarantees are contractually backed, and the providers handle SOC 2, ISO 27001, and other certifications.

For highly regulated industries (healthcare, finance): Consider self-hosted models for the most sensitive workloads, with managed APIs for less sensitive tasks.
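The split between managed APIs and self-hosted models can be expressed as a routing rule. This is a minimal sketch under stated assumptions: the `Sensitivity` levels, the backend labels, and the `choose_backend` function are hypothetical, and a real classification would come from your own data governance process.

```python
from enum import Enum


class Sensitivity(Enum):
    LOW = 1       # no personal data
    PERSONAL = 2  # ordinary personal data
    HIGH = 3      # the most sensitive workloads, e.g. patient or account records


# Hypothetical backend labels; replace with your own deployment targets.
MANAGED_EU_API = "managed-eu-api"
SELF_HOSTED = "self-hosted-eu-model"


def choose_backend(sensitivity: Sensitivity, regulated_industry: bool) -> str:
    """Route the most sensitive workloads in regulated industries to self-hosted models."""
    if regulated_industry and sensitivity is Sensitivity.HIGH:
        return SELF_HOSTED
    return MANAGED_EU_API


print(choose_backend(Sensitivity.HIGH, regulated_industry=True))
print(choose_backend(Sensitivity.PERSONAL, regulated_industry=True))
```

Both backends are assumed to run in EU regions; the rule only decides how much of the stack you operate yourself.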

Decision 2: Vector Database Location

If you’re building RAG (Retrieval-Augmented Generation) pipelines, your vector database contains embeddings derived from your data. These embeddings can potentially be reversed to reveal information about the source data, so they must be treated as personal data if the source data was personal.

EU-hosted options:

  • Managed: Amazon OpenSearch Service (eu-central-1), Azure AI Search, formerly Azure Cognitive Search (West Europe), Vertex AI Vector Search, formerly Matching Engine (europe-west1)
  • Self-hosted: Weaviate, Qdrant, Milvus on EU instances
  • SaaS: Check provider’s EU region availability (varies – Pinecone has limited EU presence)
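Treating embeddings as personal data is easier if each vector carries the classification of the text it was derived from. The sketch below shows one possible shape for that; `SourceChunk`, `VectorRecord`, and both helper functions are hypothetical names, and the vectors are placeholders rather than real model output.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SourceChunk:
    chunk_id: str
    text: str
    contains_personal_data: bool


@dataclass(frozen=True)
class VectorRecord:
    chunk_id: str
    vector: List[float]
    contains_personal_data: bool  # inherited from the source chunk


def to_vector_record(chunk: SourceChunk, vector: List[float]) -> VectorRecord:
    """Copy the personal-data flag from the source text onto its embedding."""
    return VectorRecord(chunk.chunk_id, vector, chunk.contains_personal_data)


def needs_eu_index(records: List[VectorRecord]) -> bool:
    """An index must be EU-hosted if any record in it derives from personal data."""
    return any(r.contains_personal_data for r in records)


chunks = [
    SourceChunk("faq-1", "Our opening hours are 9 to 5.", False),
    SourceChunk("ticket-7", "Customer Anna M. reported a billing issue.", True),
]
# Placeholder vectors; a real pipeline would call an embedding model here.
records = [to_vector_record(c, [0.0, 0.0, 0.0]) for c in chunks]
print(needs_eu_index(records))
```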

Decision 3: Monitoring and Logging

Often overlooked: your monitoring and logging infrastructure must also respect data residency.

  • Datadog – EU data centre available (EU1)
  • Sentry – EU data hosting available
  • Grafana Cloud – EU-hosted option available
  • AWS CloudWatch – stays in-region by default, but verify cross-region replication is disabled
  • ELK Stack – self-hosted on EU instances
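A lightweight guard is to validate every configured log or metrics endpoint against a list of hosts you have verified as EU-hosted. This is a sketch only: `EU_SINK_SUFFIXES` and `is_eu_sink` are hypothetical, the hostnames are illustrative, and each entry should be confirmed against the vendor's own documentation before you rely on it.

```python
from urllib.parse import urlparse

# Assumption: host suffixes your team has verified as EU-hosted.
# "logs.example.eu" is a placeholder for a self-hosted stack.
EU_SINK_SUFFIXES = ("datadoghq.eu", "logs.example.eu")


def is_eu_sink(url: str) -> bool:
    """True if the URL's host equals, or is a subdomain of, a verified EU suffix."""
    host = urlparse(url).hostname or ""
    return any(host == s or host.endswith("." + s) for s in EU_SINK_SUFFIXES)


# Illustrative endpoints, as they might appear in agent configuration.
configured_sinks = [
    "https://http-intake.logs.datadoghq.eu/",
    "https://http-intake.logs.datadoghq.com/",
]

for sink in configured_sinks:
    status = "ok" if is_eu_sink(sink) else "NOT verified as EU-hosted"
    print(f"{sink}: {status}")
```

Matching on a dot-prefixed suffix rather than a plain substring avoids accepting lookalike hosts such as `datadoghq.eu.example.com`.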

Decision 4: Backups and Disaster Recovery

Your DR strategy must keep data within EU jurisdictions:

  • Configure cross-region replication only to other EU regions
  • Ensure backup encryption keys are also in EU regions (AWS KMS, Azure Key Vault)
  • Test your DR runbooks to verify no data leaks to non-EU regions during failover
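The first two checks can be automated before a DR plan is approved. The sketch below assumes AWS-style region names and the standard ARN layout; `check_backup_plan`, the `EU_REGIONS` set, and the example ARN are hypothetical. London (eu-west-2) is left out of the set on purpose, since the UK is covered by a separate adequacy decision rather than being in the EU.

```python
# Assumption: AWS regions located in EU member states (illustrative subset).
EU_REGIONS = {"eu-central-1", "eu-west-1", "eu-west-3", "eu-north-1", "eu-south-1"}


def kms_key_region(arn: str) -> str:
    """Extract the region from an ARN: arn:partition:service:region:account:resource."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[2] != "kms":
        raise ValueError(f"not a KMS key ARN: {arn!r}")
    return parts[3]


def check_backup_plan(replica_regions, kms_key_arn, allowed=EU_REGIONS):
    """Return a list of problems; an empty list means the plan stays in the EU."""
    problems = [
        f"replica in non-EU region {r}" for r in replica_regions if r not in allowed
    ]
    key_region = kms_key_region(kms_key_arn)
    if key_region not in allowed:
        problems.append(f"KMS key in non-EU region {key_region}")
    return problems


# Hypothetical plan: one replica and the encryption key sit outside the EU.
example_arn = "arn:aws:kms:us-east-1:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab"
for problem in check_backup_plan(["eu-west-1", "us-west-2"], example_arn):
    print(problem)
```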

Data Residency Documentation

Your DPO and auditors will need documentation proving data residency. Prepare:

Data Flow Diagrams

Visual maps showing:

  • Where data enters your system
  • Every service that processes it (with region annotations)
  • Where data is stored at rest
  • Network paths between components
  • External services and their data processing locations
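Diagrams drift out of date quickly when drawn by hand, so it can help to generate them from the same inventory your deployment uses. The sketch below emits Graphviz DOT text with a region annotation on every node; the `to_dot` function and the component names are hypothetical.

```python
def to_dot(nodes, edges):
    """Render components and data flows as Graphviz DOT with region labels."""
    lines = ["digraph dataflow {"]
    for name, region in nodes.items():
        # "\\n" writes a literal backslash-n, which DOT renders as a line break.
        lines.append(f'  "{name}" [label="{name}\\n({region})"];')
    for src, dst in edges:
        lines.append(f'  "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical pipeline: component name -> region where it runs.
nodes = {
    "api-gateway": "eu-central-1",
    "llm-inference": "eu-central-1",
    "vector-db": "eu-central-1",
    "backup-store": "eu-west-1",
}
edges = [
    ("api-gateway", "llm-inference"),
    ("llm-inference", "vector-db"),
    ("vector-db", "backup-store"),
]

print(to_dot(nodes, edges))
```

The output can be rendered with any Graphviz-compatible tool and checked into version control next to the infrastructure code.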

Processing Activity Records (Article 30)

GDPR requires written records of processing activities, including:

  • Categories of data processed
  • Purpose of processing
  • Recipients (including cloud providers)
  • Transfer safeguards
  • Retention periods
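Keeping these records as structured data makes them easier to review and export for auditors. The following is an illustrative sketch: the `ProcessingRecord` fields mirror the list above, and every value in the example is invented for demonstration.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class ProcessingRecord:
    activity: str
    data_categories: List[str]
    purpose: str
    recipients: List[str]      # including cloud providers
    transfer_safeguards: str
    retention_period: str


# Hypothetical record for a support chatbot.
record = ProcessingRecord(
    activity="Support chatbot (RAG)",
    data_categories=["name", "email address", "support ticket text"],
    purpose="Answering customer support requests",
    recipients=["Cloud provider (EU region)"],
    transfer_safeguards="None required: all processing stays in the EU/EEA",
    retention_period="12 months after ticket closure",
)

print(json.dumps(asdict(record), indent=2))
```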

Technical Measures Documentation

  • Region-locking configurations (SCPs, policies)
  • Encryption implementation details
  • Access control architecture
  • Audit logging configuration
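As an example of a region-locking configuration, an AWS service control policy can deny requests outside approved regions using the `aws:RequestedRegion` condition key. The sketch below builds such a policy document in Python; the `region_lock_scp` helper is hypothetical, and the list of exempted global services is illustrative and would need to match what your organisation actually uses.

```python
import json


def region_lock_scp(allowed_regions, exempt_actions):
    """Build an SCP that denies actions in regions outside the allowed set."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideAllowedRegions",
                "Effect": "Deny",
                # Global services are exempted so they keep working.
                "NotAction": sorted(exempt_actions),
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {
                        "aws:RequestedRegion": sorted(allowed_regions)
                    }
                },
            }
        ],
    }


policy = region_lock_scp(
    allowed_regions={"eu-central-1", "eu-west-1"},
    # Assumption: an illustrative exemption list, not a complete one.
    exempt_actions={"iam:*", "organizations:*", "sts:*", "support:*"},
)
print(json.dumps(policy, indent=2))
```

Storing the generated JSON alongside the rest of the technical documentation gives auditors the exact policy text rather than a description of it.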

Common Mistakes

1. Assuming “EU region” means complete data residency

Cloud providers may route metadata, DNS queries, or control plane traffic through non-EU infrastructure. Review your provider’s data processing documentation carefully.

2. Using SaaS tools without checking their data location

That AI evaluation tool, annotation platform, or analytics service might process data in the US. Audit every tool in your pipeline.

3. Developer access from non-EU locations

If your engineers SSH into EU servers from India or the US, they are accessing personal data from a non-EU jurisdiction, and regulators can treat that remote access as an international transfer. Implement technical controls (bastion hosts, VPN with EU endpoints) or put proper transfer safeguards in place.
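One building block for such controls is an allowlist of the networks that your EU bastion hosts or VPN endpoints use. The sketch below shows the check itself; `EU_ACCESS_NETWORKS` and `is_allowed_source` are hypothetical, and the CIDR ranges are reserved documentation addresses rather than real infrastructure.

```python
import ipaddress

# Assumption: egress ranges of your EU bastion hosts and VPN endpoints.
# These are documentation-only ranges used as placeholders.
EU_ACCESS_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_allowed_source(ip: str) -> bool:
    """True if the connection originates from an approved EU access network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in EU_ACCESS_NETWORKS)


for source in ("192.0.2.17", "203.0.113.5"):
    verdict = "allow" if is_allowed_source(source) else "deny"
    print(f"{source}: {verdict}")
```

In practice the same rule usually lives in a firewall or security group; the code form is mainly useful for testing and for auditing access logs after the fact.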

4. Forgetting about model training data

If you fine-tune models on EU personal data, the training process and resulting model weights must also stay in the EU.

5. CDN caching

Content delivery networks may cache AI responses containing personal data at edge locations worldwide. Configure your CDN to exclude API responses or restrict edge locations to EU POPs.
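On the application side, API responses can tell shared caches not to store them through the standard `Cache-Control` header. This is a minimal sketch: the `api_response_headers` function and the five-minute lifetime for non-personal responses are assumptions, and CDN-specific settings still need to be reviewed separately.

```python
def api_response_headers(contains_personal_data: bool) -> dict:
    """Choose caching headers based on whether the response holds personal data."""
    if contains_personal_data:
        # "no-store" tells every cache, including CDN edges, not to keep a copy.
        return {"Cache-Control": "private, no-store"}
    # Assumption: non-personal responses may be cached briefly.
    return {"Cache-Control": "public, max-age=300"}


print(api_response_headers(contains_personal_data=True))
print(api_response_headers(contains_personal_data=False))
```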

The Cost of Getting It Wrong

GDPR fines are substantial, and the largest to date was for unlawful international transfers:

  • Meta (2023): €1.2 billion for transferring EU user data to the US
  • Amazon (2021): €746 million for GDPR violations in how it processed personal data
  • WhatsApp (2021): €225 million for transparency failures

For smaller companies, fines are proportional but still significant – the upper tier, which covers unlawful transfers, is capped at €20 million or 4% of global annual turnover, whichever is higher.
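To put the ceiling in concrete terms: under Article 83(5) GDPR the maximum is the higher of €20 million or 4% of global annual turnover. The sketch below computes that cap only; `max_fine_eur` is a hypothetical helper, and actual fines are set case by case by the supervisory authority.

```python
def max_fine_eur(global_annual_turnover_eur: int) -> float:
    """Upper-tier GDPR cap: the higher of EUR 20 million or 4% of turnover."""
    return max(20_000_000, global_annual_turnover_eur * 4 / 100)


# Below EUR 500 million turnover, the fixed EUR 20 million cap is the higher one.
print(max_fine_eur(100_000_000))
print(max_fine_eur(1_000_000_000))
```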

Beyond fines, data residency failures can:

  • Disqualify you from enterprise procurement processes
  • Trigger data protection authority investigations
  • Damage trust with EU customers and partners
  • Require costly re-architecture under time pressure

Getting Data Residency Right from Day One

The cheapest time to implement data residency is at the start of a project. Retrofitting data residency into an existing AI system typically costs 3-5x more than building it in from the beginning – you’re re-architecting infrastructure, migrating data, and re-certifying compliance documentation.

For a broader view of compliance requirements for AI in Europe, read our EU AI Act compliance checklist and our guide to GDPR-compliant GenAI integration.


At HASORIX, we architect AI systems for data residency from day one – EU-hosted infrastructure, encryption everywhere, and documentation your DPO will actually use. Let’s talk about your project.