“Where does our data live?” is the first question every European CTO, DPO, and procurement team asks when evaluating AI solutions. And it’s the right question – because for AI systems processing personal data of EU residents, data residency isn’t optional. It’s a legal requirement with real consequences.
This guide covers what data residency means for AI systems, what the law actually requires, and how to make practical infrastructure decisions that satisfy compliance without over-engineering.
What Data Residency Actually Means
Data residency refers to the physical location and legal jurisdiction in which data is stored and processed. For GDPR purposes, this covers:
- Storage at rest – Where are your databases, file systems, and backups physically located?
- Processing – Where does computation happen? This includes LLM inference, vector search, and data transformation.
- Transit – What path does data take between components?
- Backups and replicas – Where are your disaster recovery copies?
A common mistake: assuming that using an “EU region” on a cloud provider automatically guarantees full data residency. It doesn’t – you need to verify that every component in your pipeline, including logging, monitoring, and backups, stays within the jurisdiction.
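One way to make that verification concrete is to keep an inventory of every component and check each region against an allow-list. The sketch below is illustrative only: the component names, the `Component` type, and the `ALLOWED_REGIONS` set are assumptions, not a real provider API.

```python
from dataclasses import dataclass

# Assumption: the regions you treat as in-jurisdiction. Adjust to your own policy.
ALLOWED_REGIONS = frozenset(
    {"eu-central-1", "eu-west-1", "eu-west-3", "eu-north-1", "eu-south-1"}
)


@dataclass(frozen=True)
class Component:
    name: str    # hypothetical component label
    role: str    # "storage", "processing", "logging", "backup", ...
    region: str  # region the component actually runs in


def find_residency_violations(components, allowed=ALLOWED_REGIONS):
    """Return the components whose region is outside the allowed set."""
    return [c for c in components if c.region not in allowed]


# Hypothetical pipeline inventory, including the easily forgotten parts.
pipeline = [
    Component("documents-db", "storage", "eu-central-1"),
    Component("llm-inference", "processing", "eu-central-1"),
    Component("app-logs", "logging", "us-east-1"),
    Component("nightly-backup", "backup", "eu-west-1"),
]

for c in find_residency_violations(pipeline):
    print(f"{c.name} ({c.role}) runs in {c.region}")
```

Running a check like this in CI keeps the inventory honest as new services are added.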
The Legal Framework
GDPR Articles 44-49: International Transfers
GDPR Chapter V (Articles 44-49) governs the transfer of personal data to countries outside the EU/EEA. The key rules:
Adequacy decisions (Article 45): Some countries have been deemed to offer adequate data protection. Transfers to these countries are permitted without additional safeguards. Current adequacy countries include:
- UK, Switzerland, Japan, South Korea, Canada (commercial), Israel, New Zealand, Argentina, Uruguay
- USA (only for organisations self-certified under the EU-US Data Privacy Framework, in force since July 2023)
Standard Contractual Clauses (Article 46): For transfers to non-adequate countries, you can use EU-approved contractual clauses. However, since the Schrems II ruling, you must assess each transfer case by case and may need supplementary technical measures on top of the clauses.
Binding Corporate Rules (Article 47): For intra-group transfers within multinational companies. Complex to set up but provides ongoing transfer authorisation.
The Schrems II Ruling
The 2020 Schrems II ruling invalidated the EU-US Privacy Shield and raised the bar for international data transfers. Key implications for AI systems:
- Transfer Impact Assessments (TIAs) are required – You must assess whether the destination country’s laws provide adequate protection
- Technical measures may be needed – Encryption, pseudonymisation, and contractual guarantees
- US government surveillance laws (FISA 702, EO 12333) were specifically flagged as problematic
The EU-US Data Privacy Framework (2023) addressed some Schrems II concerns for US transfers, but its long-term stability is uncertain. Engineering teams should design for the possibility that the framework could be challenged again.
Practical Implication
The safest approach for AI systems processing EU personal data is to keep everything within the EU/EEA. With no third-country transfer, there is no need for transfer impact assessments, additional safeguards, or ongoing monitoring of adequacy decisions.
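The rules in this section can be summarised as a small decision function. This is a deliberately simplified sketch: the `ADEQUATE` set only mirrors the countries named in this article (it is not the complete list), the function name and parameters are hypothetical, and the real assessment is a legal one.

```python
# Countries named in this article as having an adequacy decision.
# Assumption: this is not the full list, and Canada covers commercial
# organisations only. Verify against the Commission's current list.
ADEQUATE = frozenset({
    "UK", "Switzerland", "Japan", "South Korea", "Canada",
    "Israel", "New Zealand", "Argentina", "Uruguay",
})


def transfer_basis(destination: str, *, in_eea: bool = False,
                   dpf_certified: bool = False) -> str:
    """Return the likely Chapter V mechanism for a destination (simplified)."""
    if in_eea:
        return "no transfer: Chapter V does not apply"
    if destination == "USA":
        # Adequacy applies only to recipients certified under the framework.
        if dpf_certified:
            return "adequacy (Article 45, Data Privacy Framework)"
        return "SCCs or BCRs (Articles 46-47) plus transfer impact assessment"
    if destination in ADEQUATE:
        return "adequacy (Article 45)"
    return "SCCs or BCRs (Articles 46-47) plus transfer impact assessment"


print(transfer_basis("Germany", in_eea=True))
print(transfer_basis("Japan"))
print(transfer_basis("USA", dpf_certified=False))
```

The first branch is the reason EU-only hosting is the simplest option: every other branch carries assessment and monitoring work.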
EU Cloud Regions for AI Workloads
AWS
| Region | Location | AI Services Available |
|---|---|---|
| eu-central-1 | Frankfurt, Germany | Bedrock, SageMaker, Comprehend, Textract, Lambda |
| eu-west-1 | Ireland | Full service availability |
| eu-west-2 | London | Most services (note: UK is post-Brexit, separate adequacy) |
| eu-west-3 | Paris | Limited AI services |
| eu-north-1 | Stockholm | Growing AI service availability |
| eu-south-1 | Milan | Limited AI services |
Recommendation: eu-central-1 (Frankfurt) for AI workloads. It offers broad AI service availability, and hosting in Germany is often read by customers and auditors as a signal of compliance intent, although GDPR applies equally in every EU region.
For a detailed technical guide on building AI pipelines on AWS Frankfurt, see our AWS GDPR-compliant AI pipeline guide.
Microsoft Azure
| Region | Location | AI Services Available |
|---|---|---|
| West Europe | Netherlands | Azure OpenAI, Cognitive Services, ML |
| North Europe | Ireland | Azure OpenAI, Cognitive Services, ML |
| Germany West Central | Frankfurt | Azure OpenAI (limited), ML |
| France Central | Paris | Cognitive Services, ML |
| Sweden Central | Gävle | Azure OpenAI, ML |
Recommendation: West Europe (Netherlands) or Sweden Central for Azure OpenAI workloads.
Google Cloud
| Region | Location | AI Services Available |
|---|---|---|
| europe-west1 | Belgium | Vertex AI, Cloud AI APIs |
| europe-west2 | London | Vertex AI (limited) |
| europe-west3 | Frankfurt | Vertex AI, Cloud AI APIs |
| europe-west4 | Netherlands | Vertex AI, Cloud AI APIs |
| europe-north1 | Finland | Limited AI services |
Recommendation: europe-west1 (Belgium) or europe-west3 (Frankfurt) for Vertex AI workloads.
Architecture Decisions for Data Residency
Decision 1: External LLM APIs vs. Self-Hosted Models
| Factor | External APIs (Bedrock, Azure OpenAI) | Self-Hosted (SageMaker, GKE) |
|---|---|---|
| Data residency | EU region available, but data processed by cloud provider | Full control – data never leaves your infrastructure |
| DPA required | Yes – covering the managed AI service | Yes – covering the underlying infrastructure only |
| Model quality | Frontier models available | Open-source models (Llama, Mistral) |
| Cost | Pay per token | Fixed infra cost (GPU instances) |
| Compliance documentation | Provider supplies DPA, security docs | You document everything yourself |
For most enterprises: Start with managed EU-hosted APIs (Bedrock in eu-central-1, Azure OpenAI in West Europe). The data residency guarantees are contractually backed, and the providers handle SOC 2, ISO 27001, and other certifications.
For highly regulated industries (healthcare, finance): Consider self-hosted models for the most sensitive workloads, with managed APIs for less sensitive tasks.
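The split between managed APIs and self-hosted models can be expressed as a routing rule. The sketch below assumes a two-level sensitivity classification and two backend labels, all of which are hypothetical names chosen for illustration.

```python
from enum import Enum


class Sensitivity(Enum):
    STANDARD = "standard"
    HIGH = "high"  # e.g. patient records, account data


def choose_backend(sensitivity: Sensitivity, regulated_industry: bool) -> str:
    """Pick an inference backend following the guidance above.

    Assumption: "self-hosted-eu" and "managed-eu-api" are placeholder
    labels for whatever endpoints you actually run.
    """
    if regulated_industry and sensitivity is Sensitivity.HIGH:
        return "self-hosted-eu"
    return "managed-eu-api"


print(choose_backend(Sensitivity.HIGH, regulated_industry=True))
print(choose_backend(Sensitivity.STANDARD, regulated_industry=True))
```

Keeping this decision in one function makes it easy to audit and to tighten later, for example by sending all high-sensitivity workloads to the self-hosted backend regardless of industry.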
Decision 2: Vector Database Location
If you’re building RAG (Retrieval-Augmented Generation) pipelines, your vector database contains embeddings derived from your data. Embeddings can sometimes be inverted to recover information about the source text, so treat them as personal data whenever the source data was personal.
EU-hosted options:
- Managed: Amazon OpenSearch Service (eu-central-1), Azure AI Search, formerly Azure Cognitive Search (West Europe), Vertex AI Vector Search, formerly Matching Engine (europe-west1)
- Self-hosted: Weaviate, Qdrant, Milvus on EU instances
- SaaS: Check provider’s EU region availability (varies – Pinecone has limited EU presence)
Decision 3: Monitoring and Logging
Often overlooked: your monitoring and logging infrastructure must also respect data residency.
- Datadog – EU data centre available (EU1)
- Sentry – EU data hosting available
- Grafana Cloud – EU-hosted option available
- AWS CloudWatch – stays in-region by default, but verify cross-region replication is disabled
- ELK Stack – self-hosted on EU instances
Decision 4: Backups and Disaster Recovery
Your DR strategy must keep data within EU jurisdictions:
- Configure cross-region replication only to other EU regions
- Ensure backup encryption keys are also in EU regions (AWS KMS, Azure Key Vault)
- Test your DR runbooks to verify no data leaks to non-EU regions during failover
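The first two checks can be automated with very little code. The sketch below assumes AWS-style region names and the standard ARN layout (`arn:partition:service:region:account-id:resource`). The `EU_REGIONS` set and the function names are hypothetical, and the key ARNs use a placeholder account ID.

```python
# Assumption: the regions you accept as DR targets.
EU_REGIONS = frozenset({"eu-central-1", "eu-west-1", "eu-west-3", "eu-north-1"})


def region_from_arn(arn: str) -> str:
    """Extract the region field from an ARN."""
    parts = arn.split(":", 5)
    if len(parts) < 6 or parts[0] != "arn":
        raise ValueError(f"not a valid ARN: {arn!r}")
    return parts[3]


def check_backup_plan(replica_regions, kms_key_arn, allowed=EU_REGIONS):
    """Return a list of residency problems; an empty list means none found."""
    problems = [f"replica in {r}" for r in replica_regions if r not in allowed]
    key_region = region_from_arn(kms_key_arn)
    if key_region not in allowed:
        problems.append(f"KMS key in {key_region}")
    return problems


# Placeholder key ARNs for illustration.
eu_key = "arn:aws:kms:eu-central-1:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab"
us_key = "arn:aws:kms:us-east-1:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab"

print(check_backup_plan(["eu-west-1"], eu_key))
print(check_backup_plan(["eu-west-1", "us-east-1"], us_key))
```

A static check like this complements, but does not replace, an actual failover test.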
Data Residency Documentation
Your DPO and auditors will need documentation proving data residency. Prepare:
Data Flow Diagrams
Visual maps showing:
- Where data enters your system
- Every service that processes it (with region annotations)
- Where data is stored at rest
- Network paths between components
- External services and their data processing locations
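Diagrams drift out of date quickly when drawn by hand, so one option is to generate them from a machine-readable inventory. The sketch below emits Graphviz DOT text with a region annotation on every node and highlights nodes outside an allowed set. The node names, regions, and `to_dot` helper are hypothetical.

```python
def to_dot(nodes: dict[str, str], flows: list[tuple[str, str]],
           allowed_regions: frozenset[str]) -> str:
    """Render a data flow as Graphviz DOT; out-of-jurisdiction nodes are red."""
    lines = ["digraph dataflow {"]
    for name, region in nodes.items():
        color = "black" if region in allowed_regions else "red"
        lines.append(f'  "{name}" [label="{name} ({region})", color="{color}"];')
    for src, dst in flows:
        lines.append(f'  "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical system: component name -> region.
nodes = {
    "api": "eu-central-1",
    "vector-db": "eu-central-1",
    "analytics-saas": "us-east-1",
}
flows = [("api", "vector-db"), ("api", "analytics-saas")]

dot = to_dot(nodes, flows, frozenset({"eu-central-1", "eu-west-1"}))
print(dot)
```

The output can be rendered with the Graphviz `dot` tool and attached to the documentation pack.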
Processing Activity Records (Article 30)
GDPR requires written records of processing activities, including:
- Categories of data processed
- Purpose of processing
- Recipients (including cloud providers)
- Transfer safeguards
- Retention periods
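Keeping these records as structured data makes them easier to review and export. The sketch below models one record with the fields listed above. The class name, field names, and example values are assumptions for illustration, not a prescribed Article 30 schema.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ProcessingRecord:
    activity: str
    data_categories: list[str]
    purpose: str
    recipients: list[str]     # including cloud providers
    transfer_safeguards: str  # "none (EU-only)" if nothing leaves the EU/EEA
    retention_period: str


# Hypothetical example entry.
record = ProcessingRecord(
    activity="Support ticket summarisation",
    data_categories=["name", "email address", "ticket text"],
    purpose="Summarise customer tickets for support agents",
    recipients=["Cloud provider (eu-central-1)"],
    transfer_safeguards="none (EU-only processing)",
    retention_period="12 months",
)

print(json.dumps(asdict(record), indent=2))
```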
Technical Measures Documentation
- Region-locking configurations (SCPs, policies)
- Encryption implementation details
- Access control architecture
- Audit logging configuration
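As an example of a region-locking configuration, the sketch below builds an AWS service control policy that denies requests outside a set of regions, using the `aws:RequestedRegion` condition key. The list of exempted global services in `NotAction` is an assumption and must be adapted to the services you actually use. The helper name is hypothetical.

```python
import json


def region_lock_scp(allowed_regions):
    """Build an SCP document denying API calls outside the allowed regions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideAllowedRegions",
                "Effect": "Deny",
                # Assumption: global services exempted so they keep working.
                "NotAction": [
                    "iam:*", "sts:*", "organizations:*",
                    "route53:*", "cloudfront:*", "support:*",
                ],
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {
                        "aws:RequestedRegion": sorted(allowed_regions)
                    }
                },
            }
        ],
    }


policy = region_lock_scp({"eu-west-1", "eu-central-1"})
print(json.dumps(policy, indent=2))
```

Storing the generated JSON in version control gives auditors a dated record of the control.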
Common Mistakes
1. Assuming “EU region” means complete data residency
Cloud providers may route metadata, DNS queries, or control plane traffic through non-EU infrastructure. Review your provider’s data processing documentation carefully.
2. Using SaaS tools without checking their data location
That AI evaluation tool, annotation platform, or analytics service might process data in the US. Audit every tool in your pipeline.
3. Developer access from non-EU locations
If your engineers SSH into EU servers from India or the US, they are accessing personal data from a non-EU jurisdiction, and regulators generally treat such remote access as an international transfer. Implement technical controls (bastion hosts, VPN with EU endpoints) or put proper transfer safeguards in place.
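A bastion or VPN control usually comes down to an allow-list of source networks. The sketch below checks a client address against such a list using Python's standard `ipaddress` module. The networks shown are reserved documentation ranges standing in for your real EU egress ranges.

```python
import ipaddress

# Assumption: placeholder ranges (RFC 5737 documentation addresses)
# standing in for your EU bastion or VPN egress networks.
EU_ACCESS_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/25"),
]


def is_allowed_source(ip: str, networks=EU_ACCESS_NETWORKS) -> bool:
    """Return True if the address falls inside an approved access network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)


print(is_allowed_source("203.0.113.10"))    # inside the first range
print(is_allowed_source("198.51.100.200"))  # outside the /25
```

Note that a source address shows where the connection terminates, not where the engineer is sitting, so this control only works if the VPN or bastion itself is the sole entry point.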
4. Forgetting about model training data
If you fine-tune models on EU personal data, the training process and resulting model weights must also stay in the EU.
5. CDN caching
Content delivery networks may cache AI responses containing personal data at edge locations worldwide. Configure your CDN to exclude API responses or restrict edge locations to EU POPs.
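Excluding API responses from caching is typically done with the `Cache-Control` response header. The sketch below picks headers by path, assuming that everything under a hypothetical `/api/` prefix may contain personal data. The function name and the one-hour lifetime for static content are illustrative choices.

```python
def cache_headers(path: str, api_prefix: str = "/api/") -> dict[str, str]:
    """Choose Cache-Control headers so API responses are never cached."""
    if path.startswith(api_prefix):
        # "private" forbids shared caches such as CDNs from storing the
        # response; "no-store" forbids storing it anywhere.
        return {"Cache-Control": "private, no-store"}
    # Static assets without personal data can be cached at the edge.
    return {"Cache-Control": "public, max-age=3600"}


print(cache_headers("/api/chat/completions"))
print(cache_headers("/static/app.js"))
```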
The Cost of Getting It Wrong
GDPR fines are substantial. The largest to date was for unlawful international transfers, and other violations have drawn heavy penalties too:
- Meta (2023): €1.2 billion for transferring EU user data to the US
- Amazon (2021): €746 million for unlawful processing of personal data (not transfer-related)
- WhatsApp (2021): €225 million for transparency failures
For smaller companies, fines scale with size but remain significant: the upper tier allows up to €20 million or 4% of global annual turnover, whichever is higher.
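The cap arithmetic can be sketched as follows, assuming the upper tier of GDPR Article 83(5): EUR 20 million or 4% of worldwide annual turnover, whichever is higher. The function name is hypothetical, and the result is the statutory maximum, not a prediction of an actual fine.

```python
def max_gdpr_fine_eur(global_annual_turnover_eur: int) -> int:
    """Upper-tier statutory cap: the higher of EUR 20M or 4% of turnover."""
    four_percent = global_annual_turnover_eur * 4 // 100
    return max(20_000_000, four_percent)


print(max_gdpr_fine_eur(100_000_000))    # 4% is 4M, so the 20M floor applies
print(max_gdpr_fine_eur(1_000_000_000))  # 4% is 40M, which exceeds 20M
```

For any company with turnover below EUR 500 million, the EUR 20 million figure is the binding cap.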
Beyond fines, data residency failures can:
- Disqualify you from enterprise procurement processes
- Trigger data protection authority investigations
- Damage trust with EU customers and partners
- Require costly re-architecture under time pressure
Getting Data Residency Right from Day One
The cheapest time to implement data residency is at the start of a project. Retrofitting data residency into an existing AI system typically costs 3-5x more than building it in from the beginning – you’re re-architecting infrastructure, migrating data, and re-certifying compliance documentation.
For a broader view of compliance requirements for AI in Europe, read our EU AI Act compliance checklist and our guide to GDPR-compliant GenAI integration.
See data residency requirements by industry:
- Data residency for healthcare AI — Patient data and clinical systems
- Data residency for fintech — Financial data and regulatory requirements
- Data residency for SaaS — Multi-tenant EU-hosted infrastructure
At HASORIX, we architect AI systems for data residency from day one – EU-hosted infrastructure, encryption everywhere, and documentation your DPO will actually use. Let’s talk about your project.