Training in the Sovereign Cloud, Deploying at the Edge: Part 1

New AI regulations in the US and EU, along with data privacy laws like the EU’s GDPR and rulings like Schrems II, are indirectly affecting where AI models can be trained and where inference can occur. These laws dictate how (and whether) AI data can move between jurisdictions, with data residency requirements in countries like China (under the 2017 Cybersecurity Law and 2021 Data Security Law) further restricting data transfers. Organizations must carefully consider where they train and run their AI models, particularly when personal data is involved, to ensure compliance with data processing regulations.

These factors require location-aware dedicated clouds or edge computing to ensure compliance without sacrificing performance. Location-aware clouds facilitate AI model training and inference in jurisdictions where data residency, sovereignty, and privacy laws are observed. Edge computing allows for distributing AI workloads across multiple geographic locations, ensuring compliance with local laws and regulations while minimizing latency.

This approach is essential for businesses operating in multiple countries, as it ensures data sovereignty and residency requirements are met without having to sacrifice speed, agility, or data protection. Additionally, it provides flexibility by decentralizing infrastructure, which can be more cost-effective than building large, dedicated data centers.

Let’s dig into these terms in more detail to understand why your AI training and inference locations matter and how you can easily control them.

What Does AI Training in a Sovereign Cloud Mean?

A sovereign cloud is a specific kind of private cloud that focuses on geographical location and compliance with local laws and regulations. The different principles of sovereign clouds relate to AI training in the following ways:

  • Data sovereignty refers to the principle that data is subject to the laws of the country where it is generated. For AI, this means that the training data must comply with the regulations of the originating country. For example, under the EU’s General Data Protection Regulation (GDPR), the location where data is generated or collected affects how it can be used for AI training. With the introduction of the EU AI Act, AI models are subject to regulation as well, so the source of training data can determine the legal constraints applied to the entire AI lifecycle.
  • Data residency means that data is governed by the laws of the country where it is stored, regardless of where it was generated. This can have a significant impact on AI models, especially when law enforcement or regulatory authorities require access to stored data. For example, if AI training data is stored in a specific country, local authorities may request access to it under that country’s legal frameworks. Keeping AI model training and data storage within a chosen region therefore determines which regulators have the power to enforce their laws over that data.
  • Operational sovereignty focuses on how cloud infrastructure and operational processes are managed, subject to local regulations. It’s not just about data location but also about how systems handle data. In AI training, this means organizations must ensure their operational processes, like data backups, disaster recovery, and system resilience, comply with the laws of the region where the cloud resources are located. Ensuring operational sovereignty allows local authorities access to models and systems during critical situations, further embedding AI operations within the legal structure of the country.
  • Digital sovereignty addresses permission management and access control: who has access to the data and systems, and how that access is audited. In AI training, this means strict controls over who can interact with the models and their training data. Even if data complies with residency and sovereignty laws, poor access control could lead to unauthorized use or breaches. Organizations must therefore enforce stringent access policies and maintain robust audit trails to keep data protected and in line with national security requirements, so that sensitive training data is accessed only by authorized personnel and every access is logged.
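
As a rough illustration, three of the principles above (data sovereignty, data residency, and digital sovereignty) could be combined into a pre-flight check that runs before a training job starts. The sketch below is in Python and rests on assumptions: the `Dataset` and `TrainingPolicy` classes, the allow-list structure, the user names, and the country pairings are all hypothetical and invented for this example. In practice the rules would come from legal review, not from a dictionary in code.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Dataset:
    name: str
    origin_country: str   # where the data was generated (data sovereignty)
    storage_country: str  # where the data is stored (data residency)


@dataclass
class TrainingPolicy:
    # Hypothetical allow-list: origin country -> countries where the data
    # may be stored and where training may run.
    allowed_regions: dict[str, set[str]]
    # Hypothetical set of users permitted to start training jobs.
    authorized_users: set[str]
    # Every decision is appended here (digital sovereignty: audit trail).
    audit_log: list[dict] = field(default_factory=list)

    def can_train(self, user: str, dataset: Dataset, training_country: str) -> bool:
        allowed = self.allowed_regions.get(dataset.origin_country, set())
        decision = (
            user in self.authorized_users
            and training_country in allowed
            and dataset.storage_country in allowed
        )
        # Log the attempt whether or not it was allowed.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "dataset": dataset.name,
            "training_country": training_country,
            "allowed": decision,
        })
        return decision


# Illustrative values only; the country pairing is not a statement of law.
policy = TrainingPolicy(
    allowed_regions={"DE": {"DE", "FR"}},
    authorized_users={"alice"},
)
records = Dataset("customer-records", origin_country="DE", storage_country="DE")

print(policy.can_train("alice", records, "FR"))  # authorized user, allowed region
print(policy.can_train("alice", records, "US"))  # region not on the allow-list
print(policy.can_train("bob", records, "DE"))    # user not authorized
```

Denied attempts are logged alongside approved ones, since an audit trail that records only successes would not show who tried to train on data they should not have touched.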

AI training in a sovereign cloud ensures compliance with local data laws, safeguards sensitive information, and provides organizations with full control over their operations. AI models can be developed securely and legally, reducing regulatory risks and helping to ensure that sensitive data is handled appropriately.

What Are the Options for Sovereign Cloud AI Training?

To enforce location-bound AI training, organizations have two main options: a dedicated private cloud data center within the country of operation or the use of edge computing to ensure training happens at the location closest to the data’s origin.

  1. Dedicated private cloud data centers: Running a private data center within the country provides organizations with complete control over their AI training environment. This option adheres to sovereign cloud principles by ensuring all training data remains within the country’s borders, which can be crucial for compliance with local regulations. While the costs of building and maintaining a dedicated data center can be significant, this approach offers centralized management, minimizing the need for software refactoring or adaptations. For large-scale AI projects or those requiring heightened security and compliance, the centralized control of a dedicated cloud can be a worthwhile investment, ensuring data never leaves the country while keeping the AI training process localized and secure.
  2. Edge computing: Edge computing is a more cost-effective alternative, offering a distributed approach to AI training and data storage. It enables organizations to train AI models closer to the source of the data without building an entire data center. Although edge computing requires software to be refactored to handle its decentralized nature, it offers significant flexibility. Training and storing data at edge locations within the country of origin ensures compliance with data sovereignty and residency laws while reducing latency and improving performance. For organizations looking to balance regulatory compliance with scalability and cost, edge computing provides a flexible and efficient solution for location-bound AI training.
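
To make the edge option more concrete, here is a minimal Python sketch of one way a scheduler might choose where to train: keep only the edge sites inside the data's country of origin, then pick the one with the lowest latency. The `EdgeSite` type, the `pick_training_site` function, and the site names and latency figures are assumptions made up for illustration; they do not describe any particular provider's API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EdgeSite:
    name: str
    country: str
    latency_ms: float  # measured latency from the data source to this site


def pick_training_site(sites: list[EdgeSite], data_country: str) -> Optional[EdgeSite]:
    """Return the lowest-latency site in the data's country, or None."""
    in_country = [s for s in sites if s.country == data_country]
    if not in_country:
        # No compliant site: refuse rather than fall back to another country.
        return None
    return min(in_country, key=lambda s: s.latency_ms)


# Hypothetical sites and latencies.
sites = [
    EdgeSite("edge-a", "DE", 12.0),
    EdgeSite("edge-b", "DE", 7.5),
    EdgeSite("edge-c", "NL", 3.0),
]

chosen = pick_training_site(sites, "DE")
print(chosen.name if chosen else "no in-country site available")
```

Note that the fastest site overall (`edge-c`) is skipped because it sits in a different country. The compliance filter runs before the latency comparison, so performance is optimized only among locations that already satisfy the residency constraint.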

Read on for part two on the benefits of deploying AI models at the edge.
