
Exploring the Benefits of Cloud Development

  • By Gcore
  • October 31, 2023
  • 9 min read

Cloud development allows you to write, debug, and run code directly in the cloud infrastructure, rather than working in the local environment and subsequently uploading to the cloud. This streamlines the development process, allowing you to deploy and update applications faster. This article explains what cloud development is, what tools can help you with it, and how you can use cloud development to develop and update your application to meet your customers’ needs.

What Is Cloud Development?

Cloud development is the practice of creating and managing applications that run on remote servers, allowing users to access them over the internet.

Every application is made up of different types of services, such as backend services, frontend services, and monitoring services. Normally, without cloud development, creating a new service or updating an existing service means writing and running your code in the local environment first. After ensuring your service works as expected, you then push your code to the cloud environment and run it there. Finally, you publish your service and integrate it with the app. This process is time-consuming and requires sufficiently powerful computing resources to run the service on the local machine.

This is when cloud development proves its value. With cloud development, you write your code directly in the cloud infrastructure. After you have finished writing your code, publishing your service takes just one click.

Diagram comparing the difference in steps between traditional development vs. cloud development

Useful Cloud Development Tools

To apply cloud development to your projects, several tools are required to build and manage the code efficiently:

  • Code editor to help you write code more efficiently
  • Version control tool to manage the code changes
  • Compiler/interpreter to run your code
  • Publisher to allow public use of your application

Let’s learn about each of the tools in more detail.

Code Editor

A code editor is a software tool that supports code highlighting, easy navigation within the codebase, test execution, and debugging capabilities. When you’re working on applications that are hosted in a cloud environment, it’s essential for your code editor to support remote access because the code you’re working on is stored on a remote cloud server.

Remote access support enables you to establish an SSH connection between your local code editor and the remote server. You can then use your code editor to create, view, and modify code files as if they were local.

Popular code editors—like Visual Studio Code and JetBrains IDE—have features that support remote development. For example, with Visual Studio Code, you can install Microsoft’s “Remote – SSH” extension to enable remote code access via SSH. Once the extension is installed, you can connect to the remote server by entering its IP address, username, and password, and work on your cloud-based projects just as easily as you would local ones.
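
If you connect over SSH frequently, you can store the connection details in your local SSH configuration so the editor can reuse them. Below is a minimal sketch of such an entry; the host alias, IP address, and username are placeholders rather than values from any particular setup:

# ~/.ssh/config: hypothetical entry for a remote development server
Host cloud-dev
    HostName 203.0.113.10        # replace with your server's IP address
    User developer               # replace with your SSH username
    IdentityFile ~/.ssh/id_ed25519

With this in place, the Remote – SSH extension (or a plain ssh cloud-dev command) can open the project on the server by its alias instead of the raw IP address.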

Below is an example of using Visual Studio Code to access code via a remote machine.

Example of using Visual Studio Code to access code via a remote machine

Version Control

In software development, it’s common for many people to work on the same codebase. A version control tool lets you review who changed specific lines of code, and when, so that you can trace any problem back to its source. It also lets you revert your code to an earlier version if new changes introduce bugs.

There are several version control tools out there, such as Git, SVN, and Mercurial; Git is currently the most popular. Git is an open-source version control system for managing changes to your code. It is distributed, meaning that you create a local copy of the repository on your machine, then create branches, add files, commit, and merge locally. When your code is ready to ship, you push it to the Git repository on the server.
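
For illustration, a typical local-then-push Git workflow might look like the following; the branch and file names are hypothetical:

git checkout -b feature/search        # create and switch to a new local branch
git add search.js                     # stage the changed file
git commit -m "Add search endpoint"   # record the change locally
git push origin feature/search        # publish the branch to the shared repository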

Compiler/Interpreter

Beyond tools that help you write and track changes to your code, the next essential tool for running code in the cloud is a compiler or interpreter. Depending on the programming language or runtime you are working with, you need one or the other to translate your code into machine code, allowing the computer to understand and execute your instructions. Generally speaking, the compiler or interpreter builds your code into an executable form. Let’s look at each in turn to understand their differences.

Compiler

Before a service actually runs, a compiler translates the high-level code you have written into low-level code. For example, a Java compiler first compiles source code into bytecode, which the Java Virtual Machine then interprets and converts into machine code for execution. Compilation adds time up front because the compiler analyzes the entire source code, but it also catches syntax errors before the program runs, so you spend less time debugging your service once it’s running.

Programming languages that use compilers include Java, C#, and Go.
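
As a quick illustration, compiling and running a hypothetical Main.java with the standard JDK tools looks like this:

javac Main.java   # the compiler produces bytecode (Main.class)
java Main         # the JVM executes the bytecode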

Interpreter

Unlike a compiler, an interpreter translates the code into machine code only when the service runs. There is no separate compilation step; the code is executed right away. However, an application using an interpreter is often slower than one using a compiler because the interpreter executes the code line by line.

Programming languages that use interpreters include Python, JavaScript, and Ruby.
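
By contrast, an interpreted service is started directly from its source file, for example a hypothetical service.js run with Node.js:

node service.js   # the runtime reads and executes the source directly, with no separate build step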

Publisher

To allow other users to access your service, you need a publisher tool. This manages the following key aspects of your service:

  • Configuring the network
  • Creating the domain name
  • Managing the scalability

Network Configuration

To allow users to access your service, the network configuration is crucial. The method for making your service available online varies based on your technology stack. For instance, if you use the Next.js framework to build your web application, you can choose Vercel to deploy your application code.
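
For example, with the Vercel CLI a deployment is a couple of commands run from the project directory, and the platform handles the network setup for you:

npm install -g vercel   # install the Vercel CLI
vercel                  # create a preview deployment from the current directory
vercel --prod           # promote the project to a production deployment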

You can also customize the behavior of your application with network configuration. Here’s an example of how to use the vercel.json file to redirect requests from one path to another:

{  "redirects": [    { "source": "/book", "destination": "/api/book", "statusCode": 301 }  ]}

Domain Setting

Every service requires a URL for applications to interact with it. However, using direct IP addresses as URLs can be complex and unwieldy, so it’s advisable to assign a domain name to your service, like www.servicedomain.com. Various platforms, such as GoDaddy or Squarespace, offer domain registration services for this purpose.
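
Once the domain is registered, you point it at your service by creating DNS records with your registrar. Below is a sketch in zone-file notation, using the placeholder domain from above and an example IP address:

servicedomain.com.        IN  A      203.0.113.10         ; domain -> service IP address
www.servicedomain.com.    IN  CNAME  servicedomain.com.   ; www alias -> root domain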

Scalability

To allow your service to handle more requests from your users, you need to define a scalability mechanism for your services. This way, your service will automatically scale according to the workload. Scalability also keeps costs in check; you pay for what you use, rather than wasting money by allocating resources based on peak usage.

Below is an example definition file for applying autoscaling to your service, using Kubernetes HorizontalPodAutoscaler.

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: appdeploy
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70

How to Run Code in the Cloud

Now that you are familiar with the tools you need for cloud development, let’s learn about how to run code in the cloud. There are two ways to run code in the cloud: using virtual machines or using containers. We explain the difference in depth in our dedicated article, but let’s review their relevance to cloud development here.

Virtual Machines

A virtual machine (VM) is like a computer that runs within a computer. It mimics a standalone system, with its own virtual hardware and software. Since a VM is separate from its host computer, you can pick the VM operating system that suits your needs without affecting your host’s system. Plus, its isolation offers an extra layer of security: if one VM is compromised, the others remain unaffected.

Architecture of a VM, which includes a guest OS

While VMs offer versatility in terms of OS choices for cloud development, scaling applications on VMs tends to be more challenging and costly compared to using containers. This is because each VM runs a full operating system, leading to higher resource consumption and longer boot-up times. Containers, on the other hand, share the host OS and isolate only the application environment, making them more lightweight and quicker to scale up or down.

Containers

A container is a software unit that contains a set of software packages and other dependencies. Since it uses the host operating system’s kernel and hardware, it doesn’t possess its own dedicated resources as a virtual machine does. As a result, it’s more lightweight and takes less time to start up. For instance, an e-commerce application can have thousands of containers for its backend and frontend services. This allows the application to easily scale out when needed by increasing the number of containers for its services.
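
For example, if the services run on Kubernetes, scaling out is a single command against a hypothetical deployment named frontend:

kubectl scale deployment frontend --replicas=20   # run 20 frontend containers instead of the current count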

Architecture of a container, which is more lightweight than VM architecture due to the lack of guest OS

Using containers for cloud code enables efficient resource optimization and ease of scaling due to their lightweight nature. However, you have limited flexibility in choosing the operating system, as most containers are Linux-based.

Cloud Development Guide

We’ve addressed cloud development tools and ways to run code in the cloud. In this section, we offer a step-by-step guide to using cloud development for your project.

Check Computing Resources

Before anything else, you’ll need the correct computing resources to power your service. This includes deciding between virtual machines and containers. If your service tends to have a fixed number of user requests every day, or it needs a specific operating system like macOS or Windows in order to run, go with virtual machines. If you expect your service to experience a wide range in the number of user requests and want it to scale to optimize operational costs, go with containers.

After choosing between virtual machines and containers, you need to allocate computing resources for use. The important resources that you need to consider are CPUs, RAM, disk volumes, and GPUs. The specifications for these resources can vary significantly depending on the service you’re developing. For instance, if you’re building a monitoring service with a one-year data retention plan, you’ll need to allocate disk volumes of approximately 100GB to store all generated logs and metrics. If you’re building a service to apply deep learning models to large datasets, you’ll require not only a powerful CPU and ample RAM, but also a GPU.
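
If your service runs in containers, one common way to express such an allocation is through resource requests and limits in the container spec. The following is an illustrative sketch only; the values are placeholders, and the GPU line assumes the cluster has the NVIDIA device plugin installed:

resources:
  requests:
    cpu: "2"           # guaranteed CPU cores
    memory: 8Gi        # guaranteed RAM
  limits:
    cpu: "4"
    memory: 16Gi
    nvidia.com/gpu: 1  # request a GPU only if the workload needs one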

Install Software Packages and Dependencies

After preparing the computing resources, you’ll next install the necessary software and dependencies. The installation process varies depending on whether you’re using virtual machines or containers.

As a best practice, you should set up the mechanism to install the required dependencies automatically upon initialization of the virtual machine or container. This ensures that your service has all the necessary dependencies to operate immediately upon deployment. Additionally, it facilitates easy redeployment to a different virtual machine or container, if necessary. For example, if you want to install software packages and dependencies for an Ubuntu virtual machine to host a Node.js service, you can configure cloud-init scripts for the deployed virtual machine as below:

#cloud-config
...
apt:
  sources:
    docker.list:
      source: deb [signed-by=$KEY_FILE] https://deb.nodesource.com/node_18.x $RELEASE main
      keyid: 9FD3B784BC1C6FC31A8A0A1C1655A0AB68576280
package_update: true
package_upgrade: true
packages:
  - apt-transport-https
  - ca-certificates
  - gnupg-agent
  - software-properties-common
  - gnupg
  - nodejs
power_state:
  mode: reboot
  timeout: 30
  condition: True

To set up a web service on containers using Node.js, you’ll need to install Node along with the required dependencies. Below is a Dockerfile example for doing so:

# Pull the Node.js image version 18 as a base image
FROM node:18
# Set the service directory
WORKDIR /usr/src/app
COPY package*.json ./
# Install service dependencies
RUN npm install
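
Building the image and starting a container can then be done with standard Docker commands; the image name my-service is just a placeholder, and the run command assumes the Dockerfile goes on to copy the application source and define a start command:

docker build -t my-service .         # build the image from the Dockerfile in the current directory
docker run -p 3000:3000 my-service   # start a container and expose port 3000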

Write Code

When you’ve installed the necessary software packages and dependencies, you can begin the fun part: writing the code. You can use code editors that support remote access to write the code for your service directly in the cloud. Built-in debugging tools in these editors can help you to identify any issues during this period.

Below is an example of using IntelliJ to debug a Go service for managing playlists.

Using IntelliJ to debug a Go service

Test Your Service

After you finish writing your code, it’s crucial to test your service. As a best practice, start with unit tests to ensure that individual components work, followed by integration tests to see how your service interacts with existing application services, and finally E2E (end-to-end) tests to assess the overall user experience and system behavior.
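
As a small illustration of the first layer, here is a sketch of a unit test using Node.js’s built-in test runner (available from Node.js 18); the applyDiscount function and file paths are hypothetical:

// test/price.test.js
const test = require('node:test');
const assert = require('node:assert');
const { applyDiscount } = require('../src/price'); // hypothetical module under test

test('applyDiscount reduces the price by the given percentage', () => {
  assert.strictEqual(applyDiscount(100, 10), 90);
});

You can then run the test files with node --test.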

Below is a test pyramid that gives a structured overview of each test type’s coverage. It will help you allocate your testing effort efficiently across unit, integration, and E2E tests for your service.

Test pyramid demonstrates the proportion that should be allocated to each test

Configure Network Settings

To make your service available to users, you need to configure its network settings. This might involve configuring the rules for inbound and outbound data, creating a domain name for your service, or setting a static IP address for your service.

Here is an example of using cloud-init configuration to set a static IP for a virtual machine that hosts your service:

#cloud-config
...
write_files:
  - content: |
      network:
        version: 2
        renderer: networkd
        ethernets:
          enp3s0:
            addresses:
              - 192.170.1.25/24
              - 2020:1::1/64
            nameservers:
              addresses:
                - 8.8.8.8
                - 8.8.4.4
    path: /etc/netplan/00-add-static-ip.yaml
    permissions: 0644
power_state:
  mode: reboot
  timeout: 30
  condition: True

Add Autoscaling Mechanism

With everything in place, it’s time to add an autoscaling mechanism. This adjusts resources based on demand, which will save costs during quiet times and boost performance during busy periods.

Assuming that you use Kubernetes to manage the containers of your service, the following is an example of using Gcore Managed Kubernetes to set the autoscaling mechanism for your Kubernetes cluster:

Configuring a Gcore Managed Kubernetes cluster to enable cluster autoscaling

Set Up Security Enhancement

Finally, ensure your service is secure. Enhancements can range from setting up robust authentication measures to using tools like API gateways to safeguard your service. You can even set up a mechanism to protect your service from malicious activities such as DDoS attacks.

Below is an example of how to apply security protection to your service by creating a resource for your service URL using Gcore Web Security:

Create a web security resource for the service domain to protect it from attacks

Gcore Cloud Development

Accelerating feature delivery through cloud development can offer a competitive edge. However, the initial setup of tools and environments can be daunting—and mistakes in this phase can undermine the benefits.

Here at Gcore, we recognize these obstacles and offer Gcore Function as a Service (FaaS) as a solution. Gcore FaaS eliminates the complexities of setup, allowing you to dive straight into coding without worrying about configuring code editors, compilers, debuggers, or deployment tools. Ideally suited for straightforward services that require seamless integration with existing applications, Gcore FaaS excels in the following use cases:

  • Real-time stream processing
  • Third-party service integration
  • Monitoring and analytics services

Conclusion

Cloud development allows you to deliver your service to users immediately after you’ve finished the coding and testing phases. You can resolve production issues and implement features faster to better satisfy your customers. However, setting up cloud infrastructure can be time-intensive and ideally requires a team of experienced system administrators to build and maintain it.

With Gcore FaaS, you don’t have to take on that challenge yourself. You can focus on writing code, and we’ll handle the rest—from configuring pods and networking to implementing autoscaling. Plus, you are billed only for the time your customers actually use your app, ensuring cost-effective operation.

Want to try out Gcore FaaS to see how it works? Get started for free.
