
Managed Kubernetes with GPU Worker Nodes for Faster AI/ML Inference

  • By Gcore
  • November 23, 2023
  • 6 min read

Currently, 48% of organizations use Kubernetes for AI/ML workloads, and demand for these workloads increasingly shapes how Kubernetes is used. Let’s look at the key technical reasons behind this trend, how AI/ML workloads benefit from running on GPU worker nodes in managed K8s clusters, and some considerations regarding GPU vendors and scheduling.

Why Kubernetes is Good for AI/ML

A number of features make Kubernetes popular and effective in the AI/ML realm:

  • Scalability. K8s enables seamless, on-demand scalability of AI/ML workloads. This is especially critical for inference workloads, whose resource utilization is both more dynamic than that of training workloads and potentially intensive: they often need to scale up or down quickly based on the volume of data being processed.
  • Automated scheduling. The ability to automatically schedule AI/ML workloads reduces the operational overhead for MLOps teams. It also improves the performance of AI/ML applications by ensuring they are scheduled to the nodes that have the required resources.
  • Resource utilization. K8s can help to optimize physical resource utilization for AI/ML workloads. It can dynamically and automatically allocate the required amounts of CPU, GPU, and RAM resources. This is critical due to the resource-intensive nature of these workloads and the potential for cost reduction.
  • Flexibility. With K8s, you can deploy AI/ML workloads across multiple infrastructures, including on-premises, public cloud, and edge cloud. This feature also makes Kubernetes a good option for organizations that need to deploy AI/ML workloads in hybrid or multicloud environments.
  • Portability. You can easily migrate Kubernetes-based AI/ML applications between different environments and installations. This is critical for deploying and managing AI/ML workloads in a hybrid infrastructure.

Use Cases

Here are some examples of how companies have adopted Kubernetes (K8s) for their AI/ML projects:

  • OpenAI has been an early adopter of K8s. In 2017, the company was running machine learning experiments on K8s clusters. With the K8s autoscaler, OpenAI could deploy such a project in a few days and scale it up to hundreds of GPUs in a week or two. Without the Kubernetes autoscaler, such a process would take months. As a result, OpenAI increased the number of AI experiments tenfold. In 2021, the company expanded its K8s infrastructure to 7,500 nodes for large ML models such as GPT-3, DALL-E and CLIP.
  • Shell uses Kubeflow, a K8s-based platform, to run tests and quickly experiment with ML models on laptops. Engineers can move these workloads from the test environment to production, and the workloads function just the same. With Kubernetes, Shell builds thousands of ML models in two hours instead of a month. The time to write the underlying code is reduced from two weeks to four hours.
  • IKEA has developed an internal MLOps platform based on K8s to train ML models on-premises and get inference in the cloud. This allows the MLOps team to orchestrate different types of trained models and, ultimately, improve the customer experience.

Of course, these examples are not broadly representative. Most companies are not fully AI-focused like OpenAI and are not as large as IKEA. They can’t afford to train large AI/ML models from scratch, which takes time and money, but instead run pretrained models and integrate them with other internal services. In other words, these companies use AI/ML inference, not training.

Inference workloads tend to be more dynamic regarding resource utilization than training workloads because production clusters are more likely to experience user and traffic spikes. In such cases, the infrastructure needs to scale up and down quickly, whereas AI/ML training typically requires gradual scaling. Therefore, for AI/ML models that are already trained and deployed, the scalability and dynamic resource utilization of K8s are especially beneficial.
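
As a sketch of what this dynamic scaling looks like in practice, the manifest below uses a HorizontalPodAutoscaler to scale a hypothetical `inference-server` Deployment with load; the Deployment name and thresholds are illustrative assumptions, not part of any example above:

```yaml
# Minimal sketch: autoscale a hypothetical inference Deployment between
# 1 and 10 replicas based on average CPU utilization.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: inference-server-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: inference-server  # hypothetical inference workload
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70  # add replicas when average CPU passes 70%
```

In production, GPU-backed inference is often scaled on custom metrics instead, such as request queue depth or GPU utilization, but the mechanism is the same.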

Why GPU Is Better than CPU for Worker Nodes

GPU worker nodes are a better fit for containerized AI/ML workloads than CPU worker nodes, for the same reasons as with non-containerized workloads: GPUs offer parallel processing capabilities and higher performance for AI/ML than CPUs.

Inference for AI/ML workloads running on GPU worker nodes can be faster than those running on CPU worker nodes due to the following factors:

  • The GPU’s memory architecture is specifically optimized for AI/ML processing, enabling higher memory bandwidth than CPUs.
  • GPUs often provide better computational performance than CPUs for AI/ML training and inference because they devote more of their transistors to data processing.

Kubernetes adds its own performance benefits to those of GPUs. In addition to hardware acceleration, AI/ML workloads running on GPU worker nodes get scalability and dynamic resource allocation. Kubernetes also includes plugins for GPU vendor support, making it easy to configure GPU resources for use by AI/ML workloads.

Figure 1. Simplified K8s cluster architecture with a GPU worker node

With Kubernetes, you can manage GPU resources across multiple worker nodes. Containers consume GPU resources in essentially the same way as they consume CPU resources.

GPU Vendor Comparison

Three GPU vendors are available for Kubernetes: NVIDIA, AMD, and Intel. When choosing a GPU vendor for worker nodes, keep in mind that compatibility with Kubernetes, tools ecosystem, performance, and cost all vary.

| | NVIDIA GPU worker nodes | AMD GPU worker nodes | Intel GPU worker nodes |
|---|---|---|---|
| Compatibility with K8s | Excellent | Good | Good |
| Tools ecosystem | Excellent | Good | Fair |
| Performance | Excellent | Good | Fair |
| Cost | High | Medium | Medium |

Let’s compare the three vendors.

  • Compatibility with Kubernetes: NVIDIA is the most compatible with K8s. The company provides CUDA drivers, various container runtimes, and other tools and features that simplify GPU integration and management. AMD’s and Intel’s K8s support is less mature and often requires custom configuration.
  • Tools ecosystem: NVIDIA has the best ecosystem of tools for AI/ML, thanks to software such as the GPU Operator and Container Toolkit, and ML frameworks adapted for NVIDIA GPUs, such as TensorFlow, PyTorch, and MXNet. AMD and Intel also have tools for AI/ML, but they are not as comprehensive as NVIDIA’s.
  • Performance: NVIDIA GPUs are known for their high performance on AI workloads, outperforming the competition on most MLPerf benchmarks. NVIDIA GPUs are ideal for demanding tasks such as deep learning and high-performance computing.
  • Cost: NVIDIA GPUs are the most expensive type of GPU worker node.
  • Flexibility: NVIDIA offers several features that make its GPU-based K8s clusters highly flexible in terms of management and resource utilization compared to its competitors:
    • A multi-instance GPU (MIG) mechanism for the NVIDIA A100 GPU, which allows a GPU to be securely partitioned into up to seven separate instances for better GPU utilization
    • Multicloud GPU clusters, which can be seamlessly managed and scaled as if deployed in a single cloud
    • Heterogeneous GPU and CPU clusters to simplify the training and management of distributed deep learning models
    • GPU metrics monitoring with Prometheus and visualization with Grafana
    • Support for multiple container runtimes, including Docker, CRI-O, and containerd

In summary, NVIDIA GPU worker nodes are the best choice for AI/ML workloads in Kubernetes. They offer the best compatibility with K8s, the best tools ecosystem, and the best performance. That’s why we chose NVIDIA GPUs for Gcore Managed Kubernetes. Our customers get all the benefits of NVIDIA, including the highest performance level for faster training and inference of their AI/ML workloads.

Important Specifics of GPU Scheduling in Kubernetes

To enable GPU scheduling and allow pods to access GPU resources, you need to install the device plugin from your chosen GPU vendor: NVIDIA, AMD, or Intel.

Pods request GPU resources in the same way they request CPU resources. However, Kubernetes is less flexible with GPUs than with CPUs when it comes to configuring `limits` and `requests`. With `requests`, you set the minimum amount of resources a pod is guaranteed to get; with `limits`, you set the maximum amount it may consume. When configuring a pod manifest for GPUs, `limits` and `requests` must be equal (if you specify only `limits`, Kubernetes uses that value as the request), meaning a pod won’t get more resources than guaranteed even if, for example, the application needs them.
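
To make this concrete, here’s a minimal sketch of a pod manifest requesting one NVIDIA GPU. It assumes the NVIDIA device plugin is installed in the cluster; the pod name and container image are hypothetical:

```yaml
# Minimal sketch: a pod that requests one whole NVIDIA GPU.
# Assumes the NVIDIA device plugin is installed; names are illustrative.
apiVersion: v1
kind: Pod
metadata:
  name: inference-pod
spec:
  containers:
    - name: inference
      image: nvcr.io/nvidia/pytorch:23.10-py3  # any CUDA-enabled image works
      resources:
        limits:
          nvidia.com/gpu: 1  # whole GPUs only; the request defaults to this limit
```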

Also, by default, you can’t allocate a fraction of a GPU to a container or share one GPU between multiple containers. Unlike CPU allocation, GPUs are allocated only in whole units, each exclusively to a single container. This limitation doesn’t help with resource economics. But NVIDIA has managed to overcome it. With its GPUs, you can use either:

  • Time-sharing GPUs, which work by sequentially assigning time intervals to the containers sharing a physical GPU. This works for all NVIDIA GPUs.
  • Multi-instance GPUs, which allow a GPU to be divided into up to seven instances for better GPU utilization. This only works with the NVIDIA A100 GPU.

These two features help you to use NVIDIA GPU resources more efficiently and save money on renting GPU instances in the cloud. This is also a significant advantage over other GPU vendors.
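
As an illustration of the first option, the NVIDIA device plugin accepts a time-slicing configuration along these lines, typically supplied through a ConfigMap; the replica count is an arbitrary example:

```yaml
# Minimal sketch of an NVIDIA device plugin time-slicing config.
# Each physical GPU is advertised as 4 schedulable nvidia.com/gpu units;
# the replica count of 4 is an illustrative choice, not a recommendation.
version: v1
sharing:
  timeSlicing:
    resources:
      - name: nvidia.com/gpu
        replicas: 4
```

With MIG, the partitions can instead be exposed as dedicated resource names (for example, `nvidia.com/mig-1g.5gb`), which pods request in the same way as `nvidia.com/gpu`.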

Managed Kubernetes vs. Vanilla Kubernetes with GPU

A managed Kubernetes service can offer several advantages over vanilla (open source) Kubernetes for AI/ML workloads running on GPU worker nodes:

  • Flexible choice of GPUs. Managed K8s services typically provide support for GPU instances with various specifications. This makes it easier to choose the appropriate level of GPU acceleration for your AI/ML workloads.
  • Reduced operational overhead. Managed Kubernetes handles the everyday responsibilities of overseeing a Kubernetes cluster, like managing the control plane and implementing K8s updates. This enables you to focus on creating, deploying and managing AI/ML applications.
  • Scalability and reliability. Managed K8s services are typically designed with a strong focus on scalability and reliability, ensuring that your AI/ML workloads can adeptly handle fluctuating traffic and spikes in resource demand.

Gcore Managed Kubernetes with NVIDIA GPU Workers

Gcore Managed Kubernetes helps you to deploy Kubernetes clusters fast, without the need to maintain the underlying infrastructure and Kubernetes backend. The Gcore team controls the master nodes while you control only the worker nodes, reducing your operational burden. Worker nodes can be Gcore Virtual Machines or Bare Metal servers in various configurations, including those with NVIDIA GPU modules.

Conclusion

Managed Kubernetes with GPU worker nodes is a powerful and flexible combination for accelerating AI/ML inference. By taking advantage of both Kubernetes and GPUs, managed Kubernetes with GPU worker nodes can help you improve the performance and efficiency of your AI/ML workloads. The service also frees you from the need to maintain the underlying GPU infrastructure and most Kubernetes components.

Gcore Managed Kubernetes can boost your AI/ML workloads with GPU worker nodes on Bare Metal for faster inference and operational efficiency. We offer a 99.9% SLA with free production management and free egress traffic—at outstanding value for money.

Explore Managed Kubernetes

