We value our clients, which is why we entrust their support only to experienced and friendly technical specialists who help with connecting, integrating, configuring, and maintaining our products.
Technical support principles and client feedback
1. Don't keep clients waiting: respond within 10 minutes of a request.
2. Respond with clear, detailed instructions.
If a client struggles to figure something out, we don't just send a link to the knowledge base; we explain everything in detail.
Everything is prompt and clear, thanks
Videonow
Over the last 2–3 months I have been actively communicating with Gcore’s support by email, and I was really impressed by their approach: complete answers, not just scraps of professional information, but clear instructions for someone who is only getting into the subject. Working with the CDN is extremely comfortable. Requests about download and upload speed, cache, Purge, the API, and other topics also get resolved easily.
Wargaming
3. Keep the client informed at every stage of processing their request.
We continuously monitor the status of origin sources and immediately inform clients about the performance of their resources.
The most common messages are:
- The client’s origin source is unavailable (5xx errors).
- A recommendation to change the current settings of the client’s CDN resource or other products.
- A warning that Storage space is running out.
- A notification about changes to current options.
- An offer to subscribe to the Status Page for current restrictions and system updates.
We also analyze clients’ control panel activity and always offer assistance if we notice that they cannot set up content delivery themselves.
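As a rough illustration of the kind of origin monitoring described above, here is a minimal Python sketch that probes a list of hypothetical origin URLs and reports 5xx responses or connection failures; our production monitoring is, of course, far more extensive.

```python
import requests

# Hypothetical origin URLs, used only for illustration.
ORIGINS = [
    "https://origin1.example.com/health",
    "https://origin2.example.com/health",
]

def check_origins(urls):
    """Return origins that look unavailable (5xx responses or connection errors)."""
    unavailable = []
    for url in urls:
        try:
            response = requests.get(url, timeout=5)
            if response.status_code >= 500:
                unavailable.append((url, response.status_code))
        except requests.RequestException as error:
            unavailable.append((url, str(error)))
    return unavailable

for origin, reason in check_origins(ORIGINS):
    print(f"Origin unavailable, notify the client: {origin} ({reason})")
```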
Hello. OK. That is not so urgent. Thank you for updating.
Benjamin Y Kweon, RedFox Games
4. Be responsive and helpful in any situation.
I think our conversation is going very well.
Today is the first contact and we’ll see what’s happening next
Angola Cables
It is because Irina gave me excellent support… Thanks Irina
Each of our clients can be sure that we will provide qualified help around the clock, even on holidays and weekends, in English, and soon in Chinese.
Related articles

10 cybersecurity trends set to shape 2025
The 2025 cybersecurity landscape is increasingly complex, driven by sophisticated cyber threats, increased regulation, and rapidly evolving technology. In 2025, organizations will be challenged with protecting sensitive information for their customers while continuing to provide seamless and easy user experiences. Here’s a closer look at ten emerging challenges and threats set to shape the coming year.

1. The rise of zero-day vulnerabilities

Zero-day vulnerabilities are still one of the major threats in cybersecurity. By definition, these faults remain unknown to software vendors and the larger security community, leaving systems exposed until a fix can be developed. Attackers are using zero-day exploits frequently and effectively, affecting even major companies, hence the need for proactive measures.

Advanced threat actors use zero-day attacks to achieve goals including espionage and financial crimes. Organizations can mitigate these risks with continuous monitoring and advanced detection systems that identify exploit attempts by their behavior. Beyond detection, sharing threat intelligence across industries about emerging zero-days has become paramount for staying ahead of adversaries. Addressing zero-day threats requires balancing response agility with prevention through secure software coding, patching, and updating.

2. AI as a weapon for attackers

The dual-use nature of AI has created a great deal of risk for organizations as cybercriminals increasingly harness the power of AI to perpetrate highly sophisticated attacks. AI-powered malware can change its behavior in real time, which means it can evade traditional methods of detection and find and exploit vulnerabilities with uncanny precision. Automated reconnaissance tools let attackers compile granular intelligence about a target’s systems, employees, and defenses at unprecedented scale and speed. AI use also reduces the planning time for an attack.

For example, AI-generated phishing campaigns use advanced natural language processing to craft extremely personal and convincing emails that increase the chances of successful breaches. Deepfake technology adds a layer of complexity by allowing attackers to impersonate executives or employees with convincing audio and video for financial fraud or reputational damage.

Traditional security mechanisms may fail to detect and respond to the adaptive and dynamic nature of AI-driven attacks, leaving organizations open to significant operational and financial impacts. To stay secure in the face of AI threats, organizations should look to AI-enhanced security solutions.

3. AI as the backbone of modern cybersecurity

Artificial intelligence is rapidly becoming a mainstay in cybersecurity. From handling and processing large volumes of data to detecting even minute anomalies and predicting further threats, AI is taking the fight against cybercrime to new levels of effectiveness. It’s likely that in 2025, AI will become integral to all aspects of cybersecurity, from threat detection and incident response to strategy formulation.

AI systems are particularly good at parsing complex datasets to uncover patterns and recognize vulnerabilities that might otherwise go unnoticed. They also excel at performing routine checks, freeing human security teams to focus on more difficult and creative security tasks, and removing the risk of human error or oversight in routine, manual work.
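To make the anomaly-detection idea concrete, here is a minimal sketch, not a production detector, that flags request rates deviating strongly from a baseline using a simple z-score; the traffic numbers and threshold are hypothetical, and real systems use far richer features and learned models.

```python
from statistics import mean, stdev

def zscore_anomalies(baseline, window, threshold=3.0):
    """Flag values in `window` that deviate strongly from the `baseline` traffic."""
    mu, sigma = mean(baseline), stdev(baseline)
    return [
        (i, value) for i, value in enumerate(window)
        if sigma and abs(value - mu) / sigma >= threshold
    ]

# Hypothetical requests-per-minute samples: a normal baseline, then a sudden spike.
baseline = [120, 118, 125, 130, 122, 119, 127, 124, 126, 121]
recent = [123, 119, 900, 125]

print(zscore_anomalies(baseline, recent))  # flags the spike at index 2
```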
4. The growing complexity of data privacy

Integrating regional and local data privacy regulations such as GDPR and CCPA into the cybersecurity strategy is no longer optional. Companies need to look out for regulations that will become legally binding for the first time in 2025, such as the EU’s AI Act. In 2025, regulators will continue to impose stricter guidelines related to data encryption and incident reporting, including in the realm of AI, reflecting rising concerns about online data misuse.

Decentralized security models, such as blockchain, are being considered by some companies to reduce single points of failure. Such systems offer enhanced transparency to users and allow them much more control over their data. When combined with a zero-trust approach that verifies every request, these strategies help harden both privacy and security.

5. Challenges in user verification

Verifying user identities has become more challenging as browsers enforce stricter privacy controls and attackers develop more sophisticated bots. Modern browsers are designed to protect user privacy by limiting the amount of personal information websites can access, such as location, device details, or browsing history. This makes it harder for websites to determine whether a user is legitimate or malicious. Meanwhile, attackers create bots that behave like real users by mimicking human actions such as typing, clicking, or scrolling, making them difficult to detect using standard security methods.

Although AI has added an additional layer of complexity to user verification, AI-driven solutions are also the most reliable way to identify these bots. These systems analyze user behavior, history, and context in real time, enabling businesses to adapt security measures with minimal disruption to legitimate users.

6. The increasing importance of supply chain security

Supply chain security breaches are on the rise, with attackers exploiting vulnerabilities in third-party vendors to infiltrate larger networks. Monitoring of these third-party relationships is often insufficient. Most companies do not know all the third parties that handle their data and personally identifiable information (PII), and almost all companies are connected to at least one third-party vendor that has experienced a breach. This lack of oversight poses significant risks, as supply chain attacks can have cascading effects across industries.

Unsurprisingly, even prominent organizations fall victim to attacks via their suppliers’ vulnerabilities. For example, in a recent attack on Ford, attackers exploited the company’s supply chain to insert malicious code into Ford’s systems, creating a backdoor that the attackers could use to expose sensitive customer data.

In 2025, organizations will need to prioritize investing in solutions that can vet and monitor their supply chain. AI-driven and transparency-focused solutions can help identify vulnerabilities in even the most complex supply chains. Organizations should also examine SLAs to select suppliers that maintain strict security protocols themselves, thereby creating ripples of improved security further down the ecosystem.

7. Balancing security and user experience

One of the biggest challenges in cybersecurity is finding a balance between tight security and smooth usability. Overly strict security measures may irritate legitimate users, while lax controls invite attackers in. In 2025, as the cyberthreat landscape becomes more sophisticated than ever before, businesses will have to navigate that tension with even greater precision.

Context-aware access management systems offer a way forward. These systems take into account user behavior, location, and device type to make intelligent, risk-based decisions about access control.
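Here is a minimal sketch of that risk-based idea. The signals, weights, and thresholds are purely illustrative; real context-aware access systems use learned models and far more telemetry.

```python
def access_risk_score(context: dict) -> int:
    """Toy risk score built from the kinds of context signals mentioned above."""
    score = 0
    if context.get("new_device"):
        score += 40
    if context.get("country") not in context.get("usual_countries", []):
        score += 30
    if context.get("failed_logins_last_hour", 0) > 3:
        score += 30
    return score

def access_decision(context: dict) -> str:
    score = access_risk_score(context)
    if score >= 70:
        return "deny"
    if score >= 40:
        return "step-up authentication"  # e.g., prompt for MFA
    return "allow"

print(access_decision({
    "new_device": True,
    "country": "DE",
    "usual_countries": ["DE"],
    "failed_logins_last_hour": 1,
}))
# -> "step-up authentication" (score 40: unfamiliar device only)
```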
8. Cloud security and misconfiguration risks

As organizations continue to move their services toward the cloud, new risks will emerge. Some of the most frequent causes of data breaches are misconfigurations of cloud environments: missing access controls, storage buckets that are not secured, or inefficient implementation of security policies.

Cloud computing’s benefits need to be balanced by close monitoring and secure configurations in order to prevent the exposure of sensitive data. This requires an organization-wide cloud security strategy: continuous auditing, proper identity and access management, and automated tools and processes to detect misconfigurations before they become security incidents. Teams will need to be educated on best practices in cloud security and shared responsibility models to mitigate these risks.
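As one narrow example of the kind of automated misconfiguration check described above, here is a sketch using the AWS boto3 SDK to flag S3 buckets whose ACLs grant access to all users. A real cloud audit would also cover bucket policies, public access blocks, encryption settings, and many other services.

```python
import boto3

# ACL grantee URIs that indicate a bucket is readable by everyone or by any
# authenticated AWS account. This check is deliberately narrow and illustrative.
PUBLIC_GROUPS = {
    "http://acs.amazonaws.com/groups/global/AllUsers",
    "http://acs.amazonaws.com/groups/global/AuthenticatedUsers",
}

def find_public_buckets():
    s3 = boto3.client("s3")
    public = []
    for bucket in s3.list_buckets()["Buckets"]:
        acl = s3.get_bucket_acl(Bucket=bucket["Name"])
        for grant in acl["Grants"]:
            if grant["Grantee"].get("URI") in PUBLIC_GROUPS:
                public.append(bucket["Name"])
                break
    return public

if __name__ == "__main__":
    for name in find_public_buckets():
        print(f"Potentially public bucket: {name}")
```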
9. The threat of insider attacks

Insider threats are expected to intensify in 2025 due to the continued rise of remote work, AI-powered social engineering, and evolving data privacy concerns. Remote work environments expand the attack surface, making it easier for malicious insiders or negligent employees to expose sensitive data or create access points for external attackers.

AI-driven attacks, such as deepfake impersonations and convincing phishing scams, are also likely to become more prevalent, making insider threats harder to detect. The widespread adoption of AI tools also raises concerns about employees inadvertently sharing sensitive data.

To mitigate these risks, companies should adopt a multi-layered cybersecurity approach. Implementing zero-trust security models, which assume no entity is inherently trustworthy, can help secure access points and reduce vulnerabilities. Continuous monitoring, advanced threat detection systems, and regular employee training on recognizing social engineering tactics are essential. Organizations must also enforce strict controls over AI tool usage to keep sensitive information protected while maximizing productivity.

10. Securing the edge in a decentralized world

With edge computing, IT infrastructure processes information closer to the end user, reducing latency significantly and increasing real-time capability. Edge enables innovations such as IoT, autonomous vehicles, and smart cities—major trends for 2025.

But decentralization increases security risk. Many edge devices sit outside centralized security perimeters and may have weak protections, making them prime targets for attackers looking to exploit vulnerable points in a distributed network.

Such environments require protection based on multidimensional thinking. AI-powered monitoring systems analyze data in real time and flag suspicious activity before vulnerabilities are exploited. Automated threat detection and response tools allow an organization to respond immediately and minimize the chances of a breach. Advanced solutions, such as those offered by edge-native companies like Gcore, can strengthen edge devices with powerful encryption and anomaly detection capabilities while preserving high performance for legitimate users.

Shaping a secure future with Gcore

The trends shaping 2025 show the importance of adopting forward-thinking strategies to address evolving threats. From zero-day attacks and automated cybercrime to data privacy and edge computing, the cybersecurity landscape demands increasingly innovative solutions.

Gcore Edge Security is uniquely positioned to help businesses navigate these challenges. By leveraging AI for advanced threat detection, automating compliance processes, and securing edge environments, Gcore empowers organizations to build resilience and maintain trust in an increasingly complex digital world. As cyber threats become more sophisticated, proactive, integrated DDoS and WAAP defenses can help your business stay ahead of emerging threats.

Discover Gcore WAAP

Announcing new tools, apps, and regions for your real-world AI use cases
Three updates, one shared goal: helping builders move faster with AI. Our latest releases for Gcore Edge AI bring real-world AI deployments within reach, whether you’re a developer integrating genAI into a workflow, an MLOps team scaling inference workloads, or a business that simply needs access to performant GPUs in the UK.

MCP: make AI do more

Gcore’s MCP server implementation is now live on GitHub. The Model Context Protocol (MCP) is an open standard, originally developed by Anthropic, that turns AI models into agents that can carry out real-world tasks. It allows you to plug genAI models into everyday tools like Slack, email, Jira, and databases, so your genAI can read, write, and reason directly across systems. Think of it as a way to turn “give me a summary” into “send that summary to the right person and log the action.”

“AI needs to be useful, not just impressive. MCP is a critical step toward building AI systems that drive desirable business outcomes, like automating workflows, integrating with enterprise tools, and operating reliably at scale. At Gcore, we’re focused on delivering that kind of production-grade AI through developer-friendly services and top-of-the-range infrastructure that make real-world deployment fast and easy.” — Seva Vayner, Product Director of Edge Cloud and AI, Gcore

To get started, clone the repo, explore the toolsets, and test your own automations.

Gcore Application Catalog: inference without overhead

We’ve upgraded the Gcore Model Catalog into something even more powerful: an Application Catalog for AI inference. You can still deploy the latest open models with three clicks. But now, you can also tune, share, and scale them like real applications.

We’ve re-architected our inference solution so you can:

- Run prefill and decode stages in parallel
- Share KV cache across pods (it’s not tied to individual GPUs) from August 2025
- Toggle WebUI and secure API independently from August 2025

These changes cut down on GPU memory usage, make deployments more flexible, and reduce time to first token, especially at scale. And because everything is application-based, you’ll soon be able to optimize for specific business goals like cost, latency, or throughput.

Here’s who benefits:

- ML engineers can deploy high-throughput workloads without worrying about memory overhead
- Backend developers get a secure API, no infra setup needed
- Product teams can launch demos instantly with the WebUI toggle
- Innovation labs can move from prototype to production without reconfiguring
- Platform engineers get centralized caching and predictable scaling

The new Application Catalog is available now through the Gcore Customer Portal.
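As a rough illustration of what consuming a secure inference API looks like from a backend developer’s side, here is a generic sketch that posts a prompt to a deployed endpoint. The URL, payload shape, model name, and token are hypothetical placeholders, not Gcore’s actual API; see the Customer Portal for the real endpoint details.

```python
import os
import requests

# Hypothetical endpoint and request schema, shown only to illustrate the pattern.
ENDPOINT = "https://example-inference-endpoint.example.com/v1/chat/completions"
TOKEN = os.environ.get("INFERENCE_API_TOKEN", "")

def ask(prompt: str) -> str:
    response = requests.post(
        ENDPOINT,
        headers={"Authorization": f"Bearer {TOKEN}"},
        json={
            "model": "example-open-model",
            "messages": [{"role": "user", "content": prompt}],
        },
        timeout=30,
    )
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(ask("Summarize today's deployment checklist in three bullet points."))
```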
Chester data center: NVIDIA H200 capacity in the UK

Gcore’s newest AI cloud region is now live in Chester, UK. This marks our first UK location, in partnership with Northern Data. Chester offers 2000 NVIDIA H200 GPUs with BlueField-3 DPUs for secure, high-throughput compute on Gcore GPU Cloud, serving your training and inference workloads. You can reserve your H200 GPU immediately via the Gcore Customer Portal.

This launch solves a growing problem: UK-based companies building with AI often face regional capacity shortages, long wait times, or poor performance when routing inference to overseas data centers. Chester fixes that with immediate availability on performant GPUs.

Whether you’re training LLMs or deploying inference for UK and European users, Chester offers local capacity, low latency, and strong availability.

Next steps

- Explore the MCP server and start building agentic workflows
- Try the new Application Catalog via the Gcore Customer Portal
- Deploy your workloads in Chester for high-performance UK-based compute

Deploy your AI workload in three clicks today!

GPU Acceleration in AI: How Graphics Processing Units Drive Deep Learning
This article discusses how GPUs are shaping a new reality in the hottest subset of AI training: deep learning. We’ll explain the GPU architecture and how it fits AI workloads, why GPUs are better than CPUs for training deep learning models, and how to choose an optimal GPU configuration.

How GPUs Drive Deep Learning

The key GPU features that power deep learning are its parallel processing capability and, at the foundation of this capability, its core (processor) architecture.

Parallel Processing

Deep learning (DL) relies on matrix calculations, which are performed effectively using the parallel computing that GPUs provide. To understand this interrelationship better, let’s consider a simplified training process of a deep learning model. The model takes input data, such as images, and has to recognize a specific object in these images using a correlation matrix. The matrix summarizes a data set, identifies patterns, and returns results accordingly: if the object is recognized, the model labels it “true”; otherwise, it is labeled “false.” Below is a simplified illustration of this process.

Figure 1. A simplified illustration of the DL training process

An average DL model has billions of parameters, each of which contributes to the size of the matrix weights used in the matrix calculations. Each of those billions of parameters must be taken into account, which is why the true/false recognition process requires running billions of iterations of the same matrix calculations. The iterations are not linked to each other, so they can be executed in parallel. GPUs are perfect for handling these types of operations because of their parallel processing capabilities, enabled by devoting more transistors to data processing.

Core Architecture: Tensor Cores

NVIDIA tensor cores are an example of how hardware architecture can effectively adapt to DL and AI. Tensor cores—special kinds of processors—were designed specifically for the mathematical calculations needed for deep learning, while earlier cores were also used for video rendering and 3D graphics. “Tensor” refers to tensor calculations, which are matrix calculations: a tensor is a mathematical object, and a tensor with two dimensions is a matrix. Below is a visualization of how a tensor core calculates matrices.

Figure 2. Volta Tensor Core matrix calculations. Source: NVIDIA

NVIDIA added tensor cores to its GPU chips in 2017, based on the Volta architecture. Volta-based chips, like the Tesla V100 with 640 tensor cores, became the first fully AI-focused GPUs, and they significantly influenced and accelerated the development of the DL industry.

Multi-GPU Clusters

Another GPU feature that drives DL training is the ability to increase throughput by building multi-GPU clusters, where many GPUs work simultaneously. This is especially useful when training large, scalable DL models with billions or trillions of parameters. The most effective approach for such training is to scale GPUs horizontally using interfaces such as NVLink and InfiniBand. These high-speed interfaces allow GPUs to exchange data directly, bypassing CPU bottlenecks.

Figure 3. NVIDIA H100 with NVLink GPU-to-GPU connections. Source: NVIDIA

For example, with the NVLink Switch System, you can connect 256 NVIDIA GPUs in a cluster and get 57.6 TB/s of bandwidth. A cluster of that size can significantly reduce the time needed to train large DL models. Though there are several AI-focused GPU vendors on the market, NVIDIA is the undisputed leader and makes the greatest contribution to DL. This is one of the reasons why Gcore uses NVIDIA chips for its AI GPU infrastructure.
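To see the parallelism argument in practice, here is a small PyTorch sketch that times the same large matrix multiplication on the CPU and, if one is available, on a CUDA GPU. The matrix size is arbitrary, and the exact speedup depends heavily on the hardware.

```python
import time
import torch

def time_matmul(device: str, n: int = 4096) -> float:
    """Time one n x n matrix multiplication on the given device."""
    a = torch.randn(n, n, device=device)
    b = torch.randn(n, n, device=device)
    if device == "cuda":
        torch.cuda.synchronize()      # make sure setup work has finished
    start = time.perf_counter()
    result = a @ b
    if device == "cuda":
        torch.cuda.synchronize()      # wait for the GPU kernel to complete
    return time.perf_counter() - start

print(f"CPU: {time_matmul('cpu'):.3f} s")
if torch.cuda.is_available():
    time_matmul("cuda")               # warm-up run (allocations, kernel setup)
    print(f"GPU: {time_matmul('cuda'):.3f} s")
```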
GPU vs. CPU Comparison

A CPU executes tasks serially. Instructions are completed on a first-in, first-out (FIFO) basis. CPUs are better suited to serial task processing because they can use a single core to execute one task after another. CPUs also have a wider range of possible instructions than GPUs and can perform more kinds of tasks. They interact with more computer components, such as ROM, RAM, BIOS, and input/output ports.

A GPU performs parallel processing, which means it processes tasks by dividing them between multiple cores. The GPU is a kind of advanced calculator: it can only receive a limited set of instructions and execute only graphics- and AI-related tasks, such as matrix multiplication (CPUs can execute them too). GPUs only need to interact with the display and memory. In the context of parallel computing, this is actually a benefit, as it allows a greater number of cores to be devoted solely to these operations. This specialization enhances the GPU’s efficiency in parallel task execution.

An average consumer-grade GPU has hundreds of cores adapted to perform simple operations quickly and in parallel, while an average consumer-grade CPU has 2–16 cores adapted to complex sequential operations. Thus, the GPU is better suited for DL because it provides many more cores to perform the necessary computations faster than the CPU.

Figure 4. An average CPU has 2–16 cores, while an average GPU has hundreds

The parallel processing capabilities of the GPU are made possible by dedicating a larger number of transistors to data processing. Rather than relying on large data caches and complex flow control, GPUs hide memory access latency behind computation. This frees up more transistors for data processing rather than data caching and, ultimately, benefits highly parallel computations.

Figure 5. GPUs devote more transistors to data processing than CPUs. Source: NVIDIA

GPUs also use video DRAM, such as GDDR5 and GDDR6, which is much faster than typical CPU DRAM (DDR3 and DDR4).

How GPU Outperforms CPU in DL Training

DL requires a lot of data to be transferred between memory and cores. To handle this, GPUs have a specially optimized memory architecture that allows for higher memory bandwidth than CPUs, even when GPUs technically have the same or less memory capacity. For example, a GPU with just 32 GB of HBM (high-bandwidth memory) can deliver up to 1.2 TB/s of memory bandwidth and 14 TFLOPS of compute. In contrast, a CPU can have hundreds of GB of RAM, yet deliver only around 100 GB/s of bandwidth and 1 TFLOPS of compute.

Since GPUs are faster in most DL cases, they can also be cheaper to rent. If you know the approximate time you spend on DL training, you can simply check the prices of cloud providers to estimate how much money you will save by using GPUs instead of CPUs.

Depending on the configuration, models, and frameworks, GPUs often provide better performance than CPUs in DL training. Here are some direct comparisons:

- Azure tested various cloud CPU and GPU clusters using the TensorFlow and Keras frameworks for five DL models of different sizes. In all cases, GPU cluster throughput consistently outperformed CPU cluster throughput, with improvements ranging from 186% to 804%.
- Deci compared the NVIDIA Tesla T4 GPU and the Intel Cascade Lake CPU using the EfficientNet-B2 model. They found that the GPU was 3 times faster than the CPU.
- IEEE published the results of a survey on running different types of neural networks on an Intel i5 9th-generation CPU and an NVIDIA GeForce GTX 1650 GPU. When testing CNNs (convolutional neural networks), which are better suited to parallel computation, the GPU was between 4.9 and 8.8 times faster than the CPU. But when testing ANNs (artificial neural networks), the CPU was 1.2 times faster than the GPU. However, GPUs outperformed CPUs as the data size increased, regardless of the NN architecture.

Using CPU for DL Training

The last comparison shows that CPUs can sometimes be used for DL training. Here are a few more examples of this:

- There are CPUs with 128 cores that can process some AI workloads faster than consumer GPUs.
- Some algorithms allow DL model training to be optimized so that it performs better on CPUs. For instance, Rice University’s Brown School of Engineering has introduced an algorithm that makes CPUs 15 times faster than GPUs for some AI tasks.
- There are cases where the precision of a DL model is not critical, like speech recognition under near-ideal conditions without noise or interference. In such situations, you can train a DL model using floating-point weights (FP16, FP32) and then round them to integers. Because CPUs work better with integers than GPUs do, they can be faster, although the results will not be as accurate.

However, using CPUs for DL training is still an unusual practice. Most DL models are adapted for parallel computing, i.e., for GPU hardware. Thus, building a CPU-based DL platform may be both difficult and unnecessary. It can take an unpredictable amount of time to select a multi-core CPU instance and then configure a CPU-adapted algorithm to train your model. By selecting a GPU instance, you get a platform that’s ready to build, train, and run your DL model.

How to Choose an Optimal GPU Configuration for Deep Learning

Choosing the optimal GPU configuration is basically a two-step process:

1. Determine the stage of deep learning you need to execute.
2. Choose a GPU server specification to match.

Note: We’ll only consider specification criteria for DL training, because DL inference (execution of a trained DL model) is, as you’ll see, far less demanding than training.

1. Determine Which Stage of Deep Learning You Need

To choose an optimal GPU configuration, first you must understand which of the two main stages of DL you will execute on GPUs: DL training or DL inference. Training is the main challenge of DL, because you have to adjust a huge number (up to trillions) of matrix coefficients (weights). The process is close to a brute-force search for the combinations that give the best results (though some techniques, such as the stochastic gradient descent algorithm, help reduce the number of computations). Therefore, you need maximum hardware performance for training, and vendors make GPUs specifically designed for it. For example, the NVIDIA A100 and H100 GPUs are positioned as devices for DL training, not for inference.

Once you have calculated all the necessary matrix coefficients, the model is trained and ready for inference. At this stage, a DL model only needs to multiply the input data by the matrix coefficients once to produce a single result—for example, when a text-to-image AI generator creates an image from a user’s prompt. Therefore, inference is always simpler than training in terms of math computations and required computational resources. In some cases, DL inference can be run on desktop GPUs, CPUs, and smartphones. An example is an iPhone with face recognition: its relatively modest GPU with 4–5 cores is sufficient for DL inference.
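As a small illustration of how lightweight inference is compared with training, here is a minimal PyTorch sketch that runs a model in inference mode on the CPU. The model is a made-up stand-in, not a real face recognition network; real inference would load trained weights.

```python
import torch
import torch.nn as nn

# A made-up stand-in model: in practice you would load trained weights,
# e.g. from a checkpoint file or a model hub.
model = nn.Sequential(
    nn.Linear(128, 64),
    nn.ReLU(),
    nn.Linear(64, 2),   # e.g. "match" / "no match"
)
model.eval()            # switch layers like dropout/batch norm to inference behavior

with torch.no_grad():   # no gradients needed: inference just applies the learned weights
    embedding = torch.randn(1, 128)   # pretend this came from a face image
    logits = model(embedding)
    prediction = logits.argmax(dim=1).item()

print("predicted class:", prediction)
```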
2. Choose the GPU Specification for DL Training

When choosing a GPU server or virtual GPU instance for DL training, it’s important to understand what training time is acceptable for you: hours, days, months, etc. To estimate this, you can count the operations in the model or use reported training times and GPU performance figures. Then, decide on the resources you need:

- Memory size is a key feature. You need at least as much GPU RAM as your DL model size. This is sufficient if you are not pressed for time to market, but if you’re under time pressure it’s better to specify sufficient memory plus extra in reserve.
- The number of tensor cores is less critical than the size of the GPU memory, since it only affects the computation speed. However, if you need to train a model faster, then the more cores the better.
- Memory bandwidth is critical if you need to scale GPUs horizontally, for example, when the training time is too long, the dataset is huge, or the model is highly complex. In such cases, check whether the GPU instances support interconnects such as NVLink or InfiniBand.

So, memory size is the most important factor when training a DL model: if you don’t have enough memory, you won’t be able to run the training at all. For example, to run the LLaMA model with 7 billion parameters at full precision, the Hugging Face technical team suggests using 28 GB of GPU RAM. This is the result of multiplying 7×4, where 7 is the number of parameters in billions and 4 is the number of bytes per weight in FP32 (the full-precision format). For FP16 (half precision), 14 GB is enough (7×2). The full-precision format provides greater accuracy; the half-precision format provides less accuracy but makes training faster and more memory efficient.
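The arithmetic above generalizes into a one-line estimate: parameter count multiplied by bytes per weight. Here is a minimal sketch of that rule of thumb; note it covers only the raw weights, so real training needs additional headroom for gradients, optimizer state, and activations.

```python
BYTES_PER_WEIGHT = {"fp32": 4, "fp16": 2, "bf16": 2, "int8": 1}

def min_gpu_memory_gb(params_billion: float, precision: str = "fp32") -> float:
    """Rough lower bound on GPU memory needed just to hold the model weights."""
    total_bytes = params_billion * 1e9 * BYTES_PER_WEIGHT[precision]
    return total_bytes / 1e9  # report in GB, matching the 7B x 4 = 28 GB rule of thumb

print(min_gpu_memory_gb(7, "fp32"))  # -> 28.0, as in the LLaMA 7B example above
print(min_gpu_memory_gb(7, "fp16"))  # -> 14.0
```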
Kubernetes as a Tool for Improving DL Inference

To improve DL inference, you can containerize your model and use a managed Kubernetes service with GPU instances as worker nodes. This will help you achieve greater scalability, resiliency, and cost savings. With Kubernetes, you can automatically scale resources as needed. For example, if the number of user prompts to your model spikes, you will need more compute resources for inference. In that case, more GPUs are allocated for DL inference only when needed, meaning you have no idle resources and no monetary waste.

Managed Kubernetes also reduces operational overhead and helps automate cluster maintenance. The provider manages the master nodes (the control plane); you manage only the worker nodes on which you deploy your model, letting you focus on its development.

AI Frameworks that Power Deep Learning on GPUs

Various free, open-source AI frameworks help train deep neural networks and are specifically designed to run on GPU instances. All of the following frameworks also support NVIDIA’s Compute Unified Device Architecture (CUDA), a parallel computing platform and API that enables the development of GPU-accelerated applications, including DL models. CUDA can significantly improve their performance.

TensorFlow is a library for ML and AI focused on deep learning model training and inference. With TensorFlow, developers can create dataflow graphs: each graph node represents a matrix operation, and each connection between nodes is a matrix (tensor). TensorFlow can be used with several programming languages, including Python, C++, JavaScript, and Java.

PyTorch is a machine-learning framework based on the Torch library. It provides two high-level features: tensor computing with strong acceleration via GPUs, and deep neural networks built on a tape-based auto-differentiation system. PyTorch is considered more flexible than TensorFlow because it gives developers more control over the model architecture.

MXNet is a portable and lightweight DL framework that can be used for DL training and inference not only on GPUs but also on CPUs and TPUs (tensor processing units). MXNet supports Python, C++, Scala, R, and Julia.

PaddlePaddle is a powerful, scalable, and flexible framework that, like MXNet, can be used to train and deploy deep neural networks on a variety of devices. PaddlePaddle provides over 500 algorithms and pretrained models to facilitate rapid DL development.

Gcore’s Cloud GPU Infrastructure

As a cloud provider, Gcore offers AI GPU infrastructure powered by NVIDIA chips:

- Virtual machines and bare metal servers with consumer- and enterprise-grade GPUs
- AI clusters based on servers with A100 and H100 GPUs
- Managed Kubernetes with virtual and physical GPU instances that can be used as worker nodes

With Gcore’s GPU infrastructure, you can train and deploy DL models of any type and size. To learn more about our cloud services and how they can help in your AI journey, contact our team.

Conclusion

The unique design of GPUs, focused on parallelism and efficient matrix operations, makes them the perfect companion for the AI challenges of today and tomorrow, including deep learning. Their profound advantages over CPUs are underscored by their computational efficiency, memory bandwidth, and throughput capabilities.

When seeking a GPU, consider your specific deep learning goals, timeline, and budget. These will help you choose an optimal GPU configuration.

Book a GPU instance


Minecraft and Rust Game Server DDoS Protection: Taking Robust Countermeasures
The online gaming industry is constantly under threat of distributed denial-of-service (DDoS) attacks, as evidenced by the massive attack on Minecraft last year. In the intensely competitive gaming industry, even brief server downtime can lead to significant financial and reputational loss. Users are quick to migrate to rival games, which underscores the critical importance of maintaining server availability for a company’s sustained success. In response to these persistent threats, we have developed robust countermeasures for Minecraft and Rust game servers.

Minecraft DDoS Countermeasure

Our tailored countermeasure for Minecraft servers incorporates an advanced approach to ward off DDoS attacks, aimed at preserving the optimal gaming experience:

- Challenge-response authentication: uses a challenge-response process to authenticate incoming IP addresses.
- Minecraft protocol ping verification: verifies the connection using the Minecraft protocol ping to authenticate IP addresses.
- IP whitelisting: ensures that only legitimate and authorized IP addresses can access the Minecraft game server, mitigating potential DDoS attacks and preserving gameplay for players.

Rust DDoS Countermeasure

Our Rust DDoS countermeasure is built around the robust RakNet protocol. It provides an added level of packet inspection and whitelisting for reinforced protection against DDoS attacks:

- RakNet protocol challenge-response: leverages the built-in challenge-response feature of the RakNet protocol for authentication.
- Game server replacement: temporarily stands in for the game server during authentication, forcing the connecting client to complete the challenge-response successfully.
- Passive packet inspection: examines incoming packets to ensure compliance with the Rust game protocol.
- Whitelisting authorized connections: IP addresses that pass the challenge-response authentication are added to the list of allowed addresses, reinforcing protection against DDoS attacks.

Protect Your Game Servers

With our robust countermeasures, Minecraft and Rust game server operators can fortify their infrastructure against DDoS attacks. By implementing challenge-response authentication and protocol verification, we ensure that only legitimate connections are granted access to the game servers. In doing so, we maintain a secure and uninterrupted gaming experience for players, and give gaming companies and server administrators the confidence of reliable, attack-resistant operations.

Try our DDoS protection for free
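To make the general pattern concrete, here is a deliberately simplified Python sketch of the challenge-response-then-whitelist idea described above. It is not Gcore’s implementation and ignores the real Minecraft and RakNet wire formats; it only shows the logical flow of admitting an IP after a successful challenge.

```python
import secrets

class ChallengeWhitelist:
    """Toy model of challenge-response IP whitelisting in front of a game server."""

    def __init__(self):
        self.pending = {}       # ip -> expected challenge token
        self.whitelist = set()  # ips allowed to reach the real game server

    def start_challenge(self, ip: str) -> str:
        token = secrets.token_hex(8)
        self.pending[ip] = token
        return token            # in reality this rides on the game/RakNet handshake

    def verify(self, ip: str, answer: str) -> bool:
        if self.pending.get(ip) == answer:
            self.whitelist.add(ip)
            del self.pending[ip]
            return True
        return False

    def allow(self, ip: str) -> bool:
        return ip in self.whitelist  # only whitelisted IPs get forwarded

guard = ChallengeWhitelist()
token = guard.start_challenge("203.0.113.7")
guard.verify("203.0.113.7", token)   # a real client echoes the challenge correctly
print(guard.allow("203.0.113.7"))    # True
print(guard.allow("198.51.100.9"))   # False: never completed the challenge
```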
