For more information about AI GPU Cloud Infrastructure, please fill out the form
Configurations and prices
*Prices do not include VAT.
Designed for AI and compute-intensive workloads
With thousands of processing cores, a graphics processing unit (GPU) can perform many matrix operations and other calculations in parallel. As a result, GPUs complete AI training tasks much faster than traditional CPUs.
GPUs easily handle the high computational demands of deep neural networks and recurrent neural networks, which are fundamental to developing complex deep learning models, including generative AI.
Superior GPU performance is well suited for compute-intensive workloads, including dynamic programming algorithms, video rendering, and scientific simulations.
GPUs provide high memory bandwidth and efficient data transfer capabilities. This improves the processing and manipulation of large data sets, enabling faster analysis.
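The data-parallel style of computation described above can be illustrated with a small sketch. This is a conceptual CPU-side example using NumPy: a single vectorized matrix multiply expresses hundreds of thousands of multiply-accumulate operations in one call, which is exactly the kind of workload a GPU spreads across its thousands of cores. The function name `matmul_naive` is ours, for illustration only.

```python
import numpy as np

# A single vectorized matrix multiply expresses 128 * 128 * 128
# multiply-adds in one call. On a GPU, thousands of cores execute
# these operations concurrently; NumPy on a CPU shows the same
# data-parallel formulation at a smaller scale.
rng = np.random.default_rng(seed=0)
a = rng.standard_normal((128, 128))
b = rng.standard_normal((128, 128))

c = a @ b  # one data-parallel operation

# The naive triple loop computes the same result one scalar at a
# time -- the sequential execution model that GPUs avoid.
def matmul_naive(x, y):
    out = np.zeros((x.shape[0], y.shape[1]))
    for i in range(x.shape[0]):
        for j in range(y.shape[1]):
            out[i, j] = np.dot(x[i, :], y[:, j])
    return out
```

The same pattern holds for deep learning: training is dominated by large matrix multiplications, so hardware that parallelizes them well delivers the speedups quoted above.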
The NVIDIA A100 and newer H100 GPUs are at the forefront of the enterprise GPU market. Both are powerful and versatile accelerators for a wide range of AI and high-performance computing (HPC) workloads.
- Up to 249x higher AI inference performance over CPUs
- Up to 20x higher performance than the previous-generation NVIDIA V100 GPU
- 3rd-generation Tensor Cores
- Up to 80GB of HBM2e memory
- Up to 4x higher performance than the A100 GPU for AI training on GPT-3
- Up to 7x higher performance than the A100 GPU for HPC applications
- 4th-generation Tensor Cores
- Up to 80GB of HBM3 memory
Ideal for AI frameworks
NVIDIA GPUs are great for running AI frameworks and tools that help to build, train, and deploy AI models.
Dedicated bare metal GPU servers or virtual GPU instances?
Choose what works for you!
Bare metal GPU servers
Bare metal servers provide direct access to the physical hardware, including the GPU. This means that all GPU resources are dedicated to you. Bare metal GPU gives you optimal performance for AI and compute-intensive workloads.
Virtual GPU instances
For the same configuration, GPUs on VMs may perform slightly slower than those on bare metal servers. But VMs offer easier management, scalability, and lower prices than bare metal GPU servers.
Managed Kubernetes with GPU worker nodes
Features like autoscaling and autohealing make Kubernetes ideal for dynamic workloads, including machine learning, video processing, and other compute-intensive tasks. With Gcore’s Managed Kubernetes, you can use bare metal servers and VMs with GPUs (A100 and H100) as worker nodes. Simply use GPUs in your containers by requesting the custom GPU resource, just as you would request CPU or memory.
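As a sketch of what that resource request looks like, here is a minimal pod spec. It assumes the NVIDIA device plugin is installed on the cluster (which exposes the standard `nvidia.com/gpu` resource name); the pod name and container image are placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-training-job          # placeholder name
spec:
  containers:
    - name: trainer
      image: nvcr.io/nvidia/pytorch:24.01-py3   # example image
      resources:
        limits:
          nvidia.com/gpu: 1      # request one GPU, like CPU or memory
```

The scheduler then places the pod only on a worker node with a free GPU, exactly as it does for CPU and memory requests.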
Take advantage of Gcore Cloud solutions
Use Gcore’s AI cloud infrastructure powered by Graphcore IPUs to accelerate machine learning.
Bare metal servers
Deploy resource-intensive applications and services on high-performance physical servers.
Virtual machines
Leverage production-grade VMs designed for a wide range of workloads and predictable performance.
Managed Kubernetes
Provision, manage, and scale Kubernetes clusters with a 99.9% SLA and support for bare metal nodes.
Frequently Asked Questions
A graphics processing unit (GPU) is a specialized electronic circuit designed to improve the rendering of computer graphics. GPUs are used in various applications, including video games, 3D modeling, and AI training.
GPUs are designed for parallel processing: they execute many instructions simultaneously across thousands of cores. This is the main difference between GPUs and central processing units (CPUs); a CPU has a small number of powerful cores optimized for executing instructions largely one at a time.
You will be charged for the specific configuration that you choose. If you purchase a separate GPU instance that is not part of a Kubernetes cluster, you will be charged for the corresponding VM or bare metal configuration. See the Configurations and prices section above to learn more about our pricing.
It depends on the type of instance you choose: bare metal or a VM. If you choose a bare metal server, all of its resources are dedicated to you.
If you choose a VM, you get virtual computing resources, including those of a GPU. The physical resources of the instance (server) are shared, but the virtual resources are not. You get access to the full amount of resources that you purchased.
After you purchase the GPU instance, it is up and running:
- Within 3–5 minutes if it is a virtual machine
- Within 15–20 minutes if it is a bare metal server