AI model selection simplified: your guide to Gcore-supported model selection

2024 has been an exceptional year for advancements in artificial intelligence (AI). The variety of models has grown significantly, with impressive strides in performance across domains. From text and image classification to text and image generation, speech recognition, and multimodal capabilities, businesses now face the challenge of navigating an ever-expanding catalog of open-source models. Understanding the differences in the tasks and metrics these models target is crucial to making informed decisions.

At Gcore, we’ve been expanding our model catalog to simplify AI model testing and deployment. As businesses scale their AI applications across various units, identifying the best model for specific tasks becomes critical. For example, some applications, like cancer screening, prioritize accuracy over latency. On the other hand, time-sensitive use cases like fraud detection demand rapid processing, while cost may drive decisions for lightweight applications like chatbot development.

This guide provides a comprehensive overview of the AI models supported on the Gcore platform, their characteristics, and their most effective use cases to help you choose the right model for your needs. Our inference solution also supports custom AI models.

Large language models (LLMs)

LLMs are foundational for applications requiring human-like understanding and generation of text, making them crucial for customer service, research, and educational tools. These models are versatile and cover a range of applications:

  • Text generation (e.g., creative writing, content creation)
  • Summarization
  • Question answering
  • Instruction following (specific to instruct-tuned models)
  • Sentiment analysis
  • Translation
  • Code generation and debugging (if fine-tuned for programming tasks)

Models supported by Gcore

Gcore supports the following models for inference, available in the Gcore Customer Portal. Activate them at the click of a button.

| Model name | Provider | Parameters | Key characteristics |
| --- | --- | --- | --- |
| LLaMA-Pro-8B | Meta AI | 8 Billion | Balanced trade-off between cost and power, suitable for real-time applications. |
| Llama-3.2-1B-Instruct | Meta AI | 1 Billion | Ideal for lightweight tasks with minimal computational needs. |
| Llama-3.2-3B-Instruct | Meta AI | 3 Billion | Offers lower latency for moderate task complexity. |
| Llama-3.1-8B-Instruct | Meta AI | 8 Billion | Optimized for instruction following. |
| Mistral-7B-Instruct-v0.3 | Mistral AI | 7 Billion | Excellent for nuanced instruction-based responses. |
| Mistral-Nemo-Instruct-2407 | Mistral AI & Nvidia | 12 Billion | High efficiency with robust instruction-following capabilities. |
| Qwen2.5-7B-Instruct | Qwen | 7 Billion | Excels in multilingual tasks and general-purpose applications. |
| QwQ-32B-Preview | Qwen | 32 Billion | Suited for complex, multi-turn conversations and strategic decision-making. |
| Marco-o1 | AIDC-AI | 1-5 Billion (est.) | Designed for structured and open-ended problem-solving tasks. |

Business applications

LLMs play a pivotal role in various business scenarios; choosing the right model will be primarily influenced by task complexity. For lightweight tasks like chatbot development and FAQ automation, models like Llama-3.2-1B-Instruct are highly effective. Medium complexity tasks, including document summarization and multilingual sentiment analysis, can leverage models like Llama-3.2-3B-Instruct and Qwen2.5-7B-Instruct. For high-performance needs like real-time customer service or healthcare diagnostics, models like LLaMA-Pro-8B and Mistral-Nemo-Instruct-2407 provide robust solutions. Complex, large-scale applications, like market forecasting and legal document synthesis, are ideally suited for advanced models like QwQ-32B-Preview. Additionally, specialized solutions for niche industries can benefit from Marco-o1’s unique capabilities.
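
To illustrate how these models are typically consumed once deployed, here is a minimal sketch that assumes the deployment exposes an OpenAI-compatible chat completions endpoint; the base URL, API key, and model identifier are placeholders you would replace with the values shown for your deployment in the Gcore Customer Portal.

```python
# Minimal sketch: querying a deployed instruct model through an
# OpenAI-compatible chat completions API. The base_url, api_key, and
# model name below are placeholders, not real Gcore values.
from openai import OpenAI

client = OpenAI(
    base_url="https://example-endpoint.example.com/v1",  # placeholder endpoint
    api_key="YOUR_API_KEY",                               # placeholder key
)

response = client.chat.completions.create(
    model="Llama-3.2-1B-Instruct",  # a lightweight choice for chatbot/FAQ tasks
    messages=[
        {"role": "system", "content": "You are a concise support assistant."},
        {"role": "user", "content": "How do I reset my account password?"},
    ],
    max_tokens=256,
    temperature=0.2,
)

print(response.choices[0].message.content)
```

Swapping the `model` field (and the endpoint it points to) is usually all it takes to move from a lightweight model to a larger one as task complexity grows.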

Image generation

Image generation models empower industries like entertainment, advertising, and e-commerce to create engaging content that captures the audience’s attention. These models excel in producing creative and high-quality visuals. Key tasks include:

  • Generating photorealistic images
  • Artistic rendering (e.g., illustrations, concept art)
  • Image enhancement (e.g., super-resolution, inpainting)
  • Marketing and branding visuals

Models supported by Gcore

We currently support six models via the Gcore Customer Portal, or you can bring your own image generation model to our inference platform.

| Model name | Provider | Parameters | Key characteristics |
| --- | --- | --- | --- |
| ByteDance/SDXL-Lightning | ByteDance | 100-400 Million | Lightning-fast text-to-image generation with 1024px outputs. |
| stable-cascade | Stability AI | 20 Million-3.6 Billion | Works on smaller latent spaces for faster and cheaper inference. |
| stable-diffusion-xl | Stability AI | ~3.5 Billion base + 1.2 Billion refinement | Photorealistic outputs with detailed composition. |
| stable-diffusion-3.5-large-turbo | Stability AI | 8 Billion | Balances high-quality outputs with faster inference. |
| FLUX.1-schnell | Black Forest Labs | 12 Billion | Designed for fast, local development. |
| FLUX.1-dev | Black Forest Labs | 12 Billion | Open-weight model for non-commercial applications. |

Business applications

In high-quality image generation, models like stable-diffusion-xl and stable-cascade are commonly employed for creating marketing visuals, concept art for gaming, and detailed e-commerce product visualizations. Real-time applications, such as AR/VR customizations and interactive customer tools, benefit from the speed of ByteDance/SDXL-Lightning and FLUX.1-schnell. FLUX.1-dev and stable-diffusion-3.5-large-turbo are excellent options for experimentation and development, allowing startups and enterprises to prototype generative AI workflows cost-effectively. Specialized use cases, such as ultra-high-quality visuals for luxury goods or architectural renders, also find tailored solutions with stable-cascade.
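
A quick way to evaluate image models before committing to a deployment is to prototype locally with the Hugging Face diffusers library. The sketch below uses the public stable-diffusion-xl base checkpoint; the prompt and generation settings are illustrative, and a CUDA-capable GPU is assumed.

```python
# Minimal prototyping sketch with Hugging Face diffusers and the public
# SDXL base checkpoint. Prompt and settings are illustrative only.
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
)
pipe.to("cuda")  # assumes a CUDA-capable GPU is available

image = pipe(
    prompt="studio photo of a leather backpack on a marble table, soft lighting",
    num_inference_steps=30,  # fewer steps trade quality for speed
    guidance_scale=7.0,      # how closely the output follows the prompt
).images[0]

image.save("product_visual.png")
```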

Speech recognition

Speech recognition models are essential for industries like media, healthcare, and education, where transcription accuracy and speed directly shape outcomes. They facilitate:

  • Accurate speech-to-text transcription
  • Low-latency live audio conversion
  • Multilingual speech processing and translation
  • Automated note-taking and content creation

Models supported by Gcore

At Gcore, our inference service supports two Whisper models, as well as custom speech recognition models.

| Model name | Provider | Parameters | Key characteristics |
| --- | --- | --- | --- |
| whisper-large-v3-turbo | OpenAI | 809 Million | Optimized for speed with minimal accuracy trade-offs. |
| whisper-large-v3 | OpenAI | 1.55 Billion | High-quality multilingual speech-to-text and translation with reduced error rates. |

Business applications

Speech recognition technology supports a wide range of business functions, all requiring precision and accuracy, delivered at speed. For real-time transcription, whisper-large-v3-turbo is ideal for live captioning and speech analytics applications. High-accuracy tasks, including legal transcription, academic research, and multilingual content localization, leverage the advanced capabilities of whisper-large-v3. These models enable faster, more accurate workflows in sectors where precise audio-to-text conversion is crucial.
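
As a sketch of how such a workflow looks in practice, the snippet below transcribes a local audio file with whisper-large-v3-turbo through the Hugging Face transformers pipeline; the file path is a placeholder and a GPU is assumed.

```python
# Minimal sketch: long-form transcription with whisper-large-v3-turbo via
# the transformers automatic-speech-recognition pipeline.
import torch
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-large-v3-turbo",
    torch_dtype=torch.float16,
    device="cuda:0",  # assumes a GPU; use "cpu" otherwise
)

result = asr(
    "meeting_recording.wav",  # placeholder audio file
    chunk_length_s=30,        # chunked long-form transcription
    return_timestamps=True,   # handy for captioning workflows
    generate_kwargs={"language": "english", "task": "transcribe"},
)

print(result["text"])
```

For multilingual localization work, setting `"task": "translate"` instead produces English text from non-English speech.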

Multimodal models

By bridging text, image, and other data modalities, multimodal models unlock innovative solutions for industries requiring complex data analysis. These models integrate diverse data types for applications in:

  • Image captioning
  • Visual question answering
  • Multilingual document processing
  • Robotic vision

Models supported by Gcore

We currently support the following multimodal models:

| Model name | Provider | Parameters | Key characteristics |
| --- | --- | --- | --- |
| Pixtral-12B-2409 | Mistral AI | 12 Billion | Excels in instruction-following tasks with text and image integration. |
| Qwen2-VL-7B-Instruct | Qwen | 7 Billion | Advanced visual understanding and multilingual support. |

Business applications

For tasks like image captioning and visual question answering, Pixtral-12B-2409 provides robust capabilities in generating descriptive text and answering questions based on visual content. Qwen2-VL-7B-Instruct supports document analysis and robotic vision, enabling systems to extract insights from documents or understand their physical surroundings. These applications are transformative for industries ranging from digital media to robotics.
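
The same OpenAI-style chat format used for text models extends to multimodal prompts by attaching image content to a message. The sketch below assumes an OpenAI-compatible vision endpoint; the base URL, API key, model identifier, and image URL are all placeholders.

```python
# Minimal sketch: visual question answering against an OpenAI-compatible
# multimodal chat endpoint. All identifiers and URLs are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="https://example-endpoint.example.com/v1",  # placeholder endpoint
    api_key="YOUR_API_KEY",                               # placeholder key
)

response = client.chat.completions.create(
    model="Pixtral-12B-2409",  # placeholder model identifier
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this product photo in one sentence."},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/product.jpg"},
                },
            ],
        }
    ],
    max_tokens=128,
)

print(response.choices[0].message.content)
```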

A multitude of models, supported by Gcore

Start developing on the Gcore platform today, leveraging top-tier GPUs for seamless AI model training and deployment. Simplify large-scale, cross-regional AI operations with our inference-at-the-edge solutions, backed by over a decade of CDN expertise.

Get started with Inference at the Edge today
