AI model selection simplified: your guide to Gcore-supported model selection

2024 has been an exceptional year for advancements in artificial intelligence (AI). The variety of models has grown significantly, with impressive strides in performance across domains. From text and image classification to text and image generation, speech recognition, and multimodal capabilities, businesses now face the challenge of navigating an ever-expanding catalog of open-source models. Understanding the differences in the tasks and metrics these models target is crucial to making informed decisions.

At Gcore, we’ve been expanding our model catalog to simplify AI model testing and deployment. As businesses scale their AI applications across various units, identifying the best model for specific tasks becomes critical. For example, some applications, like cancer screening, prioritize accuracy over latency. On the other hand, time-sensitive use cases like fraud detection demand rapid processing, while cost may drive decisions for lightweight applications like chatbot development.

This guide provides a comprehensive overview of the AI models supported on the Gcore platform, their characteristics, and their most effective use cases to help you choose the right model for your needs. Our inference solution also supports custom AI models.

Large language models (LLMs)

LLMs are foundational for applications requiring human-like understanding and generation of text, making them crucial for customer service, research, and educational tools. These models are versatile and cover a range of applications:

  • Text generation (e.g., creative writing, content creation)
  • Summarization
  • Question answering
  • Instruction following (specific to instruct-tuned models)
  • Sentiment analysis
  • Translation
  • Code generation and debugging (if fine-tuned for programming tasks)

Models supported by Gcore

Gcore supports the following models for inference, available in the Gcore Customer Portal. Activate them at the click of a button.

| Model name | Provider | Parameters | Key characteristics |
| --- | --- | --- | --- |
| LLaMA-Pro-8B | Meta AI | 8 Billion | Balanced trade-off between cost and power, suitable for real-time applications. |
| Llama-3.2-1B-Instruct | Meta AI | 1 Billion | Ideal for lightweight tasks with minimal computational needs. |
| Llama-3.2-3B-Instruct | Meta AI | 3 Billion | Offers lower latency for moderate task complexity. |
| Llama-3.1-8B-Instruct | Meta AI | 8 Billion | Optimized for instruction following. |
| Mistral-7B-Instruct-v0.3 | Mistral AI | 7 Billion | Excellent for nuanced instruction-based responses. |
| Mistral-Nemo-Instruct-2407 | Mistral AI & Nvidia | 12 Billion | High efficiency with robust instruction-following capabilities. |
| Qwen2.5-7B-Instruct | Qwen | 7 Billion | Excels in multilingual tasks and general-purpose applications. |
| QwQ-32B-Preview | Qwen | 32 Billion | Suited for complex, multi-turn conversations and strategic decision-making. |
| Marco-o1 | AIDC-AI | 1-5 Billion (est.) | Designed for structured and open-ended problem-solving tasks. |

Business applications

LLMs play a pivotal role in various business scenarios; choosing the right model will be primarily influenced by task complexity. For lightweight tasks like chatbot development and FAQ automation, models like Llama-3.2-1B-Instruct are highly effective. Medium complexity tasks, including document summarization and multilingual sentiment analysis, can leverage models like Llama-3.2-3B-Instruct and Qwen2.5-7B-Instruct. For high-performance needs like real-time customer service or healthcare diagnostics, models like LLaMA-Pro-8B and Mistral-Nemo-Instruct-2407 provide robust solutions. Complex, large-scale applications, like market forecasting and legal document synthesis, are ideally suited for advanced models like QwQ-32B-Preview. Additionally, specialized solutions for niche industries can benefit from Marco-o1’s unique capabilities.
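
To illustrate how these models are typically consumed once deployed, here is a minimal sketch that assumes the deployment exposes an OpenAI-compatible chat completions endpoint; the base URL, API key, and model identifier are placeholders you would replace with the values shown for your deployment in the Gcore Customer Portal.

```python
# Minimal sketch: querying a deployed instruct model through an
# OpenAI-compatible chat completions API. The base_url, api_key, and
# model name below are placeholders, not real Gcore values.
from openai import OpenAI

client = OpenAI(
    base_url="https://example-endpoint.example.com/v1",  # placeholder endpoint
    api_key="YOUR_API_KEY",                               # placeholder key
)

response = client.chat.completions.create(
    model="Llama-3.2-1B-Instruct",  # a lightweight choice for chatbot/FAQ tasks
    messages=[
        {"role": "system", "content": "You are a concise support assistant."},
        {"role": "user", "content": "How do I reset my account password?"},
    ],
    max_tokens=256,
    temperature=0.2,
)

print(response.choices[0].message.content)
```

Swapping the `model` field (and the endpoint it points to) is usually all it takes to move from a lightweight model to a larger one as task complexity grows.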

Image generation

Image generation models empower industries like entertainment, advertising, and e-commerce to create engaging content that captures the audience’s attention. These models excel in producing creative and high-quality visuals. Key tasks include:

  • Generating photorealistic images
  • Artistic rendering (e.g., illustrations, concept art)
  • Image enhancement (e.g., super-resolution, inpainting)
  • Marketing and branding visuals

Models supported by Gcore

We currently support six models via the Gcore Customer Portal, or you can bring your own image generation model to our inference platform.

| Model name | Provider | Parameters | Key characteristics |
| --- | --- | --- | --- |
| ByteDance/SDXL-Lightning | ByteDance | 100-400 Million | Lightning-fast text-to-image generation with 1024px outputs. |
| stable-cascade | Stability AI | 20 Million-3.6 Billion | Works on smaller latent spaces for faster and cheaper inference. |
| stable-diffusion-xl | Stability AI | ~3.5 Billion base + 1.2 Billion refinement | Photorealistic outputs with detailed composition. |
| stable-diffusion-3.5-large-turbo | Stability AI | 8 Billion | Balances high-quality outputs with faster inference. |
| FLUX.1-schnell | Black Forest Labs | 12 Billion | Designed for fast, local development. |
| FLUX.1-dev | Black Forest Labs | 12 Billion | Open-weight model for non-commercial applications. |

Business applications

In high-quality image generation, models like stable-diffusion-xl and stable-cascade are commonly employed for creating marketing visuals, concept art for gaming, and detailed e-commerce product visualizations. Real-time applications, such as AR/VR customizations and interactive customer tools, benefit from the speed of ByteDance/SDXL-Lightning and FLUX.1-schnell. FLUX.1-dev and stable-diffusion-3.5-large-turbo are excellent options for experimentation and development, allowing startups and enterprises to prototype generative AI workflows cost-effectively. Specialized use cases, such as ultra-high-quality visuals for luxury goods or architectural renders, also find tailored solutions with stable-cascade.
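
A quick way to evaluate image models before committing to a deployment is to prototype locally with the Hugging Face diffusers library. The sketch below uses the public stable-diffusion-xl base checkpoint; the prompt and generation settings are illustrative, and a CUDA-capable GPU is assumed.

```python
# Minimal prototyping sketch with Hugging Face diffusers and the public
# SDXL base checkpoint. Prompt and settings are illustrative only.
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
)
pipe.to("cuda")  # assumes a CUDA-capable GPU is available

image = pipe(
    prompt="studio photo of a leather backpack on a marble table, soft lighting",
    num_inference_steps=30,  # fewer steps trade quality for speed
    guidance_scale=7.0,      # how closely the output follows the prompt
).images[0]

image.save("product_visual.png")
```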

Speech recognition

Speech recognition models are essential for industries like media, healthcare, and education, where transcription accuracy and speed directly shape outcomes. They facilitate:

  • Accurate speech-to-text transcription
  • Low-latency live audio conversion
  • Multilingual speech processing and translation
  • Automated note-taking and content creation

Models supported by Gcore

At Gcore, our inference service supports two Whisper models, as well as custom speech recognition models.

| Model name | Provider | Parameters | Key characteristics |
| --- | --- | --- | --- |
| whisper-large-v3-turbo | OpenAI | 809 Million | Optimized for speed with minimal accuracy trade-offs. |
| whisper-large-v3 | OpenAI | 1.55 Billion | High-quality multilingual speech-to-text and translation with reduced error rates. |

Business applications

Speech recognition technology supports a wide range of business functions, all requiring precision and accuracy, delivered at speed. For real-time transcription, whisper-large-v3-turbo is ideal for live captioning and speech analytics applications. High-accuracy tasks, including legal transcription, academic research, and multilingual content localization, leverage the advanced capabilities of whisper-large-v3. These models enable faster, more accurate workflows in sectors where precise audio-to-text conversion is crucial.
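
As a sketch of how such a workflow looks in practice, the snippet below transcribes a local audio file with whisper-large-v3-turbo through the Hugging Face transformers pipeline; the file path is a placeholder and a GPU is assumed.

```python
# Minimal sketch: long-form transcription with whisper-large-v3-turbo via
# the transformers automatic-speech-recognition pipeline.
import torch
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-large-v3-turbo",
    torch_dtype=torch.float16,
    device="cuda:0",  # assumes a GPU; use "cpu" otherwise
)

result = asr(
    "meeting_recording.wav",  # placeholder audio file
    chunk_length_s=30,        # chunked long-form transcription
    return_timestamps=True,   # handy for captioning workflows
    generate_kwargs={"language": "english", "task": "transcribe"},
)

print(result["text"])
```

For multilingual localization work, setting `"task": "translate"` instead produces English text from non-English speech.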

Multimodal models

By bridging text, image, and other data modalities, multimodal models unlock innovative solutions for industries requiring complex data analysis. These models integrate diverse data types for applications in:

  • Image captioning
  • Visual question answering
  • Multilingual document processing
  • Robotic vision

Models supported by Gcore

We currently support the following multimodal models:

| Model name | Provider | Parameters | Key characteristics |
| --- | --- | --- | --- |
| Pixtral-12B-2409 | Mistral AI | 12 Billion | Excels in instruction-following tasks with text and image integration. |
| Qwen2-VL-7B-Instruct | Qwen | 7 Billion | Advanced visual understanding and multilingual support. |

Business applications

For tasks like image captioning and visual question answering, Pixtral-12B-2409 provides robust capabilities in generating descriptive text and answering questions based on visual content. Qwen2-VL-7B-Instruct supports document analysis and robotic vision, enabling systems to extract insights from documents or understand their physical surroundings. These applications are transformative for industries ranging from digital media to robotics.
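
The same OpenAI-style chat format used for text models extends to multimodal prompts by attaching image content to a message. The sketch below assumes an OpenAI-compatible vision endpoint; the base URL, API key, model identifier, and image URL are all placeholders.

```python
# Minimal sketch: visual question answering against an OpenAI-compatible
# multimodal chat endpoint. All identifiers and URLs are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="https://example-endpoint.example.com/v1",  # placeholder endpoint
    api_key="YOUR_API_KEY",                               # placeholder key
)

response = client.chat.completions.create(
    model="Pixtral-12B-2409",  # placeholder model identifier
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this product photo in one sentence."},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/product.jpg"},
                },
            ],
        }
    ],
    max_tokens=128,
)

print(response.choices[0].message.content)
```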

A multitude of models, supported by Gcore

Start developing on the Gcore platform today, leveraging top-tier GPUs for seamless AI model training and deployment. Simplify large-scale, cross-regional AI operations with our inference-at-the-edge solutions, backed by over a decade of CDN expertise.

Get started with Inference at the Edge today
