Generative AI: The Future of Creativity, Powered by IPU and GPU

  • By Gcore
  • September 18, 2023
  • 8 min read

In this article, we explore how Intelligence Processing Units (IPUs) and graphics processing units (GPUs) drive the rapid evolution of generative AI. You’ll learn how generative AI works, how IPUs and GPUs support its development, and what matters when choosing AI infrastructure. You’ll also see generative AI projects built by Gcore.

What Is Generative AI?

Generative AI, or GenAI, is artificial intelligence that can generate content in response to users’ prompts. The content types generated include text, images, audio, video, and code. The goal is for the generated content to be human-like, suitable for practical use, and to correspond with the prompt as much as possible. GenAI is trained by learning patterns and structures from input data and then utilizing that knowledge to generate new and unique outputs.

Here are a few examples of the GenAI tools with which you may be familiar:

  • ChatGPT is an AI chatbot that can communicate with humans and write high-quality text and code. It has been taught using vast quantities of data available on the internet.
  • DALL-E 2 is an AI image generator that can create images from text descriptions. DALL-E 2 has been trained on a large set of images and text, producing images that look lifelike and attractive.
  • Whisper is a speech-to-text AI system that can identify, translate, and transcribe 57 languages (a number that continues to grow). It has been trained on 680,000 hours of multilingual data. This is a GenAI example in which accuracy is more important than creativity.

GenAI has potential applications in various fields. According to a 2023 McKinsey survey covering different industries, marketing and sales, product and service development, and service operations are the most commonly reported uses of GenAI this year.

Popular Generative AI Tools

The table below shows examples of different Generative AI tools: chatbots, text-to-image generators, text-to-video generators, speech-to-text generators, and text-to-code generators. Some of them are already mature whereas others are still in beta testing (as marked on the table) but look promising.

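| GenAI type | Application | Engines/Models | Access | Developer |
|---|---|---|---|---|
| Chatbots | ChatGPT | GPT-3.5, GPT-4 | Free, paid | OpenAI |
| Chatbots | Bard (Beta) | LaMDA | Free | Google |
| Chatbots | Bing Chat | GPT-4 | Free | Microsoft |
| Text-to-image generators | DALL-E 2 (Beta) | GPT-3, CLIP | Free | OpenAI |
| Text-to-image generators | Midjourney (Beta) | LLM | Paid | Midjourney |
| Text-to-image generators | Stable Diffusion | LDM, CLIP | Free | Stability AI |
| Text-to-video generators | Pika Labs (Beta) | Unknown | Free | Pika Labs |
| Text-to-video generators | Gen-2 | LDM | Paid | Runway |
| Text-to-video generators | Imagen Video (Beta) | CDM, U-Net | N/A | Google |
| Speech-to-text generators | Whisper | Custom GPT | Free | OpenAI |
| Speech-to-text generators | Google Cloud Speech-to-Text | Conformer Speech Model technology | Paid | Google |
| Speech-to-text generators | Deepgram | Custom LLM | Paid | Deepgram |
| Text-to-code generators | GitHub Copilot | OpenAI Codex | Paid | GitHub, OpenAI |
| Text-to-code generators | Amazon CodeWhisperer | Unknown | Free, paid | Amazon |
| Text-to-code generators | ChatGPT | GPT-3.5, GPT-4 | Free, paid | OpenAI |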

These GenAI tools require specialized AI infrastructure, such as servers with IPU and GPU modules, to train and function. We will discuss IPUs and GPUs later. First, let’s understand how GenAI works at a high level.

How Does Generative AI Work?

A GenAI system learns structures and patterns from a given dataset of similar content, such as massive amounts of text, photos, or music. For example, ChatGPT was trained on 570 GB of data from books, websites, research articles, and other forms of content available on the internet. According to ChatGPT itself, this is the equivalent of approximately 389,120 full-length eBooks in ePub format. Using that knowledge, the GenAI system then creates new and unique results. Here is a simplified illustration of this process:

Figure 1: A simplified process of how GenAI works

Let’s look at two key phases of how GenAI works: training GenAI on real data and generating new data.

Training on Real Data

To learn patterns and structures, GenAI systems use various machine learning and deep learning techniques, most commonly neural networks. A neural network is an algorithm loosely modeled on the human brain: a system of interconnected nodes that learns to process information by adjusting the weights of the connections between them. Two well-known types of neural network used in GenAI are GANs and VAEs.
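
To make "changing the weights of the connections" concrete, here is a minimal sketch of a single artificial neuron learning a straight line by gradient descent. The target function (y = 2x + 1), the learning rate, and the function name `train_neuron` are illustrative assumptions, not something taken from a real GenAI system, which would have many millions of such weights.

```python
# Toy example: one "neuron" with a single weight and bias learns y = 2x + 1.
# All values here are hypothetical and chosen only for illustration.
def train_neuron(samples, learning_rate=0.05, epochs=2000):
    weight, bias = 0.0, 0.0
    for _ in range(epochs):
        for x, target in samples:
            prediction = weight * x + bias
            error = prediction - target
            # Gradient of the squared error 0.5 * error**2
            # with respect to the weight and the bias.
            weight -= learning_rate * error * x
            bias -= learning_rate * error
    return weight, bias


samples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
weight, bias = train_neuron(samples)
print(f"weight={weight:.3f}, bias={bias:.3f}")
```

After training, the weight and bias should end up close to 2 and 1: the neuron has "learned" the pattern in the data purely by nudging its parameters to reduce the error.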

Generative adversarial networks (GANs)

Generative adversarial networks (GANs) are a popular type of neural network used for GenAI training. They have long been used for image generation; NVIDIA’s StyleGAN family is a well-known example.

GANs operate by setting two neural networks against one another:

  • The generator produces new data based on the given real data set.
  • The discriminator determines whether the newly generated data is genuine or artificially generated, i.e., fake.

The generator tries to fool the discriminator. The ultimate goal is to generate data that the discriminator can’t distinguish from real data.
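
The tug-of-war between the two networks can be sketched through their loss functions. The snippet below is a simplified illustration, assuming the discriminator outputs a probability between 0 and 1 that a sample is real, and using the commonly used non-saturating form of the generator loss; the function names are hypothetical and the networks themselves are left out.

```python
import math


def discriminator_loss(real_scores, fake_scores):
    """Binary cross-entropy: low when real samples score high and fakes score low."""
    real_term = sum(-math.log(s) for s in real_scores) / len(real_scores)
    fake_term = sum(-math.log(1.0 - s) for s in fake_scores) / len(fake_scores)
    return real_term + fake_term


def generator_loss(fake_scores):
    """Non-saturating generator loss: low when the discriminator is fooled."""
    return sum(-math.log(s) for s in fake_scores) / len(fake_scores)


# Early in training (hypothetical scores): fakes are easy to spot.
print(generator_loss([0.1, 0.2]), discriminator_loss([0.9, 0.8], [0.1, 0.2]))
# Later: the discriminator can no longer tell real from fake.
print(generator_loss([0.5, 0.5]), discriminator_loss([0.5, 0.5], [0.5, 0.5]))
```

When the discriminator scores every sample at 0.5, it is effectively guessing, which corresponds to the stated goal: generated data that it cannot distinguish from real data.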

Variational autoencoders (VAEs)

Variational autoencoders (VAEs) are another well-known type of neural network used for image, text, music, and other content generation. The image generator Stable Diffusion, for example, uses a VAE to move images into and out of the latent space in which its diffusion model works.

VAEs consist of two neural networks:

  • The encoder receives training data, such as a photo, and maps it to a latent space. Latent space is a lower dimensional representation of the data that captures the essential features of the input data.
  • The decoder analyzes the latent space and generates a new data sample, e.g., a photo imitation.
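
As a rough sketch of what happens between the encoder and the decoder, the snippet below assumes the standard VAE formulation, in which the encoder outputs a mean and a log-variance for each latent dimension. It shows how a latent point is sampled and how far the encoded distribution is from a standard normal. The function names are hypothetical, and the encoder and decoder networks themselves are omitted.

```python
import math
import random


def sample_latent(mean, log_var, rng=random):
    """Reparameterization trick: z = mean + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mean, log_var)]


def kl_to_standard_normal(mean, log_var):
    """KL divergence between N(mean, sigma^2) and N(0, 1), summed over dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv
                     for m, lv in zip(mean, log_var))


# Hypothetical encoder output for one input, in a 3-dimensional latent space.
mean = [0.2, -1.0, 0.5]
log_var = [-0.5, 0.0, -1.0]
z = sample_latent(mean, log_var)
print("latent sample:", z)
print("KL penalty:", kl_to_standard_normal(mean, log_var))
```

The random sampling step is what makes VAEs probabilistic: the same input can map to slightly different latent points, and the decoder turns each of them into a different output.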

Comparing GANs and VAEs

Here are the basic differences between VAEs and GANs:

  • VAEs are probabilistic models, meaning they can generate new data that is more diverse than the data GANs generate.
  • VAEs are easier to train but don’t generally produce as high-quality images as GANs. GANs can be more difficult to work with but produce better photo-realistic images.
  • VAEs work better for signal processing use cases, such as anomaly detection for predictive maintenance or security analytics applications, while GANs are better at generating multimedia.

To get more efficient AI models, developers often train them using combinations of different neural networks. The entire training process can take minutes to months, depending on your goals, dataset, and resources.

Generating New Data

Once a generative AI tool has completed its training, it can generate new data; this stage is called inference. A user enters a prompt to generate the content, such as an image, a video, or a text. The GenAI system produces new data according to the user’s prompt.

For the most relevant results, it is ideal to train generative AI systems with a focus on a particular area. As a crude example, if you want a GenAI system to produce high-quality images of kangaroos, it’s best to train the system on images of kangaroos rather than on all existing animals. That’s why gathering relevant data to train AI models is one of the key challenges. This requires the tight collaboration of subject matter experts and data scientists.

How IPU and GPU Help to Develop Generative AI

There are two primary options when it comes to how you develop a generative AI system. You can utilize a prebuilt AI model and fine-tune it to your needs, or embark on the ambitious journey of training an AI model from the ground up. Regardless of your approach, access to AI infrastructure—IPU and GPU servers—is indispensable. There are two main reasons for this:

  • GPU and IPU architectures are adapted for AI workloads
  • GPU and IPU are available in the Cloud

Adapted Architecture

Intelligence Processing Units (IPUs) and graphics processing units (GPUs) are specialized hardware designed to accelerate the training and inference of AI models, including models for GenAI training. Their main advantage is that each IPU or GPU module has thousands of cores simultaneously processing data. This makes them ideal for parallel computing, essential in AI training.

As a result, GPUs and IPUs are usually better deep learning accelerators than CPUs, which are suited to sequential tasks rather than massively parallel processing. While a server CPU can have a maximum of 128 cores, a processor in the IPU, for example, has 1,472 cores.

Here are the basic differences between GPUs and IPUs:

  • GPUs were initially designed for graphics processing, but their efficient parallel computation capabilities also make them well-suited for AI workloads. GPUs are a strong choice for both training ML models and running inference. There are several AI-focused GPU hardware vendors on the market, but the clear leader is NVIDIA.
  • IPUs are a newer type of hardware designed specifically for AI workloads. They are even more efficient than GPUs at performing parallel computations. IPUs are ideal for training and deploying the most sophisticated AI applications, like large language models (LLMs). Graphcore is the developer and sole vendor of IPUs, but there are some providers, like Gcore, that offer Graphcore IPUs in the cloud.

Availability in the Cloud

Typically, even enterprise-level AI developers don’t buy physical IPU/GPU servers because they are extremely expensive, costing up to $270,000. Instead, developers rent virtual and bare metal IPU/GPU instances from cloud providers on a per-minute or per-hour basis. This is also more convenient because AI training is an iterative process. When you need to run the next training iteration, you rent a server or virtual machine and pay only for the time you actually use it. The same applies to deploying a trained GenAI system for user access: You’ll need the parallel processing capabilities of IPUs/GPUs for better inference speed when generating new data, so you have to either buy or rent this infrastructure.
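
To put the rent-or-buy decision in numbers, here is a back-of-the-envelope sketch. The $270,000 purchase price comes from the paragraph above; the $10 per hour rental rate is purely hypothetical, and the calculation ignores power, cooling, maintenance, and resale value.

```python
def break_even_hours(purchase_price_usd, hourly_rental_usd):
    """Hours of rental after which buying would have cost the same."""
    return purchase_price_usd / hourly_rental_usd


# $270,000 is the figure quoted above; $10/hour is a hypothetical rental rate.
hours = break_even_hours(270_000, 10.0)
years = hours / 24 / 365
print(f"{hours:,.0f} rental hours, or about {years:.1f} years of continuous use")
```

Under these assumptions, renting only becomes more expensive than buying after roughly three years of round-the-clock use, which is why pay-as-you-go pricing fits iterative training runs well.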

What’s Important When Choosing AI Infrastructure?

When choosing AI infrastructure, you should consider which type of AI accelerator better suits your needs in terms of performance and cost.

GPUs are usually an easier way to train models since many popular frameworks are adapted for GPUs, including PyTorch, TensorFlow, and PaddlePaddle. NVIDIA also offers CUDA for its GPUs, a parallel computing platform and programming model that works with languages widely used in AI development, like C and C++. As a result, GPUs are more suitable if you don’t have deep knowledge of AI training and fine-tuning, and want to get results faster using prebuilt AI models.

IPUs are better than GPUs for complex AI training tasks because they were designed specifically for AI workloads, whereas GPUs were originally designed for tasks like video rendering. However, because they are newer, IPUs support fewer prebuilt AI frameworks out of the box than GPUs. When you are trying to perform a novel AI training task and therefore don’t have a prebuilt framework, you need to adapt an AI framework or AI model and even write code from scratch to run it. All of this requires technical expertise. However, Graphcore is actively developing SDKs and instructions to ease the use of its hardware.

Graphcore’s IPUs also support packing, a technique that significantly reduces the time required to pre-train, fine-tune, and infer from LLMs. Below is an example of how IPUs outperform GPUs in inference for a large language model based on the BERT architecture when using packing.

Figure 2: IPU outperforms GPU in inference for a BERT-flavored LLM when using packing
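
The idea behind packing can be sketched in a few lines: instead of padding every short input up to the model’s maximum sequence length, several short sequences are combined into one row. The snippet below is a simplified illustration using a greedy first-fit-decreasing strategy, not Graphcore’s actual implementation; the function name and the sequence lengths are hypothetical. A real implementation would also need to adjust the attention mask so that packed sequences do not attend to each other.

```python
def pack_sequences(lengths, max_len):
    """Greedy first-fit-decreasing packing of sequence lengths into rows of max_len."""
    packs = []  # each pack is a list of sequence lengths sharing one row
    for length in sorted(lengths, reverse=True):
        if length > max_len:
            raise ValueError(f"sequence of length {length} exceeds max_len={max_len}")
        for pack in packs:
            if sum(pack) + length <= max_len:
                pack.append(length)  # fits into an existing row
                break
        else:
            packs.append([length])  # no row had room, so start a new one
    return packs


# Hypothetical token counts for eight inputs and a 512-token model limit.
lengths = [120, 400, 60, 300, 90, 200, 30, 512]
packs = pack_sequences(lengths, max_len=512)
print(f"rows without packing: {len(lengths)}, rows with packing: {len(packs)}")
print(packs)
```

Fewer rows means fewer padding tokens processed by the accelerator, which is where the time savings come from.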

Cost-effectiveness is another important consideration when choosing an AI infrastructure. Look for benchmarks that compare AI accelerators in terms of performance per dollar/euro. This can help you to identify efficient choices by finding the right balance between price and compute power, and could save you a lot of money if you plan a long-term project.

Understanding the potential costs of renting AI infrastructure helps you to plan your budget correctly. Research the prices of cloud providers and calculate how much a specific server with a particular configuration will cost you per minute, hour, day, and so on. For more accurate calculations, you need to know the approximate time you’ll need to spend on training. This requires some mathematical effort, especially if you’re developing a GenAI model from scratch. To estimate the training time, you can count the number of operations needed or look at the GPU time.
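
As a rough illustration of the "count the number of operations" approach, the sketch below uses a commonly cited rule of thumb for dense transformer models: training takes about 6 floating-point operations per parameter per training token. This rule, the function name, and every number in the example are assumptions for illustration only; real throughput depends heavily on the model, the software stack, and the hardware.

```python
def estimate_training(params, tokens, peak_flops, utilization,
                      num_devices, hourly_price_usd):
    """Return (wall-clock hours, total rental cost in USD) for one training run."""
    total_flops = 6 * params * tokens  # rule-of-thumb estimate, not exact
    effective_flops_per_sec = peak_flops * utilization * num_devices
    hours = total_flops / effective_flops_per_sec / 3600
    cost = hours * num_devices * hourly_price_usd
    return hours, cost


# All values below are hypothetical.
hours, cost = estimate_training(
    params=1e9,            # 1 billion parameters
    tokens=20e9,           # 20 billion training tokens
    peak_flops=100e12,     # 100 TFLOPS peak per accelerator
    utilization=0.4,       # fraction of peak actually achieved
    num_devices=8,
    hourly_price_usd=2.0,  # rental price per accelerator per hour
)
print(f"about {hours:.0f} hours, about ${cost:,.0f}")
```

Plugging in a provider’s actual prices and a measured utilization figure turns this into a first budget estimate that you can refine after a short trial run.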

Our Generative AI Projects

Gcore’s GenAI projects offer powerful examples of the fine-tuning approach to AI training, using IPU infrastructure.

English to Luxembourgish Translation Service

Gcore’s speech-to-text AI service translates English speech into Luxembourgish text on the fly. The tool is based on the Whisper neural network and has been fine-tuned by our AI developers.

Figure 3: The UI of Gcore’s speech-to-text AI service

The project is an example of fine-tuning an existing speech-to-text GenAI model when it doesn’t support a specific language. The base version of Whisper didn’t support Luxembourgish, so our developers had to train the model to help Whisper learn this skill. A GenAI tool with any local or rare language not supported by existing LLMs could be created in the same way.

AI Image Generator

AI Image Generator is a generative AI tool that is free for all users registered on the Gcore Platform. It takes your text prompts and creates images in different styles. To develop the Image Generator, we used the prebuilt Openjourney GenAI model. We fine-tuned it using datasets for specific areas, such as gaming, to extend its capabilities and generate a wider range of images. Like our speech-to-text service, the Image Generator is powered by Gcore’s AI IPU infrastructure.

Figure 4: Image examples generated by Gcore’s AI Image Generator

The AI Image Generator is an example of how GenAI models like Openjourney can be customized to generate data with the style and context you need. The main problem with a pretrained model is that it is typically trained on large datasets and may lack accuracy when you need more specific results, like a highly specific stylization. If the prebuilt model doesn’t produce content that matches your expectations, you can collect a more relevant dataset and train your model to get more accurate results, which is what we did at Gcore. This approach can save significant time and resources, as it doesn’t require training the model from scratch.

Future Gcore AI Projects

Here’s what’s in the works for Gcore AI:

  • Custom AI model tuning will help to develop AI models for different purposes and projects. A customer can provide their dataset to train a model for their specific goal. For example, you’ll be able to generate graphics and illustrations according to the company’s guidelines, which can reduce the burden on designers.
  • An AI model marketplace will provide ready-made AI models and frameworks in Gcore Cloud, similar to how our Cloud App Marketplace provides prebuilt cloud applications. Customers will be able to deploy these AI models on Virtual Instances or Bare Metal servers with GPU and IPU modules and either use these models as they are or fine-tune them for specific use cases.

Conclusion

IPUs and GPUs are fundamental to parallel processing, neural network training, and inference. This makes such infrastructure essential for generative AI development. However, GenAI developers need to have a clear understanding of their training goals. This will allow them to utilize the AI infrastructure properly, achieving maximum efficiency and best use of resources.


Related articles

From budget strain to AI gain: Watch how studios are building smarter with AI

Game development is in a pressure cooker. Budgets are ballooning, infrastructure and labor costs are rising, and players expect more complexity and polish with every release. All studios, from the major AAAs to smaller indies, are feeling the strain.But there is a way forward. In a recent webinar, Sean Hammond, Territory Manager for the UK and Nordics at Gcore, explained how AI is reshaping game development workflows and how the right infrastructure strategy can reduce costs, speed up production, and create better player experiences.Scroll on to watch key moments from Sean's talk and explore how studios can make AI work for them.Rising costs are threatening game developmentGame revenue has slowed, but development costs continue to rise. Some AAA titles now surpass $100 million in development budgets. The complexity of modern games demands more powerful servers, scalable infrastructure, and larger teams, making the industry increasingly unsustainable.Personnel and infrastructure costs are also climbing. Developers, artists, and QA testers with specialized skills are in high demand, as are technologies like VR, AR, and AI. Studios are also having to invest more in cybersecurity to protect player data, detect cheating, and safeguard in-game economies.AI is revolutionizing GameDev, even without a perfect use caseWhile the perfect use case for AI in gaming may not have been found yet, it’s already transforming how games are built, tested, and personalized.Sean highlighted emerging applications, including:Smarter QA testingAI-driven player personalizationReal-time motion and animationAccelerated environment and character designMultilingual localizationAdaptive game balancingStudios are already applying these technologies to reduce production timelines and improve immersion.The challenge of secure, scalable AI adoptionOf course, AI adoption doesn’t come without its challenges. Chief among them is security. 
Public models pose risks: no studio wants their proprietary assets to end up training a competitor’s model.The solution? Deploy AI models on infrastructure you trust so you’re in complete control. That’s where Gcore comes in.Gcore Everywhere Inference reduces compute costs and infrastructure bloat by allowing you to deploy only what you need, where you need it.The future of gaming is AI at scaleTo power real-time player experiences, your studio needs to deploy AI globally, close to your users.Gcore Everywhere Inference lets you deploy models worldwide at the edge with minimal latency because data is not routed back to central servers. This means fast, responsive gameplay and a new generation of real-time, AI-driven features.As a company originally built by gamers, we’ve developed AI solutions with gaming studios in mind. Here’s what we offer:Global edge inference for real-time gameplay: Deploy your AI models close to players worldwide, enabling fast, responsive player experiences without routing data to central servers.Full control over AI model deployment and IP protection: Avoid public APIs and retain full ownership of your assets with on-prem options, preventing your proprietary data from being available to competitors.Scalable, cost-efficient infrastructure tailored to gaming workloads: Deploy only what you need to avoid overprovisioning and reduce compute costs without sacrificing performance.Enhanced player retention through AI-driven personalization and matchmaking: Real-time inference powers smarter NPCs and dynamic matchmaking, improving engagement and keeping players coming back for more.Deploy models in 3 clicks and under 10 seconds: Our developer-friendly platform lets you go from trained model to global deployment in seconds. No complex DevOps setup required.Final takeawayAI is advancing game development fast, but only if it’s deployed right. 
Gcore offers scalable, secure, and cost-efficient AI infrastructure that helps studios create smarter, faster, and more immersive games.Want to see how it works? Deploy your first model in just a few clicks.Check out our blog on how AI is transforming gaming in 2025

How AI-enhanced content moderation is powering safe and compliant streaming

How AI-enhanced content moderation is powering safe and compliant streaming

As streaming experiences a global boom across platforms, regions, and industries, providers face a growing challenge: how to deliver safe, respectful, and compliant content delivery at scale. Viewer expectations have never been higher, likewise the regulatory demands and reputational risks.Live content in particular leaves little room for error. A single offensive comment, inappropriate image, or misinformation segment can cause long-term damage in seconds.Moderation has always been part of the streaming conversation, but tools and strategies are evolving rapidly. AI-powered content moderation is helping providers meet their safety obligations while preserving viewer experience and platform performance.In this article, we explore how AI content moderation works, where it delivers value, and why streaming platforms are adopting it to stay ahead of both audience expectations and regulatory pressures.Real-time problems require real-time solutionsHuman moderators can provide accuracy and context, but they can’t match the scale or speed of modern streaming environments. Live streams often involve thousands of viewers interacting at once, with content being generated every second through audio, video, chat, or on-screen graphics.Manual review systems struggle to keep up with this pace. In some cases, content can go viral before it is flagged, like deepfakes that circulated on Facebook leading up to the 2025 Canadian election. In others, delays in moderation result in regulatory penalties or customer churn, like X’s 2025 fine under the EU Digital Services Act for shortcomings in content moderation and algorithm transparency. This has created a demand for scalable solutions that act instantly, with minimal human intervention.AI-enhanced content moderation platforms address this gap. These systems are trained to identify and filter harmful or non-compliant material as it is being streamed or uploaded. 
They operate across multiple modalities—video frames, audio tracks, text inputs—and can flag or remove content within milliseconds of detection. The result is a safer environment for end users.How AI moderation systems workModern AI moderation platforms are powered by machine learning algorithms trained on extensive datasets. These datasets include a wide variety of content types, languages, accents, dialects, and contexts. By analyzing this data, the system learns to identify content that violates platform policies or legal regulations.The process typically involves three stages:Input capture: The system monitors live or uploaded content across audio, video, and text layers.Pattern recognition: It uses models to identify offensive content, including nudity, violence, hate speech, misinformation, or abusive language.Contextual decision-making: Based on confidence thresholds and platform rules, the system flags, blocks, or escalates the content for review.This process is continuous and self-improving. As the system receives more inputs and feedback, it adapts to new forms of expression, regional trends, and platform-specific norms.What makes this especially valuable for streaming platforms is its low latency. Content can be flagged and removed in real time, often before viewers even notice. This is critical in high-stakes environments like esports, corporate webinars, or public broadcasts.Multi-language moderation and global streamingStreaming audiences today are truly global. Content crosses borders faster than ever, but moderation standards and cultural norms do not. What’s considered acceptable in one region may be flagged as offensive in another. A word that is considered inappropriate in one language might be completely neutral in another. A piece of nudity in an educational context may be acceptable, while the same image in another setting may not be. 
Without the ability to understand nuance, AI systems risk either over-filtering or letting harmful content through.That’s why high-quality moderation platforms are designed to incorporate context into their models. This includes:Understanding tone, not just keywordsRecognizing culturally specific gestures or idiomsAdapting to evolving slang or coded languageApplying different standards depending on content type or target audienceThis enables more accurate detection of harmful material and avoids false positives caused by mistranslation.Training AI models for multi-language support involves:Gathering large, representative datasets in each languageTeaching the model to detect content-specific risks (e.g., slurs or threats) in the right cultural contextContinuously updating the model as language evolvesThis capability is especially important for platforms that operate in multiple markets or support user-generated content. It enables a more respectful experience for global audiences while providing consistent enforcement of safety standards.Use cases across the streaming ecosystemAI moderation isn’t just a concern for social platforms. It plays a growing role in nearly every streaming vertical, including the following:Live sports: Real-time content scanning helps block offensive chants, gestures, or pitch-side incidents before they reach a wide audience. Fast filtering protects the viewer experience and helps meet broadcast standards.Esports: With millions of viewers and high emotional stakes, esports platforms rely on AI to remove hate speech and adult content from chat, visuals, and commentary. This creates a more inclusive environment for fans and sponsors alike.Corporate live events: From earnings calls to virtual town halls, organizations use AI moderation to help ensure compliance with internal communication guidelines and protect their reputation.Online learning: EdTech platforms use AI to keep classrooms safe and focused. 
Moderation helps filter distractions, harassment, and inappropriate material in both live and recorded sessions.On-demand entertainment: Even outside of live broadcasts, moderation helps streaming providers meet content standards and licensing obligations across global markets. It also ensures user-submitted content (like comments or video uploads) meets platform guidelines.In each case, the shared goal is to provide a safe and trusted streaming environment for users, advertisers, and creators.Balancing automation with human oversightAI moderation is a powerful tool, but it shouldn’t be the only one. The best systems combine automation with clear review workflows, configurable thresholds, and human input.False positives and edge cases are inevitable. Giving moderators the ability to review, override, or explain decisions is important for both quality control and user trust.Likewise, giving users a way to appeal moderation decisions or report issues ensures that moderation doesn’t become a black box. Transparency and user empowerment are increasingly seen as part of good platform governance.Looking ahead: what’s next for AI moderationAs streaming becomes more interactive and immersive, moderation will need to evolve. AI systems will be expected to handle not only traditional video and chat, but also spatial audio, avatars, and real-time user inputs in virtual environments.We can also expect increased demand for:Personalization, where viewers can set their own content preferencesIntegration with platform APIs for programmatic content governanceCross-platform consistency to support syndicated content across partnersAs these changes unfold, AI moderation will remain central to the success of modern streaming. 
Platforms that adopt scalable, adaptive moderation systems now will be better positioned to meet the next generation of content challenges without compromising on speed, safety, or user experience.Keep your streaming content safe and compliant with GcoreGcore Video Streaming offers AI Content Moderation that satisfies today’s digital safety concerns while streamlining the human moderation process.To explore how Gcore AI Content Moderation can transform your digital platform, we invite you to contact our streaming team for a demonstration. Our docs provide guidance for using our intuitive Gcore Customer Portal to manage your streaming content. We also provide a clear pricing comparison so you can assess the value for yourself.Embrace the future of content moderation and deliver a safer, more compliant digital space for all your users.Try AI Content Moderation for freeTry AI Content Moderation for free

Deploy GPT-OSS-120B privately on Gcore

OpenAI’s release of GPT-OSS-120B is a turning point for LLM developers. It’s a 120B parameter model trained from scratch, licensed for commercial use, and available with open weights. This is a serious asset for serious builders.Gcore now supports private GPT-OSS-120B deployments via our Everywhere Inference platform. That means you can stand up your own endpoint in minutes, run inference at scale, and control the full stack, without API limits, vendor lock-in, or hidden usage fees. Just fast, secure, controlled deployment on your terms. Deploy now in three clicks or read on to learn more.Why GPT-OSS-120B is big news for buildersThis model changes the game for anyone developing AI apps, platforms, or infrastructure. It brings GPT-3-level reasoning to the open-source ecosystem and frees developers from closed APIs.With GPT-OSS-120B, you get:Full access to model weights and architectureSelf-hosting for maximum data control and privacySupport for fine-tuning and model editingOffline deployment for secure or air-gapped useMassive cost savings at scaleYou can deploy in any Gcore region (or leverage Gcore’s three-click serverless inference on your own infrastructure), route traffic through your own stack, and fully control load, latency, and logs. This is LLM deployment for real-world apps, not just playground prompts.How to deploy GPT-OSS-120B with Gcore Everywhere InferenceGcore Everywhere Inference gives you a clean path from open model to production endpoint. You can spin up a dedicated deployment in just three clicks. We offer configuration options to suit your business needs:Choose your location (cloud or on-prem)Integrate via standard APIs (OpenAI-compatible)Control usage, autoscale, and costsDeploying GPT-OSS-120B on Gcore takes just three clicks in the Gcore Customer Portal.There are no shared endpoints. 
You get dedicated compute, low-latency routing, and full control and observability.You can also bring your own trained variant if you’ve fine-tuned GPT-OSS-120B elsewhere. We’ll help you host it reliably, close to your users.Use cases: where GPT-OSS-120B fits bestCommercial GPTs still outperform OSS models on some general tasks, but GPT-OSS-120B gives you control, portability, and flexibility where it counts. Most importantly, it gives you the ability to build privacy-sensitive applications.Great fits include:Internal dev tools and copilotsRetrieval-augmented generation (RAG) pipelinesSecure, private enterprise assistantsData-sensitive, on-prem AI workloadsModels requiring full customization or fine-tuningIt’s especially relevant for finance, healthcare, government, and legal teams operating under strict compliance rules.Deploy GPT-OSS-120B todayWant to learn more about GPT-OSS-120B and why Gcore is an ideal provider for deployment? Get all the information you need on our dedicated page.And if you’re ready to deploy in just three clicks, head on over to the Gcore Customer Portal. GPT-OSS-120B is waiting for you in the Application Catalog.Learn more about deploying GPT-OSS-120B on Gcore

Announcing new tools, apps, and regions for your real-world AI use cases

Three updates, one shared goal: helping builders move faster with AI. Our latest releases for Gcore Edge AI bring real-world AI deployments within reach, whether you’re a developer integrating genAI into a workflow, an MLOps team scaling inference workloads, or a business that simply needs access to performant GPUs in the UK.

MCP: make AI do more

Gcore’s MCP server implementation is now live on GitHub. The Model Context Protocol (MCP) is an open standard, originally developed by Anthropic, that turns AI models into agents that can carry out real-world tasks. It allows you to plug genAI models into everyday tools like Slack, email, Jira, and databases, so your genAI can read, write, and reason directly across systems. Think of it as a way to turn “give me a summary” into “send that summary to the right person and log the action.”

“AI needs to be useful, not just impressive. MCP is a critical step toward building AI systems that drive desirable business outcomes, like automating workflows, integrating with enterprise tools, and operating reliably at scale. At Gcore, we’re focused on delivering that kind of production-grade AI through developer-friendly services and top-of-the-range infrastructure that make real-world deployment fast and easy.”
Seva Vayner, Product Director of Edge Cloud and AI, Gcore

To get started, clone the repo, explore the toolsets, and test your own automations.

Gcore Application Catalog: inference without overhead

We’ve upgraded the Gcore Model Catalog into something even more powerful: an Application Catalog for AI inference. You can still deploy the latest open models with three clicks. But now, you can also tune, share, and scale them like real applications.

We’ve re-architected our inference solution so you can:

  • Run prefill and decode stages in parallel
  • Share KV cache across pods (it’s not tied to individual GPUs) from August 2025
  • Toggle WebUI and secure API independently from August 2025

These changes cut down on GPU memory usage, make deployments more flexible, and reduce time to first token, especially at scale. And because everything is application-based, you’ll soon be able to optimize for specific business goals like cost, latency, or throughput.

Here’s who benefits:

  • ML engineers can deploy high-throughput workloads without worrying about memory overhead
  • Backend developers get a secure API, no infra setup needed
  • Product teams can launch demos instantly with the WebUI toggle
  • Innovation labs can move from prototype to production without reconfiguring
  • Platform engineers get centralized caching and predictable scaling

The new Application Catalog is available now through the Gcore Customer Portal.

Chester data center: NVIDIA H200 capacity in the UK

Gcore’s newest AI cloud region is now live in Chester, UK. This marks our first UK location in partnership with Northern Data. Chester offers 2000 NVIDIA H200 GPUs with BlueField-3 DPUs for secure, high-throughput compute on Gcore GPU Cloud, serving your training and inference workloads. You can reserve your H200 GPU immediately via the Gcore Customer Portal.

This launch solves a growing problem: UK-based companies building with AI often face regional capacity shortages, long wait times, or poor performance when routing inference to overseas data centers. Chester fixes that with immediate availability on performant GPUs.

Whether you’re training LLMs or deploying inference for UK and European users, Chester offers local capacity, low latency, and strong availability.

Next steps

  • Explore the MCP server and start building agentic workflows
  • Try the new Application Catalog via the Gcore Customer Portal
  • Deploy your workloads in Chester for high-performance UK-based compute

Deploy your AI workload in three clicks today!

Gcore recognized as a Leader in the 2025 GigaOm Radar for AI Infrastructure


We’re proud to share that Gcore has been named a Leader in the 2025 GigaOm Radar for AI Infrastructure, the only European provider to earn a top-tier spot. GigaOm’s rigorous evaluation highlights our leadership in platform capability and innovation, and our expertise in delivering secure, scalable AI infrastructure.

Inside the GigaOm Radar: what’s behind the Leader status

The GigaOm Radar report is a respected industry analysis that evaluates top vendors in critical technology spaces. In this year’s edition, GigaOm assessed 14 of the world’s leading AI infrastructure providers, measuring their strengths across key technical and business metrics. It ranks providers based on factors such as scalability and performance, deployment flexibility, security and compliance, and interoperability.

Alongside the ranking, the report offers valuable insights into the evolving AI infrastructure landscape, including the rise of hybrid AI architectures, advances in accelerated computing, and the increasing adoption of edge deployment to bring AI closer to where data is generated. It also offers strategic takeaways for organizations seeking to build scalable, secure, and sovereign AI capabilities.

Why was Gcore named a top provider?

The specific areas in which Gcore stood out and earned its Leader status are as follows:

  • A comprehensive AI platform offering Everywhere Inference and GPU Cloud solutions that support scalable AI from model development to production
  • High performance powered by state-of-the-art NVIDIA A100, H100, H200 and GB200 GPUs and a global private network ensuring ultra-low latency
  • An extensive model catalogue with flexible deployment options across cloud, on-premises, hybrid, and edge environments, enabling tailored global AI solutions
  • Extensive capacity of cutting-edge GPUs and technical support in Europe, supporting European sovereign AI initiatives

Choosing Gcore AI is a strategic move for organizations prioritizing ultra-low latency, high performance, and flexible deployment options across cloud, on-premises, hybrid, and edge environments. Gcore’s global private network ensures low-latency processing for real-time AI applications, which is a key advantage for businesses with a global footprint.
GigaOm Radar, 2025

Discover more about the AI infrastructure landscape

At Gcore, we’re dedicated to driving innovation in AI infrastructure. GPU Cloud and Everywhere Inference empower organizations to deploy AI efficiently and securely, on their terms.

If you’re planning your AI infrastructure roadmap or rethinking your current one, this report is a must-read. Explore the report to discover how Gcore can support high-performance AI at scale and help you stay ahead in an AI-driven world.

Download the full report

Protecting networks at scale with AI security strategies

Network cyberattacks are no longer isolated incidents. They are a constant, relentless assault on network infrastructure, probing for vulnerabilities in routing, session handling, and authentication flows. With AI at their disposal, threat actors can move faster than ever, shifting tactics mid-attack to bypass static defenses.

Legacy systems, designed for simpler threats, cannot keep pace. Modern network security demands a new approach, combining real-time visibility, automated response, AI-driven adaptation, and decentralized protection to secure critical infrastructure without sacrificing speed or availability.

At Gcore, we believe security must move as fast as your network does. So, in this article, we explore how L3/L4 network security is evolving to meet new network security challenges and how AI strengthens defenses against today’s most advanced threats.

Smarter threat detection across complex network layers

Modern threats blend into legitimate traffic, using encrypted command-and-control, slow drip API abuse, and DNS tunneling to evade detection. Attackers increasingly embed credential stuffing into regular login activity. Without deep flow analysis, these attempts bypass simple rate limits and avoid triggering alerts until major breaches occur.

Effective network defense today means inspection at Layer 3 and Layer 4, looking at:

  • Traffic flow metadata (NetFlow, sFlow)
  • SSL/TLS handshake anomalies
  • DNS request irregularities
  • Unexpected session persistence behaviors

Gcore Edge Security applies real-time traffic inspection across multiple layers, correlating flows and behaviors across routers, load balancers, proxies, and cloud edges. Even slight anomalies in NetFlow exports or unexpected east-west traffic inside a VPC can trigger early threat alerts.

By combining packet metadata analysis, flow telemetry, and historical modeling, Gcore helps organizations detect stealth attacks long before traditional security controls react.

Automated response to contain threats at network speed

Detection is only half the battle. Once an anomaly is identified, defenders must act within seconds to prevent damage.

Real-world example: DNS amplification attack

If a volumetric DNS amplification attack begins saturating a branch office’s upstream link, automated systems can:

  • Apply ACL-based rate limits at the nearest edge router
  • Filter malicious traffic upstream before WAN degradation
  • Alert teams for manual inspection if thresholds escalate

Similarly, if lateral movement is detected inside a cloud deployment, dynamic firewall policies can isolate affected subnets before attackers pivot deeper.

Gcore’s network automation frameworks integrate real-time AI decision-making with response workflows, enabling selective throttling, forced reauthentication, or local isolation, all without disrupting legitimate users. Automation means threats are contained quickly, minimizing impact without crippling operations.

Hardening DDoS mitigation against evolving attack patterns

DDoS attacks have moved beyond basic volumetric floods. Today, attackers combine multiple tactics in coordinated strikes. Common attack vectors in modern DDoS include the following:

  • UDP floods targeting bandwidth exhaustion
  • SSL handshake floods overwhelming load balancers
  • HTTP floods simulating legitimate browser sessions
  • Adaptive multi-vector shifts changing methods mid-attack

Real-world case study: ISP under hybrid DDoS attack

In recent years, ISPs and large enterprises have faced hybrid DDoS attacks blending hundreds of gigabits per second of L3/4 UDP flood traffic with targeted SSL handshake floods. Attackers shift vectors dynamically to bypass static defenses and overwhelm infrastructure at multiple layers simultaneously. Static defenses fail in such cases because attackers change vectors every few minutes.

Building resilient networks through self-healing capabilities

Even the best defenses can be breached. When that happens, resilient networks must recover automatically to maintain uptime.

If BGP route flapping is detected on a peering session, self-healing networks can:

  • Suppress unstable prefixes
  • Reroute traffic through backup transit providers
  • Prevent packet loss and service degradation without manual intervention

Similarly, if a VPN concentrator faces resource exhaustion from targeted attack traffic, automated scaling can:

  • Spin up additional concentrators
  • Redistribute tunnel sessions dynamically
  • Maintain stable access for remote users

Gcore’s infrastructure supports self-healing capabilities by combining telemetry analysis, automated failover, and rapid resource scaling across core and edge networks. This resilience prevents localized incidents from escalating into major outages.

Securing the edge against decentralized threats

The network perimeter is now everywhere. Branches, mobile endpoints, IoT devices, and multi-cloud services all represent potential entry points for attackers.

Real-world example: IoT malware infection at the branch

Malware-infected IoT devices at a branch office can initiate outbound C2 traffic during low-traffic periods. Without local inspection, this activity can go undetected until aggregated telemetry reaches the central SOC, often too late.

Modern edge security platforms deploy the following:

  • Real-time traffic inspection at branch and edge routers
  • Behavioral anomaly detection at local points of presence
  • Automated enforcement policies blocking malicious flows immediately

Gcore’s edge nodes analyze flows and detect anomalies in near real time, enabling local containment before threats can propagate deeper into cloud or core systems. Decentralized defense shortens attacker dwell time, minimizes potential damage, and offloads pressure from centralized systems.

How Gcore is preparing networks for the next generation of threats

The threat landscape will only grow more complex. Attackers are investing in automation, AI, and adaptive tactics to stay one step ahead. Defending modern networks demands:

  • Full-stack visibility from core to edge
  • Adaptive defense that adjusts faster than attackers
  • Automated recovery from disruption or compromise
  • Decentralized detection and containment at every entry point

Gcore Edge Security delivers these capabilities, combining AI-enhanced traffic analysis, real-time mitigation, resilient failover systems, and edge-to-core defense. In a world where minutes of network downtime can cost millions, you can’t afford static defenses. We enable networks to protect critical infrastructure without sacrificing performance, agility, or resilience.

Move faster than attackers. Build AI-powered resilience into your network with Gcore.

Check out our docs to see how DDoS Protection protects your network
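
As an illustration of the self-healing idea described above for BGP route flapping (suppressing unstable prefixes), here is a minimal Python sketch of penalty-based flap damping. It is not Gcore’s implementation: the class name `FlapDamper`, its methods, and the thresholds (penalty per flap, suppress limit, reuse limit, half-life) are all assumptions chosen for illustration. Each flap adds a fixed penalty to a prefix, the penalty decays exponentially over time, a prefix is suppressed once the penalty crosses the suppress limit, and it is released again once the penalty decays below the reuse limit.

```python
from dataclasses import dataclass, field


@dataclass
class FlapDamper:
    """Hypothetical per-prefix flap damper with an exponentially decaying penalty."""

    # All four values are illustrative assumptions, not settings from the article.
    penalty_per_flap: float = 1000.0
    suppress_limit: float = 2000.0
    reuse_limit: float = 750.0
    half_life_s: float = 900.0
    # prefix -> (penalty, time of last update in seconds, suppressed flag)
    _state: dict = field(default_factory=dict)

    def _decayed(self, prefix, now):
        """Return the penalty decayed to `now`, plus the stored suppressed flag."""
        penalty, last, suppressed = self._state.get(prefix, (0.0, now, False))
        penalty *= 0.5 ** ((now - last) / self.half_life_s)
        return penalty, suppressed

    def record_flap(self, prefix, now):
        """Register one flap at time `now`; return True if the prefix is suppressed."""
        penalty, suppressed = self._decayed(prefix, now)
        penalty += self.penalty_per_flap
        if penalty >= self.suppress_limit:
            suppressed = True
        self._state[prefix] = (penalty, now, suppressed)
        return suppressed

    def is_suppressed(self, prefix, now):
        """Check suppression at time `now`, releasing the prefix once it has decayed."""
        penalty, suppressed = self._decayed(prefix, now)
        if suppressed and penalty < self.reuse_limit:
            suppressed = False
        self._state[prefix] = (penalty, now, suppressed)
        return suppressed


if __name__ == "__main__":
    damper = FlapDamper()
    prefix = "203.0.113.0/24"  # documentation-range prefix, used only as an example
    for t in (0, 10, 20):
        print(t, damper.record_flap(prefix, now=t))
    print("one hour later:", damper.is_suppressed(prefix, now=3620))
```

Using two separate thresholds (suppress above one, release below a lower one) gives the damper hysteresis, so a prefix that hovers near the limit does not toggle in and out of suppression. A real deployment would rely on the damping features of the routing platform rather than application code like this.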
