
# Overview

Gcore Everywhere Inference deploys trained AI models on edge inference nodes across 180+ locations worldwide. Running models close to users keeps response times low, and there is no infrastructure to manage. The service suits latency-sensitive workloads in fintech, healthcare, gaming, media, and industrial applications.

Gcore routes end-user queries to the nearest running model using [anycast endpoints](https://gcore.com/learning/what-is-anycast). Smart Routing selects the closest inference region through a single endpoint, so you do not need to handle scaling, routing, or node monitoring yourself.

## How Everywhere Inference works

Everywhere Inference combines two technologies:

1. **Edge network** — provides low latency via anycast balancing, smart routing, and built-in DDoS and bot protection.
2. **Serverless flexible GPU infrastructure** — enables deployment of [Application Catalog](/edge-ai/everywhere-inference/application-catalog) models or [custom models](/edge-ai/everywhere-inference/ai-models/prepare-a-custom-ai-model-for-deployment) on purpose-built NVIDIA GPUs.

<Frame>
  <img src="https://mintcdn.com/gcore/yKie2I_xdaIxD6qJ/images/docs/edge-ai/everywhere-inference/overview/overview-1.png?fit=max&auto=format&n=yKie2I_xdaIxD6qJ&q=85&s=d0ee85d6f5bf25356b4a95bc4a5647ce" alt="How Smart Routing works to speed up requests via Gcore Everywhere Inference" width="1675" height="2500" data-path="images/docs/edge-ai/everywhere-inference/overview/overview-1.png" />
</Frame>

Gcore uses [Healthchecks](/dns/dns-failover/configure-and-use-dns-failover) to monitor pod availability. If a pod in one region goes down, requests are automatically routed to the next-closest inference region.

<Frame>
  <img src="https://mintcdn.com/gcore/QIEAnezmG8Bl8nMk/images/docs/cloud/everywhere-inference/about-everywhere-inference/smart-routing-map.png?fit=max&auto=format&n=QIEAnezmG8Bl8nMk&q=85&s=7777917e2c8972fd95683e1e9db38c3a" alt="Healthchecks redirects traffic to the next-closest edge node if the closest node is unavailable" width="1520" height="816" data-path="images/docs/cloud/everywhere-inference/about-everywhere-inference/smart-routing-map.png" />
</Frame>
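
The failover behavior described above can be illustrated with a minimal sketch. This is not Gcore's implementation: the `Region` type, the region names, and the latency values are hypothetical, and "closest" is approximated here by a latency figure. The sketch only shows the general idea of picking the nearest region whose health check currently passes.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Region:
    name: str          # hypothetical region label, not a real Gcore region ID
    latency_ms: float  # hypothetical latency from the client to this region
    healthy: bool      # outcome of the most recent health check


def pick_region(regions: Iterable[Region]) -> Optional[Region]:
    """Return the lowest-latency healthy region, or None if none is healthy."""
    healthy = [r for r in regions if r.healthy]
    return min(healthy, key=lambda r: r.latency_ms, default=None)


# Illustrative data: the closest region is down, so the next-closest is chosen.
regions = [
    Region("region-a", 12.0, healthy=False),
    Region("region-b", 27.0, healthy=True),
    Region("region-c", 85.0, healthy=True),
]

chosen = pick_region(regions)
print(chosen.name if chosen else "no healthy region")
```

In this example `region-a` is nearest but fails its health check, so the request goes to `region-b`. In the actual service this decision happens inside the network, and clients only ever address the single endpoint.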

## Supported VM flavors

The hardware options available to you depend on your account limits and region. To unlock GPU access or add more deployments, [submit a quota request](/edge-ai/everywhere-inference/quotas/request-quota-increase).

| **vGPUs** | **vCPUs** | **Memory (GiB)** |
| --------- | --------- | ---------------- |
| —         | 4         | 16               |
| —         | 8         | 32               |
| 1xL40S    | 16        | 232              |
| 2xL40S    | 32        | 464              |
| 1xH100    | 16        | 232              |
| 2xH100    | 32        | 464              |
| 4xH100    | 64        | 928              |
| 1xA100    | 16        | 232              |
| 2xA100    | 32        | 464              |
| 4xA100    | 64        | 928              |
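
As a planning aid, the table above can be encoded as data and filtered locally. The sketch below is an assumption-laden helper, not part of any Gcore API: the `Flavor` type and `smallest_flavor` function are hypothetical names, and the result says nothing about whether a flavor is actually available in your account or region.

```python
from typing import NamedTuple, Optional


class Flavor(NamedTuple):
    gpu_count: int            # 0 for CPU-only flavors
    gpu_model: Optional[str]  # None for CPU-only flavors
    vcpus: int
    memory_gib: int


# Values copied from the table above.
FLAVORS = [
    Flavor(0, None, 4, 16),
    Flavor(0, None, 8, 32),
    Flavor(1, "L40S", 16, 232),
    Flavor(2, "L40S", 32, 464),
    Flavor(1, "H100", 16, 232),
    Flavor(2, "H100", 32, 464),
    Flavor(4, "H100", 64, 928),
    Flavor(1, "A100", 16, 232),
    Flavor(2, "A100", 32, 464),
    Flavor(4, "A100", 64, 928),
]


def smallest_flavor(
    gpu_model: Optional[str],
    min_gpus: int = 0,
    min_memory_gib: int = 0,
) -> Optional[Flavor]:
    """Return the smallest listed flavor meeting the requirements, or None."""
    candidates = [
        f
        for f in FLAVORS
        if f.gpu_model == gpu_model
        and f.gpu_count >= min_gpus
        and f.memory_gib >= min_memory_gib
    ]
    # Order by GPU count first, then vCPUs, then memory.
    return min(
        candidates,
        key=lambda f: (f.gpu_count, f.vcpus, f.memory_gib),
        default=None,
    )


print(smallest_flavor("H100", min_memory_gib=500))
print(smallest_flavor(None, min_memory_gib=20))
```

Passing `gpu_model=None` restricts the search to the CPU-only rows. A request that no row satisfies, such as four L40S GPUs, returns `None`.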
