
Manage AI model deployments

The deployments section allows you to quickly access your active deployments, their configurations, and logs. You can also check and modify the details of your model deployments.

All deployments

You can open the list of deployments in the Gcore Customer Portal by navigating to Everywhere Inference > Deployments.


Deployment details

The following deployment details are displayed in the table:

  • Name
  • ID
  • Endpoint URL
  • Creation time
  • Deployment status
  • Running status (i.e., the number and location of replicas)
  • Last status message (i.e., the last error reported by one of the replicas)

Some of the details aren’t directly visible; you can view them in the following ways:

  • Deployment ID: Hover your mouse cursor over an ID icon.
  • Endpoint URL: Hover your mouse cursor over the i icon (see the usage example after this list).
  • Replica locations: Hover your mouse cursor over the numbers in the Running status column.
  • Last status message: Click on View details.
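
Once you have the Endpoint URL, you can send inference requests to it from your own code. The sketch below is a minimal Python example; it assumes your model image exposes an OpenAI-compatible /v1/chat/completions route and that the deployment expects an API key as a bearer token. Both are assumptions, so adjust the path, payload, and authentication to whatever your image and deployment settings actually use.

```python
import requests

# Placeholders: copy the Endpoint URL from the deployments table and use
# whatever credentials your deployment is configured with.
ENDPOINT_URL = "https://<your-deployment-endpoint>"
API_KEY = "<your-api-key>"

# Assumption: the model image serves an OpenAI-compatible chat route.
# Change the path and body if your image exposes a different interface.
response = requests.post(
    f"{ENDPOINT_URL}/v1/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "my-model",  # model name expected by your image
        "messages": [{"role": "user", "content": "Hello!"}],
    },
    timeout=60,
)
response.raise_for_status()
print(response.json())
```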

Start, stop, and delete deployments

The action menu on the right side of the table, which appears as three horizontal dots, allows you to start, stop, or delete a deployment.


Hardware, autoscaling, and environment variables

If you want to view or change the hardware, autoscaling settings, or environment variables, click on the deployment name or the Overview action of a deployment. This will open the Deployment overview.

Changes aren’t saved automatically, so click the Save changes button at the bottom to apply them.

Click the Settings tab to view this deployment’s hardware, autoscaling configuration, and environment variables.

Under Pod configuration, you can change the hardware that runs your model by selecting a Processor type and a VM Flavor from the dropdown.

A hardware change can have performance implications because each replica has to shut down to apply the changes.

You can also change the autoscaling limits and triggers here.

  • The Cooldown period defines the number of seconds the autoscaler waits after starting a new pod before it reacts to autoscaling triggers again.
  • The Pod lifetime defines the number of seconds the autoscaler waits before shutting down idle pods.
It’s a good practice to keep at least two pods running so deployment changes don’t cause downtime for your users.

You can define Autoscaling triggers for the following conditions:

  • CPU/GPU utilization
  • Memory consumption
  • Request frequency

Like the Pod lifetime, the Scaling down timeout lets you define the period the autoscaler waits for a request before stopping idle pods.

Test the models before deploying them to production so you know how much load they can handle. If your thresholds are too low, you might end up with underutilized pods; if they’re too high, requests might fail.
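
To get a feel for how a utilization trigger translates into replica counts, the sketch below applies the proportional rule that utilization-based autoscalers commonly use: desired replicas ≈ current replicas × current utilization ÷ target utilization. It illustrates the general principle, not Gcore’s exact algorithm, and all the numbers are invented for the example.

```python
import math

def desired_replicas(current_replicas: int, current_utilization: float,
                     target_utilization: float,
                     min_replicas: int = 2, max_replicas: int = 10) -> int:
    """Proportional scaling rule used by many utilization-based autoscalers.

    Illustrative only: Gcore's autoscaler may compute this differently.
    """
    desired = math.ceil(current_replicas * current_utilization / target_utilization)
    return max(min_replicas, min(max_replicas, desired))

# 2 replicas at 90% GPU utilization with an 80% trigger -> scale up to 3.
print(desired_replicas(2, current_utilization=0.90, target_utilization=0.80))

# A trigger set too low (30%) adds pods even under modest load -> 4 replicas.
print(desired_replicas(2, current_utilization=0.50, target_utilization=0.30))
```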

You can define environment variables at the bottom of the Settings tab. Activate the Set environment variables toggle to see the variable form.

Environment variables are strings. If entering numbers or booleans, ensure your AI model image can handle the conversion.
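
Because every variable arrives as a string, the model image has to do its own type conversion. The snippet below is a minimal sketch of how a Python-based image might parse numeric and boolean values; the variable names MAX_BATCH_SIZE and ENABLE_CACHE are made up for illustration.

```python
import os

def env_int(name: str, default: int) -> int:
    """Read an environment variable and convert it to an int."""
    return int(os.environ.get(name, default))

def env_bool(name: str, default: bool = False) -> bool:
    """Treat 'true', '1', and 'yes' (case-insensitive) as True."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in ("true", "1", "yes")

# Hypothetical variables entered in the Settings tab.
MAX_BATCH_SIZE = env_int("MAX_BATCH_SIZE", 8)
ENABLE_CACHE = env_bool("ENABLE_CACHE", False)
```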

Deployment logs

If you want to view the logs of your pods, click on the deployment name or the Overview action of a deployment. This will open the Deployment overview.


Click the Logs tab to view this deployment’s latest log output. You can switch between replicas with the Region dropdown.


Active AI model image

If you want to view or change the running image, click on the deployment name or the Overview action of a deployment. This will open the Deployment overview.


Click the Model image tab to view this deployment’s currently configured AI model image.


You can change the image by entering a new Model image URL and clicking Save changes.

If you want to switch to a private registry, you must add it to the Registry section before you switch the Registry type; otherwise, it will not appear in the Registry dropdown.

Check out our registry guide to learn more about creating and managing private AI image registries.
