POST /cloud/v3/inference/{project_id}/deployments/check_limits

Check inference deployment quota
curl --request POST \
  --url https://api.gcore.com/cloud/v3/inference/{project_id}/deployments/check_limits \
  --header 'Authorization: <api-key>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "containers": [
    {
      "region_id": 1,
      "scale": {
        "max": 3,
        "min": 1
      }
    }
  ],
  "flavor_name": "inference-16vcpu-232gib-1xh100-80gb"
}
'
{
  "inference_cpu_millicore_count_limit": 8000,
  "inference_cpu_millicore_count_requested": 3000,
  "inference_cpu_millicore_count_usage": 2000,
  "inference_gpu_a100_count_limit": 4,
  "inference_gpu_a100_count_requested": 2,
  "inference_gpu_a100_count_usage": 1,
  "inference_gpu_h100_count_limit": 4,
  "inference_gpu_h100_count_requested": 2,
  "inference_gpu_h100_count_usage": 1,
  "inference_gpu_l40s_count_limit": 4,
  "inference_gpu_l40s_count_requested": 2,
  "inference_gpu_l40s_count_usage": 1,
  "inference_instance_count_limit": 10,
  "inference_instance_count_requested": 1,
  "inference_instance_count_usage": 1
}
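The curl call above can also be issued from Python. A minimal sketch using only the standard library; the helper names are our own, while the URL, header format, and body shape are taken from this page:

```python
import json
import urllib.request

BASE_URL = "https://api.gcore.com/cloud/v3/inference"

def build_check_limits_url(project_id: int) -> str:
    """Substitute the project ID into the endpoint path."""
    return f"{BASE_URL}/{project_id}/deployments/check_limits"

def check_limits(api_key: str, project_id: int, body: dict) -> dict:
    """POST the quota-check request and return the parsed JSON response."""
    request = urllib.request.Request(
        build_check_limits_url(project_id),
        data=json.dumps(body).encode(),
        headers={
            # Per the Authorizations section: the word "apikey",
            # one space, then the token.
            "Authorization": f"apikey {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())
```

`check_limits("1234$abcdef", 1, {...})` would mirror the curl request shown above.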

Authorizations

Authorization
string
header
required

API key for authentication. Make sure to include the word apikey, followed by a single space and then your token. Example: apikey 1234$abcdef

Path Parameters

project_id
integer
required

Project ID

Example:

1

Body

application/json
containers
ContainerInSerializerV3 · object[]
required

List of containers for the inference instance.

Minimum array length: 1
Example:
[
  {
    "region_id": 1,
    "scale": { "max": 3, "min": 1 }
  }
]
flavor_name
string
required

Inference flavor name.

Minimum string length: 1
Example:

"inference-16vcpu-232gib-1xh100-80gb"

Response

200 - application/json

OK

inference_cpu_millicore_count_limit
integer

Inference CPU millicore count limit

Example:

8000

inference_cpu_millicore_count_requested
integer

Inference CPU millicore count requested

Example:

3000

inference_cpu_millicore_count_usage
integer

Inference CPU millicore count usage

Example:

2000

inference_gpu_a100_count_limit
integer

Inference GPU A100 count limit

Example:

4

inference_gpu_a100_count_requested
integer

Inference GPU A100 count requested

Example:

2

inference_gpu_a100_count_usage
integer

Inference GPU A100 count usage

Example:

1

inference_gpu_h100_count_limit
integer

Inference GPU H100 count limit

Example:

4

inference_gpu_h100_count_requested
integer

Inference GPU H100 count requested

Example:

2

inference_gpu_h100_count_usage
integer

Inference GPU H100 count usage

Example:

1

inference_gpu_l40s_count_limit
integer

Inference GPU L40S count limit

Example:

4

inference_gpu_l40s_count_requested
integer

Inference GPU L40S count requested

Example:

2

inference_gpu_l40s_count_usage
integer

Inference GPU L40S count usage

Example:

1

inference_instance_count_limit
integer

Inference instance count limit

Example:

10

inference_instance_count_requested
integer

Inference instance count requested

Example:

1

inference_instance_count_usage
integer

Inference instance count usage

Example:

1
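Each resource in the response follows the same limit/requested/usage triple, so a deployment fits only when current usage plus the requested amount stays within the limit for every resource. A sketch of that interpretation; the helper is ours, and reading the three fields this way is an assumption based on their names:

```python
def quota_report(response: dict) -> dict:
    """For each *_count_limit field, report remaining headroom and
    whether the requested amount fits alongside current usage."""
    report = {}
    for key, limit in response.items():
        if not key.endswith("_count_limit"):
            continue
        prefix = key[: -len("_limit")]  # e.g. inference_gpu_h100_count
        requested = response.get(prefix + "_requested", 0)
        usage = response.get(prefix + "_usage", 0)
        report[prefix] = {
            "remaining": limit - usage,
            "fits": usage + requested <= limit,
        }
    return report
```

Applied to the example response above, the H100 entry yields a remaining headroom of 3 GPUs, and the 2 requested fit alongside the 1 in use.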