Checks whether the global quota is exceeded; if it is, the number of additional quota units needed to create the specified inference deployment is calculated.
API key for authentication. Make sure to include the word apikey, followed by a single space and then your token.
Example: apikey 1234$abcdef
Project ID.
Example: 1
List of containers for the inference instance.
Example:
[
  {
    "region_id": 1,
    "scale": { "max": 3, "min": 1 }
  }
]
Inference flavor name.
Example: "inference-16vcpu-232gib-1xh100-80gb"
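The authentication header and request body described above can be sketched as follows. This is a minimal illustration, not the official client: the helper name and the `flavor_name`/`containers` body field names are assumptions; only the `apikey <token>` header format, the containers list, and the flavor value come from the docs.

```python
import json

# Hypothetical helper that assembles headers and body for creating an
# inference deployment. Field names beyond those shown in the docs are
# assumptions, not the confirmed API schema.
def build_deployment_request(api_token: str, flavor: str, containers: list) -> dict:
    return {
        "headers": {
            # Auth scheme from the docs: the word "apikey", a single space, then your token.
            "Authorization": f"apikey {api_token}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({
            "flavor_name": flavor,      # assumed field name
            "containers": containers,   # list of containers, as documented
        }),
    }

request = build_deployment_request(
    "1234$abcdef",
    "inference-16vcpu-232gib-1xh100-80gb",
    [{"region_id": 1, "scale": {"max": 3, "min": 1}}],
)
```

The resulting dictionary can then be passed to any HTTP client; no request is sent here.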
OK
Inference CPU millicore count limit
8000
Inference CPU millicore count requested
3000
Inference CPU millicore count usage
2000
Inference GPU A100 count limit
4
Inference GPU A100 count requested
2
Inference GPU A100 count usage
1
Inference GPU H100 count limit
4
Inference GPU H100 count requested
2
Inference GPU H100 count usage
1
Inference GPU L40S count limit
4
Inference GPU L40S count requested
2
Inference GPU L40S count usage
1
Inference instance count limit
10
Inference instance count requested
1
Inference instance count usage
1
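The limit/requested/usage triples above feed the quota check described at the top of this section. A minimal sketch of that calculation, assuming "additional quota needed" means the shortfall once current usage plus the new request exceeds the limit (the function name and the second set of input values are hypothetical):

```python
# Hedged sketch of the quota-shortfall check. Given one quota entry's
# limit, current usage, and newly requested count, return how many
# additional quota units the deployment would need beyond the limit.
def additional_quota_needed(limit: int, usage: int, requested: int) -> int:
    # Anything the request pushes past the remaining headroom is the shortfall.
    return max(0, (usage + requested) - limit)

# H100 example values from the docs: limit 4, usage 1, requested 2 -> fits.
print(additional_quota_needed(4, 1, 2))  # → 0
# Hypothetical tighter case: usage 3 leaves only 1 unit of headroom.
print(additional_quota_needed(4, 3, 2))  # → 1
```

The same function applies unchanged to CPU millicores, instance counts, or any other quota dimension in the response.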