Authorizations
API key for authentication. Make sure to include the word apikey, followed by a single space and then your token.
Example: apikey 1234$abcdef
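As a rough illustration of the format described above, the header value could be assembled as follows. This is only a sketch: the header name Authorization and the helper name build_auth_header are assumptions made for the example. Only the apikey prefix and the single separating space come from the description.

```python
def build_auth_header(token: str) -> dict:
    """Return request headers carrying the API key.

    Assumption: the key is sent in an "Authorization" header.
    """
    if not token or token != token.strip():
        raise ValueError("token must be non-empty and have no surrounding whitespace")
    # The word "apikey", a single space, then the token.
    return {"Authorization": f"apikey {token}"}


headers = build_auth_header("1234$abcdef")
print(headers)
```

Rejecting tokens with surrounding whitespace guards against accidentally sending two spaces after apikey.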
Path Parameters
Project ID
1
Inference instance name.
"my-instance"
Response
OK
Address of the inference instance
"https://example.com"
true if the instance uses API key authentication. When enabled, every request to the instance must include either an "Authorization": "Bearer *****" header or an "X-Api-Key": "*****" header.
false
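A minimal sketch of how a client might choose the headers for calls to the instance itself, based on this flag. The function name instance_request_headers and its parameters are hypothetical. The two header forms are the ones named in the description.

```python
from typing import Dict, Optional


def instance_request_headers(
    auth_enabled: bool,
    api_key: Optional[str] = None,
    use_bearer: bool = True,
) -> Dict[str, str]:
    """Build headers for a request to the inference instance.

    auth_enabled mirrors the boolean flag described above
    (hypothetical parameter name).
    """
    if not auth_enabled:
        # No authentication header is required when the flag is false.
        return {}
    if not api_key:
        raise ValueError("an API key is required when authentication is enabled")
    if use_bearer:
        return {"Authorization": f"Bearer {api_key}"}
    return {"X-Api-Key": api_key}


print(instance_request_headers(True, "key1"))
print(instance_request_headers(True, "key1", use_bearer=False))
```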
Command to be executed when running a container from an image.
List of containers for the inference instance
Inference instance creation date in ISO 8601 format.
"2023-08-22T11:21:00Z"
Registry credentials name
"dockerhub"
Inference instance description.
"My first instance"
Environment variables for the inference instance
{ "DEBUG_MODE": "False", "KEY": "12345" }
Flavor name for the inference instance
"inference-16vcpu-232gib-1xh100-80gb"
Docker image for the inference instance. This field should contain the image name and tag in the format 'name:tag', e.g., 'nginx:latest'. It defaults to Docker Hub as the image registry, but any accessible Docker image URL can be specified.
"nginx:latest"
Ingress options for the inference instance
{ "disable_response_buffering": true }
Listening port for the inference instance.
8080
Logging configuration for the inference instance
{
"destination_region_id": 1,
"enabled": true,
"retention_policy": { "period": 45 },
"topic_name": "my-log-name"
}
Inference instance name.
"my-instance"
Indicates which parent object this inference instance belongs to.
Probes configured for all containers of the inference instance.
Project ID. If not provided, your default project ID will be used.
1
Inference instance status.
Value can be one of the following:
DEPLOYING - The instance is being deployed. Containers are not yet created.
PARTIALLYDEPLOYED - All containers have been created, but some may not be ready yet. Instances stuck in this state typically indicate either that an image is still being pulled or that a failure has occurred. In the latter case, the error_message field of the respective container object in the containers collection explains the failure reason.
ACTIVE - The instance is running and ready to accept requests.
DISABLED - The instance is disabled and not accepting any requests.
PENDING - The instance is running but scaled to zero. It will be automatically scaled up when a request is made.
DELETING - The instance is being deleted.
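The status values above could be interpreted on the client side along these lines. This is a sketch under assumptions: the response key "status" and the helper names are hypothetical, and the grouping into ready, waiting, failed, and unavailable is an illustrative choice. The containers collection and its error_message field are the ones mentioned in the PARTIALLYDEPLOYED description.

```python
from typing import Any, Dict, List

# ACTIVE accepts requests; PENDING scales up automatically on a request.
CAN_RECEIVE_REQUESTS = {"ACTIVE", "PENDING"}
IN_PROGRESS = {"DEPLOYING", "PARTIALLYDEPLOYED"}
UNAVAILABLE = {"DISABLED", "DELETING"}


def deployment_errors(instance: Dict[str, Any]) -> List[str]:
    """Collect non-empty error_message values from the containers collection."""
    return [
        c["error_message"]
        for c in instance.get("containers", [])
        if c.get("error_message")
    ]


def summarize(instance: Dict[str, Any]) -> str:
    """Map an instance object to a coarse client-side state."""
    status = instance["status"]  # assumed key name
    if status in CAN_RECEIVE_REQUESTS:
        return "ready"
    if status == "PARTIALLYDEPLOYED" and deployment_errors(instance):
        # A container reported a failure reason, so waiting longer will not help.
        return "failed"
    if status in IN_PROGRESS:
        return "waiting"
    if status in UNAVAILABLE:
        return "unavailable"
    raise ValueError(f"unknown status: {status!r}")


print(summarize({"status": "PENDING"}))
```

Checking error_message before treating PARTIALLYDEPLOYED as "waiting" lets a polling loop stop early instead of waiting on a deployment that has already failed.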
ACTIVE, DELETING, DEPLOYING, DISABLED, PARTIALLYDEPLOYED, PENDING
Specifies the duration in seconds without any requests after which the containers will be downscaled to their minimum scale value as defined by scale.min. If set, this helps in optimizing resource usage by reducing the number of container instances during periods of inactivity.
x >= 0
120
List of API keys for the inference instance
["key1", "key2"]