# Create inference application deployment

> Creates a new application deployment based on a selected catalog application.
Specify the desired deployment name, target regions, and configuration for each component.
The platform will provision the necessary resources and initialize the application accordingly.
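
The request body described above can be assembled and checked locally before it is sent. The following is a minimal sketch, not part of the Gcore SDK: the helper `build_deployment_body` and the deployment name `demo-deploy` are hypothetical, and the checks cover only the constraints stated in the schema on this page (the required fields and the 15-character limit on `name`).

```python
import json

# Fields marked as required in CreateComponentConfiguration on this page.
REQUIRED_COMPONENT_FIELDS = ("flavor", "exposed", "scale")
# maxLength of `name` in AppDeploymentCreateRequest.
NAME_MAX_LENGTH = 15


def build_deployment_body(name, application_name, regions,
                          components_configuration, api_keys=None):
    """Return a request body dict; raise ValueError on a schema violation.

    Hypothetical helper for illustration only.
    """
    if len(name) > NAME_MAX_LENGTH:
        raise ValueError(f"name exceeds {NAME_MAX_LENGTH} characters: {name!r}")
    for component, config in components_configuration.items():
        missing = [f for f in REQUIRED_COMPONENT_FIELDS if f not in config]
        if missing:
            raise ValueError(f"component {component!r} is missing {missing}")
        scale = config["scale"]
        if "min" not in scale or "max" not in scale:
            raise ValueError(f"component {component!r} needs scale.min and scale.max")
    body = {
        "application_name": application_name,
        "name": name,
        "regions": list(regions),
        "components_configuration": components_configuration,
    }
    if api_keys is not None:  # api_keys is the only optional top-level field
        body["api_keys"] = list(api_keys)
    return body


body = build_deployment_body(
    name="demo-deploy",  # placeholder name, 11 characters
    application_name="demo-app",
    regions=[1, 2],
    components_configuration={
        "model": {
            "exposed": True,
            "flavor": "inference-16vcpu-232gib-1xh100-80gb",
            "scale": {"min": 1, "max": 1},
        }
    },
)
print(json.dumps(body, indent=2))
```

The resulting dictionary has the same shape as the JSON body of `POST /cloud/v3/inference/applications/{project_id}/deployments`. The server remains the authority on validation, so a local check like this only catches the obvious mistakes early.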



## OpenAPI

````yaml /api-reference/services_documented/cloud_api.yaml post /cloud/v3/inference/applications/{project_id}/deployments
openapi: 3.1.0
info:
  title: Gcore OpenAPI – Cloud API
  description: >-
    This is an aggregated OpenAPI specification that unifies all Gcore
    products into a single file. It covers Cloud, CDN, DNS, WAAP, DDoS
    Protection, Object Storage, Streaming, and FastEdge services.
  version: '2026-05-11T15:10:30.328297+00:00'
servers:
  - url: https://api.gcore.com
security:
  - APIKey: []
tags:
  - name: Bare Metal
    x-displayName: Bare Metal
  - name: Container as a Service
    x-displayName: Container as a Service
  - name: Cost Reports
    x-displayName: Cost Reports
  - name: DDoS Protection
    x-displayName: DDoS Protection
  - name: Everywhere Inference
    x-displayName: Everywhere Inference
  - name: Everywhere Inference Apps
    x-displayName: Everywhere Inference Apps
  - name: File Shares
    x-displayName: File Shares
  - name: Floating IPs
    x-displayName: Floating IPs
  - name: Function as a Service
    x-displayName: Function as a Service
  - name: GPU Bare Metal
    x-displayName: GPU Bare Metal
  - name: GPU Virtual
    x-displayName: GPU Virtual
  - name: IP Ranges
    x-displayName: IP Ranges
  - name: Images
    x-displayName: Images
  - name: Instances
    x-displayName: Instances
  - name: Load Balancers
    x-displayName: Load Balancers
  - name: Logging
    x-displayName: Logging
  - name: Managed Kubernetes
    x-displayName: Managed Kubernetes
  - name: Managed PostgreSQL
    x-displayName: Managed PostgreSQL
  - name: Networks
    x-displayName: Networks
  - name: Placement Groups
    x-displayName: Placement Groups
  - name: Projects
    x-displayName: Projects
  - name: Quotas
    x-displayName: Quotas
  - name: Regions
    x-displayName: Regions
  - name: Registry
    x-displayName: Registry
  - name: Reservations
    x-displayName: Reservations
  - name: Reserved IPs
    x-displayName: Reserved IPs
  - name: Routers
    x-displayName: Routers
  - name: SSH Keys
    x-displayName: SSH Keys
  - name: Secrets
    x-displayName: Secrets
  - name: Security Groups
    x-displayName: Security Groups
  - name: Snapshot Schedules
    x-displayName: Snapshot Schedules
  - name: Snapshots
    x-displayName: Snapshots
  - name: Tasks
    x-displayName: Tasks
  - name: User Actions
    x-displayName: User Actions
  - name: User Role Assignments
    x-displayName: User Role Assignments
  - name: Volumes
    x-displayName: Volumes
paths:
  /cloud/v3/inference/applications/{project_id}/deployments:
    post:
      tags:
        - Everywhere Inference Apps
      summary: Create inference application deployment
      description: >-
        Creates a new application deployment based on a selected catalog
        application.

        Specify the desired deployment name, target regions, and configuration
        for each component.

        The platform will provision the necessary resources and initialize the
        application accordingly.
      operationId: InferenceApplicationDeploymentsCollection.post
      parameters:
        - in: path
          name: project_id
          required: true
          description: Project ID
          schema:
            description: Project ID
            example: 1
            examples:
              - 1
            title: Project Id
            type: integer
      requestBody:
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/AppDeploymentCreateRequest'
      responses:
        '200':
          description: OK
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/TaskIDsSerializer'
      x-codeSamples:
        - lang: Python
          source: >-
            import os

            from gcore import Gcore


            client = Gcore(
                api_key=os.environ.get("GCORE_API_KEY"),  # This is the default and can be omitted
            )

            task_id_list =
            client.cloud.inference.applications.deployments.create(
                project_id=1,
                application_name="demo-app",
                components_configuration={
                    "model": {
                        "exposed": True,
                        "flavor": "inference-16vcpu-232gib-1xh100-80gb",
                        "scale": {
                            "max": 1,
                            "min": 1,
                        },
                    }
                },
                name="name",
                regions=[1, 2],
            )

            print(task_id_list.tasks)
        - lang: Go
          source: "package main\n\nimport (\n\t\"context\"\n\t\"fmt\"\n\n\t\"github.com/G-Core/gcore-go\"\n\t\"github.com/G-Core/gcore-go/cloud\"\n\t\"github.com/G-Core/gcore-go/option\"\n)\n\nfunc main() {\n\tclient := gcore.NewClient(\n\t\toption.WithAPIKey(\"My API Key\"),\n\t)\n\ttaskIDList, err := client.Cloud.Inference.Applications.Deployments.New(context.TODO(), cloud.InferenceApplicationDeploymentNewParams{\n\t\tProjectID:       gcore.Int(1),\n\t\tApplicationName: \"demo-app\",\n\t\tComponentsConfiguration: map[string]cloud.InferenceApplicationDeploymentNewParamsComponentsConfiguration{\n\t\t\t\"model\": {\n\t\t\t\tExposed: true,\n\t\t\t\tFlavor:  \"inference-16vcpu-232gib-1xh100-80gb\",\n\t\t\t\tScale: cloud.InferenceApplicationDeploymentNewParamsComponentsConfigurationScale{\n\t\t\t\t\tMax: 1,\n\t\t\t\t\tMin: 1,\n\t\t\t\t},\n\t\t\t},\n\t\t},\n\t\tName:    \"name\",\n\t\tRegions: []int64{1, 2},\n\t})\n\tif err != nil {\n\t\tpanic(err.Error())\n\t}\n\tfmt.Printf(\"%+v\\n\", taskIDList.Tasks)\n}\n"
components:
  schemas:
    AppDeploymentCreateRequest:
      properties:
        api_keys:
          description: List of API keys for the application
          example:
            - key1
            - key2
          examples:
            - - key1
              - key2
          items:
            type: string
          title: Api Keys
          type: array
        application_name:
          description: Identifier of the application from the catalog
          example: demo-app
          examples:
            - demo-app
          title: Application Name
          type: string
        components_configuration:
          additionalProperties:
            $ref: '#/components/schemas/CreateComponentConfiguration'
          description: >-
            Mapping of component names to their configuration (e.g., `"model":
            {...}`)
          example:
            model:
              exposed: true
              flavor: inference-16vcpu-232gib-1xh100-80gb
              scale:
                max: 1
                min: 1
          examples:
            - model:
                exposed: true
                flavor: inference-16vcpu-232gib-1xh100-80gb
                scale:
                  max: 1
                  min: 1
          title: Components Configuration
          type: object
        name:
          description: Desired name for the new deployment
          example: my-deployment
          examples:
            - my-deployment
          maxLength: 15
          title: Name
          type: string
        regions:
          description: IDs of the geographical regions where the deployment should be created
          example:
            - 1
            - 2
          examples:
            - - 1
              - 2
          items:
            type: integer
          title: Regions
          type: array
      required:
        - application_name
        - name
        - regions
        - components_configuration
      title: AppDeploymentCreateRequest
      type: object
    TaskIDsSerializer:
      properties:
        tasks:
          description: >-
            List of task IDs representing asynchronous operations. Use these IDs
            to monitor operation progress:

            - `GET /v1/tasks/{task_id}` - Check individual task status and
            details

            Poll task status until completion (`FINISHED`/`ERROR`) before
            proceeding with dependent operations.
          example:
            - d478ae29-dedc-4869-82f0-96104425f565
          examples:
            - - d478ae29-dedc-4869-82f0-96104425f565
          items:
            type: string
          title: Tasks
          type: array
      required:
        - tasks
      title: TaskIDsSerializer
      type: object
    CreateComponentConfiguration:
      properties:
        exposed:
          description: >-
            Whether the component should be exposed via a public endpoint (e.g.,
            for external inference/API access).
          title: Exposed
          type: boolean
        flavor:
          description: >-
            Specifies the compute configuration (e.g., CPU/GPU size) to be used
            for the component.
          title: Flavor
          type: string
        parameter_overrides:
          additionalProperties:
            $ref: '#/components/schemas/ParameterOverride'
          description: Map of parameter overrides for customization
          title: Parameter Overrides
          type: object
        scale:
          $ref: '#/components/schemas/Scale'
          description: Scaling parameters of the component
      required:
        - flavor
        - exposed
        - scale
      title: CreateComponentConfiguration
      type: object
    ParameterOverride:
      properties:
        value:
          description: New value assigned to the overridden parameter
          title: Value
          type: string
      required:
        - value
      title: ParameterOverride
      type: object
    Scale:
      properties:
        max:
          description: Maximum number of replicas the component can be scaled up to
          title: Max
          type: integer
        min:
          description: Minimum number of replicas the component can be scaled down to
          title: Min
          type: integer
      required:
        - min
        - max
      title: Scale
      type: object
  securitySchemes:
    APIKey:
      description: >-
        API key for authentication. Make sure to include the word `apikey`,
        followed by a single space and then your token.

        Example: `apikey 1234$abcdef`
      type: apiKey
      in: header
      name: Authorization

````
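
The spec states that the response contains task IDs to poll until each task reaches `FINISHED` or `ERROR`, and that the `Authorization` header carries the word `apikey`, a single space, and the token. The sketch below illustrates both under stated assumptions: `auth_header` and `wait_for_tasks` are hypothetical helpers, `fetch_state` is a caller-supplied function that would wrap the task status request, and the `RUNNING` value in the demo is only a placeholder for any non-terminal state.

```python
import time

# Terminal task states named in the TaskIDsSerializer description.
TERMINAL_STATES = ("FINISHED", "ERROR")


def auth_header(token):
    """Build the Authorization header: `apikey`, one space, then the token."""
    return {"Authorization": f"apikey {token}"}


def wait_for_tasks(task_ids, fetch_state, interval=2.0, timeout=600.0,
                   sleep=time.sleep):
    """Poll every task until it is FINISHED or ERROR; return {task_id: state}.

    `fetch_state` is a hypothetical callable mapping a task ID to its current
    state string. Any state outside TERMINAL_STATES counts as still running.
    """
    deadline = time.monotonic() + timeout
    pending = set(task_ids)
    results = {}
    while pending:
        for task_id in sorted(pending):
            state = fetch_state(task_id)
            if state in TERMINAL_STATES:
                results[task_id] = state
        pending.difference_update(results)
        if pending:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"tasks still running: {sorted(pending)}")
            sleep(interval)
    return results


# Demo with a fake state source standing in for real HTTP calls.
fake_states = {
    "d478ae29-dedc-4869-82f0-96104425f565": iter(["RUNNING", "RUNNING", "FINISHED"]),
}
result = wait_for_tasks(
    list(fake_states),
    lambda task_id: next(fake_states[task_id]),
    sleep=lambda _seconds: None,  # skip real waiting in the demo
)
print(auth_header("1234$abcdef"))
print(result)
```

Passing the state lookup in as a function keeps the polling loop independent of any particular HTTP client. In real use, `fetch_state` would send the headers from `auth_header` with the task status request and return the state field of the response. A task that ends in `ERROR` is returned rather than raised, so the caller decides how to handle a failed deployment.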