If you want to run a custom AI model on Gcore inference nodes, refer to the following guidelines.
While we don’t impose specific requirements on model packaging or its dependencies, this guide gives you a general understanding of the key steps to complete before deploying your model via our Customer Portal.
Your model should be properly trained, validated, and tested.
Fine-tuning an AI model is essential to ensure that it makes accurate and reliable predictions.
To ensure consistent deployment of your model across different environments, you need to package the model into a container image.
There are no specific requirements for building a container image or its dependencies. You only need to ensure that it’s compliant with the image registry standards. For example, if you’re using Docker, you need to prepare a Dockerfile with your AI model.
If you need more general information about Docker and its setup for running AI models, read the Docker guide for AI development and deployment.
Here’s an example of a Dockerfile configuration:
# Set the base image your model is built from
FROM python:3.11-slim
# Set the working directory inside a container
WORKDIR /app
# Set environment variables
ENV USE_TORCH=1
ENV NVIDIA_VISIBLE_DEVICES=all
ENV CLI_ARGS=""
# Copy and install any required dependencies
COPY requirements.txt /app/requirements.txt
RUN pip install -r requirements.txt
# Install the Python packages used by the inference script
RUN pip3 install onnxruntime flask
# Copy the model into the container
COPY my_model.onnx /app/my_model.onnx
# Copy your inference script
COPY inference.py /app/inference.py
# Expose the port the app runs on
EXPOSE 5000
# Run the application
CMD ["python3", "inference.py"]
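The Dockerfile above references an inference.py script that loads the model and serves predictions on port 5000. As a rough sketch of what such a script could look like, here’s a minimal Flask server for an ONNX model with a single input; the /predict route, payload shape, and file paths are illustrative assumptions, not Gcore requirements:
# inference.py: minimal example server for the model copied in the Dockerfile above
import numpy as np
import onnxruntime as ort
from flask import Flask, jsonify, request

app = Flask(__name__)

# Load the ONNX model once at startup and look up its input name
session = ort.InferenceSession("/app/my_model.onnx")
input_name = session.get_inputs()[0].name

@app.route("/predict", methods=["POST"])
def predict():
    # Assumes a JSON payload such as {"inputs": [[1.0, 2.0, 3.0]]};
    # the expected shape and dtype depend on your model
    data = np.array(request.json["inputs"], dtype=np.float32)
    outputs = session.run(None, {input_name: data})
    return jsonify({"outputs": outputs[0].tolist()})

if __name__ == "__main__":
    # Listen on all interfaces on the port exposed in the Dockerfile
    app.run(host="0.0.0.0", port=5000)
Once the container is running, you can verify the endpoint with a request such as curl -X POST http://localhost:5000/predict -H "Content-Type: application/json" -d '{"inputs": [[1.0, 2.0, 3.0]]}' (adjust the payload to your model’s input shape).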
For a full list of Dockerfile requirements and supported syntax, check the official Docker documentation.
The image with your AI model must be built for the x86-64 (AMD64) architecture. Apart from this compatibility requirement, we have no specific constraints on the structure or organization of your container image.
The following steps demonstrate how to build, tag, and push a Docker image (a complete command sequence follows the list):
1. Build the image: docker build -t my-image . If you’re building the image on an Apple Silicon machine, run docker buildx build --platform linux/amd64 instead of docker build so that the image targets x86-64.
2. Tag the image: docker image tag SOURCE_IMAGE[:TAG] TARGET_IMAGE[:TAG]
3. Push the image to the registry: docker push my-username/my-image
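Put together, the full sequence might look like this; my-image and my-username are placeholder values, and your registry may require its hostname as a prefix in the image tag:
# Build the image (use docker buildx build --platform linux/amd64 on Apple Silicon)
docker build -t my-image .
# Tag the image with your repository name
docker image tag my-image my-username/my-image:latest
# Push the tagged image to the registry
docker push my-username/my-image:latest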
After you’ve built and pushed a Docker image with your AI model, deploy it on edge inference nodes in the Gcore Customer Portal.