Deploy Hugging Face AI models on Gcore edge inference nodes to easily set up, manage, and serve your AI-powered solutions in real time.
Whether you're working on natural language processing, computer vision, or other AI tasks, edge inference significantly improves the response times and scalability of your workloads.
To get started, prepare the model on Hugging Face:
1. In the Hugging Face models catalog, select the model you want to deploy for your edge inference solution.
2. Navigate to a space where the model is used. For example, for the Pixtral-12B-2409 model, select any of its associated spaces.
3. Copy the Docker image link and startup command according to the instructions in the official Hugging Face guide. The sketch below shows how the image tag used in this guide relates to the space name.
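For context, the Docker tag of a public Hugging Face space generally follows the pattern registry.hf.space/<owner>-<space-name>:latest, with the space ID lowercased and the slash replaced by a dash. The snippet below is a minimal sketch of that mapping; the space ID is inferred from this guide's example tag and is an assumption, so always copy the exact tag and startup command shown by Hugging Face rather than constructing them by hand.

```python
# Minimal sketch: how a space ID maps to its registry.hf.space Docker tag.
# The space ID below is inferred from this guide's example tag (an assumption);
# always copy the exact tag shown by Hugging Face.

def space_to_docker_tag(space_id: str, tag: str = "latest") -> str:
    """Turn an <owner>/<space-name> ID into a registry.hf.space image tag."""
    image = space_id.lower().replace("/", "-")
    return f"registry.hf.space/{image}:{tag}"

print(space_to_docker_tag("ethux/Mistral-Pixtral-Demo"))
# registry.hf.space/ethux-mistral-pixtral-demo:latest
```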
Next, deploy the model at the edge:
1. In the Gcore Customer Portal, click Inference at the Edge.
2. Click Deploy custom model.
3. Configure the model image:
In the Model image URL (docker tag) field, enter the following link: registry.hf.space/ethux-mistral-pixtral-demo:latest
Enable the Set startup command toggle and add the executable command: python app.py
4. Set the Container port to 7860, the default port for Gradio apps (see the optional local check after these steps).
5. Configure the pod:
Processor type: GPU-optimized
Flavor: 1x L40S / 16 vCPU / 232GiB RAM
6. In the Routing placement field, choose an available region; for the best performance, pick the one closest to your users.
7. Enter a name for your deployment.
8. Click Deploy.
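Optionally, if you have Docker installed locally, you can smoke-test the image and startup command from the steps above. The sketch below uses the Docker SDK for Python and the values from this guide; it is only a sanity check of the container configuration, since the image is large and the model itself needs a GPU (such as the L40S flavor selected above) to serve requests.

```python
import docker  # Docker SDK for Python (pip install docker); needs a local Docker daemon

# Optional smoke test: pull the image used in this guide and start it with the
# same startup command and port configured above. This only checks that the
# image and command are valid; real inference requires a GPU.
client = docker.from_env()

container = client.containers.run(
    "registry.hf.space/ethux-mistral-pixtral-demo:latest",
    command="python app.py",
    ports={"7860/tcp": 7860},  # Gradio's default port, matching the container port above
    detach=True,
)

print(container.logs(tail=50).decode())  # inspect startup output
container.stop()
container.remove()
```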
Once the model is up and running, you’ll get a link to its endpoint. Use this endpoint to test and interact with your model deployed at the edge.
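As a quick check, you can confirm the endpoint is serving before wiring it into your application. The sketch below uses a placeholder URL, so replace it with the link from the Customer Portal; it assumes the deployed image is a Gradio app (as in this example), which answers requests to its root path with the web UI.

```python
import requests

# Replace the placeholder with the endpoint link shown in the Customer Portal.
ENDPOINT = "https://<your-endpoint>"  # placeholder, not a real URL

# A running Gradio app returns HTTP 200 and its web UI on the root path.
response = requests.get(ENDPOINT, timeout=30)
print(response.status_code)

# Optional: for Gradio-based images like this one, the gradio_client package
# can list and call the app's API endpoints programmatically.
# from gradio_client import Client
# client = Client(ENDPOINT)
# client.view_api()
```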