How to deploy DeepSeek 70B with Ollama and a Web UI on Gcore Everywhere Inference

Large language models (LLMs) like DeepSeek 70B are revolutionizing industries by enabling more advanced and dynamic conversational AI solutions. Whether you’re looking to build intelligent customer support systems, enhance content generation, or create data-driven applications, deploying and interacting with LLMs has never been more accessible.

In this tutorial, we’ll show you exactly how to set up DeepSeek 70B using Ollama and a Web UI on Gcore Everywhere Inference. By the end, you’ll have a fully functional environment where you can easily interact with your custom LLM via a user-friendly interface. The process involves four steps: deploying Ollama, deploying the web UI, connecting the web UI to Ollama, and pulling the DeepSeek model.

Let’s get started!

Step 1: Deploy Ollama

  1. Log in to Gcore Everywhere Inference and select Deploy Custom Model.
  2. In the Model Image field, enter ollama/ollama.
  3. Set the Port to 11434.
  4. Under Pod Configuration, configure the following:
    • Select GPU-Optimized.
    • Choose a GPU type, such as 1×A100 or 1×H100.
    • Choose a region (e.g., Luxembourg-3).
  5. Set an autoscaling policy or use the default settings.
  6. Name your deployment (e.g., ollama).
  7. Click Deploy model on the right side of the screen.

Once deployed, you’ll have an Ollama endpoint ready to serve your model.
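
Before moving on, it’s worth confirming that the endpoint responds. Below is a minimal sketch in Python (assuming the requests library is installed); the URL is a placeholder, so substitute the endpoint shown for your deployment in the Gcore Customer Portal.

    import requests

    # Placeholder URL: replace with your deployment's endpoint from the
    # Gcore Customer Portal.
    OLLAMA_URL = "https://<your-ollama-deployment>.ai.gcore.dev"

    # A healthy Ollama server answers GET / with the plain-text message
    # "Ollama is running".
    response = requests.get(OLLAMA_URL, timeout=10)
    response.raise_for_status()
    print(response.text)  # expected: "Ollama is running"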

Step 2: Deploy the Web UI for Ollama

  1. Go back to the Gcore Everywhere Inference console and select Deploy Custom Model again.
  2. In the Model Image field, enter ghcr.io/open-webui/open-webui:main.
  3. Set the Port to 8080.
  4. Under Pod Configuration, set:
    • Select CPU-Optimized.
    • Choose 4 vCPU / 16 GiB RAM.
  5. Select the same region as before (e.g., Luxembourg-3).
  6. Configure an autoscaling policy or use the default settings.
  7. Name your deployment (e.g., webui).
  8. Click Deploy model on the right side of the screen.
  9. Once deployed, navigate to the Web UI endpoint from the Gcore Customer Portal. You can also confirm the UI is reachable with the quick check below.
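
For instance, a one-line status check (again a sketch in Python with requests, and a placeholder URL) tells you the UI pod is serving traffic:

    import requests

    # Placeholder URL: replace with your Web UI deployment's endpoint.
    WEBUI_URL = "https://<your-webui-deployment>.ai.gcore.dev"

    # Open WebUI serves its login page over HTTPS, so a 200 status
    # code means the pod is up and reachable.
    print(requests.get(WEBUI_URL, timeout=10).status_code)  # expected: 200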

Step 3: Configure the Web UI

  1. Open the Web UI endpoint and set up a username and password when prompted.
  2. Log in and navigate to the admin panel.
  3. Go to Settings → Connections and disable the OpenAI API integration.
  4. In the Ollama API field, enter the endpoint for your Ollama deployment. You can find this in the Gcore Customer Portal; it will look similar to this: https://<your-ollama-deployment>.ai.gcore.dev/.
  5. Click Save to confirm your changes. You can verify the connection with the sketch below.
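
To double-check that the endpoint you pasted is the right one, you can query Ollama’s model list directly. This is a sketch assuming Python with requests and a placeholder URL; /api/tags is Ollama’s standard endpoint for listing locally available models, and a fresh deployment returns an empty list.

    import requests

    # Placeholder URL: use the same endpoint you entered in the Web UI.
    OLLAMA_URL = "https://<your-ollama-deployment>.ai.gcore.dev"

    # GET /api/tags lists the models currently stored on the server.
    # Before any pull, the list is empty.
    tags = requests.get(f"{OLLAMA_URL}/api/tags", timeout=10).json()
    print(tags)  # e.g., {"models": []}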

Step 4: Pull and Use DeepSeek 70B

  1. Open the chat section in the Web UI.
  2. In the Select a model field, type deepseek-r1:70b.
  3. Click Pull to download the model.
  4. Wait for the download to complete.
  5. Once downloaded, select the model and start chatting! If you’d rather call the model programmatically, see the sketch below.
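
The Web UI is the main way to interact with the model, but the same Ollama endpoint also accepts API calls. Here’s a minimal sketch (Python with requests, a placeholder URL, and assuming the model has already been pulled) that sends a single chat message via Ollama’s /api/chat endpoint:

    import requests

    # Placeholder URL: your Ollama deployment's endpoint.
    OLLAMA_URL = "https://<your-ollama-deployment>.ai.gcore.dev"

    # /api/chat takes a message list; stream=False returns the whole
    # reply as one JSON object instead of a stream of chunks.
    payload = {
        "model": "deepseek-r1:70b",
        "messages": [
            {"role": "user", "content": "Summarize what an LLM is in one sentence."}
        ],
        "stream": False,
    }
    reply = requests.post(f"{OLLAMA_URL}/api/chat", json=payload, timeout=300).json()
    print(reply["message"]["content"])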

Your AI environment is ready to explore

By following these steps, you’ve successfully deployed DeepSeek 70B on Gcore Everywhere Inference with Ollama. This setup provides a powerful and user-friendly environment for experimenting with LLMs, prototyping AI-driven features, or integrating advanced conversational AI into your applications.

Ready to unlock the full potential of AI? Gcore Everywhere Inference offers outstanding scalability, performance, and support, making it the perfect solution for developers and businesses working with advanced AI models. Dive deeper into our powerful tools and resources by exploring our AI blog and docs.
