How to Set Up an LLM Server with Debian and Ollama


In a previous article, we set up an old laptop with Debian as a server for remote access. Now it’s time to set up a large language model (LLM) server using Debian and Ollama, and configure Docker to manage everything efficiently.

1. Installing Docker

Docker lets us run applications in isolated containers, which keeps them separate from the rest of the system and easy to manage.

We are going to install it to run the web interface we will use later.

First, install the dependencies with:

sudo apt -qy install software-properties-common apt-transport-https ca-certificates curl gnupg lsb-release

Next, add the Docker GPG key with:

sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/debian/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg

Then, configure the Docker repositories so that apt (Debian’s package manager) knows where to install it from:

echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/debian $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

Once this is done, simply update the repository information with:

sudo apt -qy update

And install Docker with:

sudo apt -qy install docker-ce docker-ce-cli containerd.io docker-compose-plugin
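
Before moving on, it may be worth confirming that the installation succeeded and that the Docker service is running; a quick check could look like this:

sudo docker --version
sudo systemctl status docker --no-pager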

To make everyday use more convenient, we will allow running Docker commands without prefixing them with sudo. To do that, add your user to the docker group with:

sudo usermod -aG docker $USER

There is no need to substitute anything here: the shell expands $USER to the name of the currently logged-in user. Keep in mind that the group change only takes effect after you log out and log back in.
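
Once you have done that (or run newgrp docker in the current shell), you could test it with Docker's hello-world image:

docker run --rm hello-world

If the greeting appears without sudo, the group change is working.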

2. Installing Ollama

Ollama is the tool we will use to manage the LLMs on our server. The installation is straightforward and is done with a single command:

curl -fsSL https://ollama.com/install.sh | sh

This will download and install Ollama, making it ready to manage the models on our server.
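
On Debian, the install script normally registers Ollama as a systemd service, so a quick way to confirm that it is installed and running might be:

ollama --version
systemctl status ollama --no-pager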

I know there is a Docker-based installation, but I haven’t had the chance to look into it yet. However, I will update this article in the future to add that alternative.

3. Enabling Connections from the Local Network

To access Ollama from other devices on the local network, we need to modify its configuration to accept external connections.

First, make a backup!

sudo cp /etc/systemd/system/ollama.service /etc/systemd/system/ollama.service.bak

Then open it with nano:

sudo nano /etc/systemd/system/ollama.service

Find the line that says [Service] and just below it, add the following lines:

Environment="OLLAMA_HOST=0.0.0.0"
Environment="OLLAMA_ORIGINS=*"

These lines might already exist, either commented out or with different values. What matters is that the file ends up containing them exactly as shown above.

With this, Ollama will accept connections from any origin, facilitating remote access to the models from any device on the local network.
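
For the new settings to take effect, reload systemd and restart the Ollama service:

sudo systemctl daemon-reload
sudo systemctl restart ollama

Afterwards, you can check from another device on the network that the server is reachable (replace the IP with your server's address); it should answer with "Ollama is running":

curl http://192.168.1.100:11434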

4. Installing Open WebUI

Open WebUI is the graphical interface that makes it easy to interact with Ollama from any browser. Install it using Docker:

sudo docker run -d -p 3000:8080 -e OLLAMA_BASE_URL=http://$(hostname -I | awk '{print $1}'):11434 -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main

This command takes care of everything necessary:

  • Configuring the container to start automatically (--restart always),
  • Pointing the interface to the address and port where Ollama is listening (OLLAMA_BASE_URL),
  • And publishing port 3000 on the server, where the interface will accept requests.

With this, you can access the web interface at http://192.168.1.100:3000 (replacing 192.168.1.100 with the fixed IP you configured for your server).
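
If the page does not load, a sensible first step is to check the container's status and logs:

sudo docker ps --filter name=open-webui
sudo docker logs open-webui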

5. Installing LLMs

Everything is ready to install the models… so let’s install one!

All we need to do is execute the following command:

ollama pull <model>

You can see which models are available for Ollama on their website.

You will need to replace <model> with the LLM you want to install, and that’s it.
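
For example, to download a model and try it from the terminal (the model name here is only an illustration; pick whichever one you prefer from the Ollama library):

ollama pull llama3.2
ollama list
ollama run llama3.2

Once downloaded, the model should also appear in the Open WebUI model selector.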

Problems I Have Encountered

During the configuration and use of Ollama, I encountered some problems. Below, I discuss the issues I faced and how I resolved them.

Error: no suitable llama server found

This error usually indicates that the model could not be loaded. Ollama uses the /tmp folder while loading models, so the first thing I did was check that there was enough free space there. After freeing up space (by deleting some files on the disk), the problem disappeared.
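
In case it is useful, this is roughly how you can check the free space in that folder:

df -h /tmp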

Conclusion

With these steps, we have set up a large language model (LLM) server using Debian, Ollama, and Docker. Now, any device on your local network can access the installed models through the web interface, making this environment ideal for testing and experimentation.

It also helps bring these technologies closer to the whole family while keeping everything on your own local network.

I hope you enjoy exploring the potential of LLMs on your own server!