Running Large Language Models Locally with Ollama

Ollama is an open-source solution that enables users to run large language models (LLMs) directly on their personal computers. It supports a variety of open-source LLMs, such as Llama 3, DeepSeek R1, Mistral, Phi-4, and Gemma 2, allowing them to operate without an internet connection. This approach improves security, safeguards privacy, and offers complete control over model customization and performance optimization.

Key Features of Ollama

Ollama provides access to a curated model library, making it easy to find, download, and run LLMs locally. It also integrates with Open WebUI, which offers a user-friendly graphical interface for those who prefer not to use the command line. The platform runs on Linux, Windows, and macOS and executes models entirely on local hardware, removing the need for cloud-based APIs.

Setting Up Ollama and Running LLMs

This guide outlines the steps required to install Ollama and configure large language models (LLMs) with all necessary dependencies on a local workstation.

Downloading and Installing Ollama

Ollama is designed to run on Linux, macOS, and Windows, allowing users to install it seamlessly using the official release package or script. Follow the steps below to install the latest version of Ollama on your system.

Step 1: Open a Terminal

Begin by launching a new terminal session on your system.

Step 2: Install Ollama on Linux

To download and install Ollama on Linux, execute the following command:

$ curl -fsSL https://ollama.com/install.sh | sh

Step 3: Verify Installation

After installation, confirm that Ollama has been successfully installed by checking the version:
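$ ollama -v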

The output prints the installed Ollama version (for example, ollama version is 0.5.7; the exact number depends on the release you installed).

Step 4: List Available Models

To see all models available on your local machine, run:
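$ ollama list

If you have not pulled any models yet, the list is empty.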

Managing Ollama as a System Service on Linux

When installed on Linux, Ollama creates a system service called ollama.service to manage its operation. Follow these steps to check the service status and configure it to start at boot.

Check Ollama Service Status

To verify that Ollama is running, use:

$ sudo systemctl status ollama

Expected output:

● ollama.service - Ollama Service
     Loaded: loaded (/etc/systemd/system/ollama.service; enabled; preset: enabled)
     Active: active (running) since Wed 2025-02-26 13:33:41 UTC; 5min ago
   Main PID: 27138 (ollama)
      Tasks: 6 (limit: 2269)
     Memory: 32.2M (peak: 32.7M)
        CPU: 63ms
     CGroup: /system.slice/ollama.service
             └─27138 /usr/local/bin/ollama serve

Enable Ollama to Start at Boot

To configure Ollama to start automatically when your system boots up, execute:

$ sudo systemctl enable ollama

Restart Ollama Service

If necessary, restart the Ollama service using:

$ sudo systemctl restart ollama

Optional: Install AMD GPU ROCm Drivers for Ollama

For systems using AMD GPUs, download and install the ROCm-supported Ollama version:

$ curl -L https://ollama.com/download/ollama-linux-amd64-rocm.tgz -o ollama-linux-amd64-rocm.tgz
$ sudo tar -C /usr/ -xzf ollama-linux-amd64-rocm.tgz

Installing Ollama on macOS

To install Ollama on macOS:

  1. Visit the official Ollama website.
  2. Click “Download” and select the latest macOS package.
  3. Extract the downloaded .zip file.
  4. Move Ollama.app to the Applications folder.

Verify Installation

To confirm that Ollama is installed correctly, open a terminal and run:

$ ollama -v
$ ollama list
$ ollama serve

Installing Ollama on Windows

To install Ollama on Windows:

  1. Visit the official Ollama website.
  2. Download the latest .exe file.
  3. Run the installer and click “Install” to complete the setup.

Verify Installation

After installation, open Windows PowerShell and run:

> ollama -v
> ollama list
> ollama serve

Downloading Large Language Models (LLMs) with Ollama

Ollama allows users to fetch models using the ollama pull command. Follow these steps to download and run models locally.

Step 1: Pull a Model

Use the command below to fetch a model from the Ollama repository:
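$ ollama pull [model-name]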

Example: Download Mistral

To fetch the Mistral model, run:
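$ ollama pull mistral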

Example: Download DeepSeek-R1 with 1.5B Parameters

To retrieve the DeepSeek-R1-Distill-Qwen model, execute:

$ ollama pull deepseek-r1:1.5b

Example: Download Llama 3.3

Llama 3.3 is a large model (~40GB). Ensure sufficient storage before proceeding:
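$ ollama pull llama3.3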

Step 2: Verify Downloaded Models

To check which models have been downloaded, use:
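$ ollama list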

Example output:

NAME                ID              SIZE      MODIFIED
llama3.3:latest     a6eb4748fd29    42 GB     21 seconds ago
deepseek-r1:1.5b    a42b25d8c10a    1.1 GB    4 minutes ago
mistral:latest      f974a74358d6    4.1 GB    26 minutes ago

Using Ollama to Run AI Models

Ollama allows users to execute, pull, and initialize large language models directly from its repository or from locally stored models. Before running a model, ensure that your system meets the required hardware specifications. Follow the steps below to test models and analyze their performance on your workstation.

Step 1: List Available Models

Check all models currently installed on your machine by running the following command:
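$ ollama list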

Step 2: Run a Model

To execute a model, use the ollama run command. For example, to run the Qwen 2.5 instruct model with 1.5B parameters, a command along the following lines should work (check the Ollama model library for the exact tag; ollama run downloads the model automatically if it is not yet stored locally):
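$ ollama run qwen2.5:1.5b-instruct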

Step 3: Provide a Prompt

Once the model is running, enter a prompt.
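For example:

Prompt: Summarize the advantages of running large language models locally in two sentences.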

The model will generate a response in the terminal.

Step 4: Exit the Model

To exit Ollama, enter the following command:
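/bye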

Step 5: Run a Different Model

You can also execute a model that is already available on your workstation. For instance, if you previously downloaded the DeepSeek R1 model, you can run it with:

$ ollama run deepseek-r1:1.5b

Step 6: Enter a Prompt

For a test case, enter a mathematical challenge, such as:

Prompt: Generate a recursive fractal pattern description using only mathematical notations and symbolic logic.

Step 7: Observe the Model’s Response

The AI will process the request and output a structured mathematical description.

Okay, so I need to create a recursive fractal pattern using only mathematical notation and symbols. Hmm, that's an interesting challenge. Let me think about what I know about fractals and how they can be represented mathematically.
...

Step 8: Exit Ollama

To close the session, type:
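/bye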

Why Use Ollama to Run AI Models?

Different models have unique strengths and are designed for various tasks. Running models locally using Ollama allows users to benchmark their efficiency and effectiveness. The ollama run command enables you to execute available models instantly, while the ollama pull command fetches the latest versions from the official Ollama repository.

Managing Models with Ollama

Handling multiple large language models (LLMs) on your system requires effective management. Ollama provides commands to list, view details, stop, and remove models. Follow the steps below to manage models on your workstation.

Step 1: List Available Models

To see all models currently stored on your system, run:
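$ ollama list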

Step 2: Display Model Details

To view detailed information about a specific model, such as Llama 3.3, use:
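$ ollama show llama3.3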

Expected output:

 Model
   architecture        llama
   parameters          70.6B
   context length      131072
   embedding length    8192
   quantization        Q4_K_M

 Parameters
   stop    "<|start_header_id|>"
   stop    "<|end_header_id|>"
   stop    "<|eot_id|>"

 License
   LLAMA 3.3 COMMUNITY LICENSE AGREEMENT
   Llama 3.3 Version Release Date: December 6, 2024

Step 3: Stop a Running Model

If a model is actively running, stop it using the following command:

$ ollama stop [model-name]

For example, to stop the DeepSeek R1 model:

$ ollama stop deepseek-r1:1.5b

Step 4: Remove an Unused Model

To delete a model from your system, run:
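$ ollama rm [model-name]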

Ollama prints a short confirmation that the model has been deleted.

Setting Ollama Environment Variables

Ollama provides environment variables to fine-tune its behavior and optimize performance. Below are some commonly used variables:

  • OLLAMA_HOST: Sets the address and port the Ollama server binds to (default 127.0.0.1:11434).
  • OLLAMA_GPU_OVERHEAD: Reserves additional VRAM per GPU, in bytes, as overhead.
  • OLLAMA_MODELS: Sets a custom directory for storing downloaded models.
  • OLLAMA_KEEP_ALIVE: Controls how long a model stays loaded in memory after a request (default 5 minutes).
  • OLLAMA_DEBUG: Enables verbose debug logging.
  • OLLAMA_FLASH_ATTENTION: Enables flash attention to reduce memory usage with larger context sizes.
  • OLLAMA_NOHISTORY: Disables saving of the interactive prompt history.
  • OLLAMA_NOPRUNE: Prevents pruning of model blobs when the Ollama server starts.
  • OLLAMA_ORIGINS: Defines the allowed origins (CORS) for requests to the Ollama API, needed for remote or browser-based clients.

Configuring Ollama Variables on Linux

To set environment variables for Ollama on Linux:

Step 1: Open the Service File

$ sudo vim /etc/systemd/system/ollama.service

Step 2: Add Environment Variables

Insert the following lines under the [Service] section:

[Service]
Environment="OLLAMA_DEBUG=1"
Environment="OLLAMA_HOST=0.0.0.0:11434"

Step 3: Apply Changes

$ sudo systemctl daemon-reload
$ sudo systemctl restart ollama
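
With the service restarted and bound to 0.0.0.0:11434, you can optionally verify that the API is reachable by querying the /api/tags endpoint, which lists the locally installed models (replace localhost with the server's IP address when testing from another machine):

$ curl http://localhost:11434/api/tags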

Setting Ollama Variables on macOS

To configure environment variables on macOS:

$ launchctl setenv OLLAMA_HOST "0.0.0.0"
$ ollama serve

Setting Ollama Variables on Windows

To configure variables on Windows:

  1. Open the Windows search menu and search for “Environment Variables”.
  2. Select “Edit the system environment variables”.
  3. Click “Environment Variables”.
  4. Click “New” to create a new entry.
  5. Enter the variable name (for example, OLLAMA_HOST) and its value.
  6. Click “OK” to save the entry.
  7. Click “OK” again to close the dialogs, then restart Ollama so the new variables take effect.

Conclusion

You have successfully installed and configured Ollama to run large language models on your local system. Whether running models locally or on a remote machine, Ollama provides an efficient solution. Use environment variables to enhance performance and allow remote access. For further information, refer to the Ollama GitHub repository.

Source: vultr.com
