
Ollama

Ollama allows you to run open-source large language models, such as Llama 3.1, locally.

Ollama bundles model weights, configuration, and data into a single package, defined by a Modelfile. It optimizes setup and configuration details, including GPU usage. For a complete list of supported models and model variants, see the Ollama model library.

See this guide for more details on how to use Ollama with LangChain.

Installation and Setup

Ollama installation

Follow these instructions to set up and run a local Ollama instance.

Ollama will start as a background service automatically. If this is disabled, run:

# export OLLAMA_HOST=127.0.0.1:11434 # optional: set the host and port the server binds to (127.0.0.1:11434 is the default)
ollama serve

After starting Ollama, run ollama pull <model_checkpoint> to download a model from the Ollama model library. For example:

ollama pull llama3.1
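To verify the download, you can list the models available locally:

ollama list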

We're now ready to install the langchain-ollama partner package and run a model.

Ollama LangChain partner package install

Install the integration package with:

pip install langchain-ollama

LLM

from langchain_ollama.llms import OllamaLLM
API Reference: OllamaLLM

See the notebook example here.
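A minimal sketch of invoking the LLM, assuming the llama3.1 model has been pulled locally:

from langchain_ollama.llms import OllamaLLM

# Point the LLM at a locally pulled model
llm = OllamaLLM(model="llama3.1")
print(llm.invoke("Why is the sky blue?"))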

Chat Models

Chat Ollama

from langchain_ollama.chat_models import ChatOllama
API Reference: ChatOllama

See the notebook example here.
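A minimal sketch of a single chat turn, again assuming llama3.1 is available locally:

from langchain_ollama.chat_models import ChatOllama

chat = ChatOllama(model="llama3.1")
response = chat.invoke([("human", "Summarize LangChain in one sentence.")])
print(response.content)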

Ollama tool calling

Ollama tool calling uses the OpenAI-compatible web server specification and works with the default BaseChatModel.bind_tools() methods, as described here. Make sure to select an Ollama model that supports tool calling.
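A minimal sketch of the bind_tools() flow, assuming llama3.1 (a tool-calling-capable model) is pulled locally; the add tool here is a hypothetical example:

from langchain_core.tools import tool
from langchain_ollama.chat_models import ChatOllama

@tool
def add(a: int, b: int) -> int:
    """Add two integers."""
    return a + b

# Bind the tool schema to the model so it can emit structured tool calls
llm_with_tools = ChatOllama(model="llama3.1").bind_tools([add])
msg = llm_with_tools.invoke("What is 2 + 3? Use the add tool.")
print(msg.tool_calls)  # e.g. [{'name': 'add', 'args': {'a': 2, 'b': 3}, ...}]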

Embedding models

from langchain_ollama import OllamaEmbeddings
API Reference: OllamaEmbeddings

See the notebook example here.
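A minimal sketch, using llama3.1 as the embedding model for illustration (any locally pulled embedding-capable model works):

from langchain_ollama import OllamaEmbeddings

embeddings = OllamaEmbeddings(model="llama3.1")
vector = embeddings.embed_query("Hello, world!")
print(len(vector))  # dimensionality depends on the model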

