Ollama API
Generate Embeddings

Overview

POST /api/embed
Generate embeddings from a model

Parameters

model: name of the model to generate embeddings from
input: text or list of text to generate embeddings for

Advanced parameters:

truncate: truncates the end of each input to fit within the context length. Returns an error if false and the context length is exceeded. Defaults to true
options: additional model parameters listed in the documentation for the Modelfile, such as temperature
keep_alive: controls how long the model will stay loaded in memory following the request (default: 5m)
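As a minimal sketch of the request shape, the parameters above can be assembled into a JSON body and posted to /api/embed. The model name `all-minilm` is an assumption here; substitute any embedding-capable model you have pulled locally.

```python
import json
import urllib.request

# Hypothetical request against a local Ollama server.
# "all-minilm" is an assumed model name, not prescribed by the API.
payload = {
    "model": "all-minilm",
    "input": "Why is the sky blue?",  # a string, or a list of strings
    "truncate": True,                 # default; if False, over-length input errors
    "keep_alive": "5m",               # default load duration
}

req = urllib.request.Request(
    "http://localhost:11434/api/embed",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Sending requires a running Ollama server:
# resp = urllib.request.urlopen(req)
# embeddings = json.loads(resp.read())["embeddings"]
```

To embed several texts in one call, pass a list for `input` (e.g. `["first text", "second text"]`); the advanced `options` field, if needed, takes a nested object such as `{"temperature": 0}`.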