Ollama API
  1. Generate a chat completion
Ollama API
  • Endpoints
  • Conventions
  • Generate a completion
    • Overview
    • Generate request (Streaming)
      POST
    • Request (No streaming)
      POST
    • Request (with suffix)
      POST
    • Request (Structured outputs)
      POST
    • Request (JSON mode)
      POST
    • Request (with images)
      POST
    • Request (Raw Mode)
      POST
    • Request (Reproducible outputs)
      POST
    • Generate request (With options)
      POST
    • Load a model
      POST
    • Unload a model
      POST
  • Generate a chat completion
    • Overview
    • Chat Request (Streaming)
      POST
    • Chat request (No streaming)
      POST
    • Chat request (Structured outputs)
      POST
    • Chat request (With History)
      POST
    • Chat request (with images)
      POST
    • Chat request (Reproducible outputs)
      POST
    • Chat request (with tools)
      POST
    • Load a model
      POST
    • Unload a model
      POST
  • Create a Model
    • Overview
    • Create a new model
      POST
    • Quantize a model
      POST
    • Create a model from GGUF
      POST
    • Create a model from a Safetensors directory
      POST
  • Check if a Blob Exists
    • Overview
  • Push a Blob
    • Overview
  • List Local Models
    • Overview
    • Examples
  • Show Model Information
    • Overview
    • Examples
  • Copy a Model
    • Overview
    • Examples
  • Delete a Model
    • Overview
    • Examples
  • Pull a Model
    • Overview
    • Examples
  • Push a Model
    • Overview
  • Generate Embeddings
    • Overview
    • Examples
    • Request (Multiple input)
  • List Running Models
    • Overview
    • Examples
  • Generate Embedding
    • Overview
    • Examples
  • Version
    • Overview
  1. Generate a chat completion

Chat request (Reproducible outputs)

POST
http://localhost:11434/api/chat
Request Request Example
Shell
JavaScript
Java
Swift
curl --location --request POST 'http://localhost:11434/api/chat' \
--header 'Content-Type: application/json' \
--data-raw '{
    "model": "gpt-3.5-turbo",
    "messages": [
        {
            "role": "user",
            "content": "Hello!"
        },
        {
            "role": "assistant",
            "content": "Hello! I am AI assistant."
        }
    ],
    "options": {
        "seed": 42,
        "temperature": 0.7
    }
}'
Response Response Example
{
  "model": "llama3.2",
  "created_at": "2023-12-12T14:13:43.416799Z",
  "message": {
    "role": "assistant",
    "content": "Hello! How are you today?"
  },
  "done": true,
  "total_duration": 5191566416,
  "load_duration": 2154458,
  "prompt_eval_count": 26,
  "prompt_eval_duration": 383809000,
  "eval_count": 298,
  "eval_duration": 4799921000
}

Request

Body Params application/json
model
string 
required
Example:
gpt-3.5-turbo
messages
array [object {2}] 
required
Example:
[{"role":"user","content":"Hello!"},{"role":"assistant","content":"Hello! I am AI assistant."}]
role
string 
required
Example:
user
content
string 
required
options
object 
optional
Example:
{"seed":42,"temperature":0.7}
seed
integer 
optional
Random seed
Example:
42
temperature
number 
optional
Generation temperature
Example:
0.7
Examples

Responses

🟢200Success
application/json
Body
model
string 
required
created_at
string 
required
message
object 
required
role
string 
required
content
string 
required
done
boolean 
required
total_duration
integer 
required
load_duration
integer 
required
prompt_eval_count
integer 
required
prompt_eval_duration
integer 
required
eval_count
integer 
required
eval_duration
integer 
required
Modified at 2025-03-28 02:18:20
Previous
Chat request (with images)
Next
Chat request (with tools)
Built with