Ollama API
  1. Generate a completion
Ollama API
  • Endpoints
  • Conventions
  • Generate a completion
    • Overview
    • Generate request (Streaming)
      POST
    • Request (No streaming)
      POST
    • Request (with suffix)
      POST
    • Request (Structured outputs)
      POST
    • Request (JSON mode)
      POST
    • Request (with images)
      POST
    • Request (Raw Mode)
      POST
    • Request (Reproducible outputs)
      POST
    • Generate request (With options)
      POST
    • Load a model
      POST
    • Unload a model
      POST
  • Generate a chat completion
    • Overview
    • Chat Request (Streaming)
      POST
    • Chat request (No streaming)
      POST
    • Chat request (Structured outputs)
      POST
    • Chat request (With History)
      POST
    • Chat request (with images)
      POST
    • Chat request (Reproducible outputs)
      POST
    • Chat request (with tools)
      POST
    • Load a model
      POST
    • Unload a model
      POST
  • Create a Model
    • Overview
    • Create a new model
      POST
    • Quantize a model
      POST
    • Create a model from GGUF
      POST
    • Create a model from a Safetensors directory
      POST
  • Check if a Blob Exists
    • Overview
  • Push a Blob
    • Overview
  • List Local Models
    • Overview
    • Examples
  • Show Model Information
    • Overview
    • Examples
  • Copy a Model
    • Overview
    • Examples
  • Delete a Model
    • Overview
    • Examples
  • Pull a Model
    • Overview
    • Examples
  • Push a Model
    • Overview
  • Generate Embeddings
    • Overview
    • Examples
    • Request (Multiple input)
  • List Running Models
    • Overview
    • Examples
  • Generate Embedding
    • Overview
    • Examples
  • Version
    • Overview
  1. Generate a completion

Request (JSON mode)

POST
http://localhost:11434/api/generate
Important
When format is set to json, the output will always be a well-formed JSON object. It's important to also instruct the model to respond in JSON.
Request Request Example
Shell
JavaScript
Java
Swift
curl --location --request POST 'http://localhost:11434/api/generate' \
--header 'Content-Type: application/json' \
--data-raw '{
    "model": "string",
    "prompt": "string",
    "format": "string",
    "stream": true
}'
Response Response Example
{
  "model": "llama3.2",
  "created_at": "2023-11-09T21:07:55.186497Z",
  "response": "{\n\"morning\": {\n\"color\": \"blue\"\n},\n\"noon\": {\n\"color\": \"blue-gray\"\n},\n\"afternoon\": {\n\"color\": \"warm gray\"\n},\n\"evening\": {\n\"color\": \"orange\"\n}\n}\n",
  "done": true,
  "context": [1, 2, 3],
  "total_duration": 4648158584,
  "load_duration": 4071084,
  "prompt_eval_count": 36,
  "prompt_eval_duration": 439038000,
  "eval_count": 180,
  "eval_duration": 4196918000
}

Request

Body Params application/json
model
string 
required
prompt
string 
required
format
string 
required
stream
boolean 
required
Examples

Responses

🟢200Success
application/json
Body
model
string 
required
created_at
string 
required
response
string 
required
done
boolean 
required
context
array[integer]
required
total_duration
integer 
required
load_duration
integer 
required
prompt_eval_count
integer 
required
prompt_eval_duration
integer 
required
eval_count
integer 
required
eval_duration
integer 
required
Previous
Request (Structured outputs)
Next
Request (with images)
Built with