Ollama API

Generate a chat completion

Overview

POST /api/chat
Generate the next message in a chat with a provided model. This is a streaming endpoint, so there will be a series of responses. Streaming can be disabled using "stream": false. The final response object will include statistics and additional data from the request.
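For example, a minimal streaming request might look like the following (the model name llama3.2 is an assumption; substitute any model you have pulled locally):

curl http://localhost:11434/api/chat -d '{
  "model": "llama3.2",
  "messages": [
    { "role": "user", "content": "why is the sky blue?" }
  ]
}'

Each line of the streamed response is a JSON object; the final object has "done": true and carries the request statistics. Adding "stream": false to the body returns a single object instead.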

Parameters

model: (required) the model name
messages: the messages of the chat; this can be used to maintain conversation history
tools: a list of tools, in JSON, for the model to use if supported

The message object has the following fields:

role: the role of the message, either system, user, assistant, or tool
content: the content of the message
images (optional): a list of images to include in the message (for multimodal models such as llava)
tool_calls (optional): a list of tool calls, in JSON, that the model wants to make

Advanced parameters (optional), illustrated in the sketch after this list:

format: the format to return a response in. Can be json or a JSON schema.
options: additional model parameters listed in the documentation for the Modelfile, such as temperature
stream: if false, the response will be returned as a single response object rather than a stream of objects
keep_alive: controls how long the model will stay loaded in memory following the request (default: 5m)
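As a sketch of the advanced parameters used together (the model name and option values are illustrative, not prescriptive):

curl http://localhost:11434/api/chat -d '{
  "model": "llama3.2",
  "messages": [
    { "role": "user", "content": "Hello!" }
  ],
  "stream": false,
  "keep_alive": "10m",
  "options": {
    "temperature": 0.1,
    "seed": 42
  }
}'

Here options carries Modelfile parameters (temperature, seed), stream: false collapses the reply into one object, and keep_alive keeps the model in memory for ten minutes after the request.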

Structured outputs

Structured outputs are supported by providing a JSON schema in the format parameter. The model will generate a response that matches the schema. See the Chat request (Structured outputs) example below.
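As a sketch, a structured-output request could look like this (the schema and model name are illustrative assumptions):

curl http://localhost:11434/api/chat -d '{
  "model": "llama3.2",
  "messages": [
    { "role": "user", "content": "Tell me about Canada." }
  ],
  "stream": false,
  "format": {
    "type": "object",
    "properties": {
      "name": { "type": "string" },
      "capital": { "type": "string" },
      "languages": {
        "type": "array",
        "items": { "type": "string" }
      }
    },
    "required": ["name", "capital", "languages"]
  }
}'

The response's message content will be a JSON string conforming to the schema. Setting the temperature to 0 via options tends to make structured output more deterministic.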