Chat completion

Methods

ChatCompletion

Calls the /v1/chat/completions endpoint.

Arguments:

  • ctx: context.Context
  • req: *ChatCompletionRequest

Returns: (*ChatCompletionResponse, error)

ChatCompletionStream

Calls the /v1/chat/completions endpoint with streaming enabled.

Arguments:

  • ctx: context.Context
  • req: *ChatCompletionRequest

Returns: (<-chan *CompletionChunk, error)

Request

ChatCompletionRequest

The request object for chat completion. It embeds CompletionConfig.

Fields:

  • Model (string): ID of the model to use.
  • Messages ([]ChatMessage): The list of messages for the conversation.
  • Tools ([]Tool): A list of tools the model may call.
  • MaxTokens (int): Maximum number of tokens to generate. The token count of the prompt plus MaxTokens cannot exceed the model's context length.
  • Temperature (float64): Sampling temperature. We recommend a value between 0.0 and 0.7.
  • TopP (float64): Nucleus sampling threshold. Defaults to 1.0.
  • ResponseFormat (*ResponseFormat): Specifies the format of the output (Text, JSON Object, or JSON Schema).
  • ToolChoice (ToolChoiceType): Controls which (if any) tool is called by the model.
  • ParallelToolCalls (bool): Whether to enable parallel function calling. Defaults to true.
  • FrequencyPenalty (float64): Penalizes repetition based on frequency.
  • PresencePenalty (float64): Penalizes repetition based on presence.
  • N (int): Number of completions to return for each request.
  • PromptMode (string): Toggles between the reasoning mode and no system prompt.
  • RandomSeed (int): Seed for deterministic results.
  • SafePrompt (bool): Whether to inject a safety prompt before all conversations.
  • Stop ([]string): Stop generation if these tokens are detected.
  • Stream (bool): Whether to stream back partial progress.
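The MaxTokens constraint above is simple arithmetic: the prompt's token count plus MaxTokens must fit in the model's context length. The helper below is a hypothetical illustration of that budget check, not a library function, and the numbers in main are made up.

```go
package main

import "fmt"

// clampMaxTokens returns the largest value that satisfies
// promptTokens + MaxTokens <= contextLen, capped at the requested value.
// It returns 0 when the prompt alone already fills the context window.
func clampMaxTokens(promptTokens, requested, contextLen int) int {
	remaining := contextLen - promptTokens
	if remaining <= 0 {
		return 0
	}
	if requested > remaining {
		return remaining
	}
	return requested
}

func main() {
	// Illustrative numbers only; real context lengths depend on the model.
	fmt.Println(clampMaxTokens(7000, 2000, 8192))
}
```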

NewChatCompletionRequest

Creates a new ChatCompletionRequest.

Arguments:

  • model: string
  • messages: []ChatMessage
  • ...opts: ChatCompletionRequestOption

NewChatCompletionStreamRequest

Creates a new ChatCompletionRequest with streaming enabled.

Arguments:

  • model: string
  • messages: []ChatMessage
  • ...opts: ChatCompletionRequestOption

Options

WithResponseTextFormat

Ensures the response is formatted as text. This is the default format.

Arguments: none

WithResponseJsonSchema

Ensures the response follows the specified JSON schema.

Arguments: PropertyDefinition

WithResponseJsonObjectFormat

Ensures the response is a JSON object.

Arguments: none

WithTools

Enables the model to call the specified tools. Sets ToolChoice to Auto.

Arguments: []Tool

WithToolChoice

Controls which (if any) tool is called by the model.

Arguments: ToolChoiceType

WithStreaming

Enables streaming back partial progress.

Arguments: none
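The constructors and options above follow the shape of Go's functional options pattern. The sketch below reproduces that pattern with local stand-in types so it runs without the client library. The assumptions are: ChatCompletionRequestOption is a function that mutates the request, the Tool and ChatMessage shapes, the ToolChoiceAuto constant, and the model ID are all placeholders.

```go
package main

import "fmt"

// Local stand-ins for the documented types. Field sets are trimmed, and the
// shapes of Tool and ChatMessage are hypothetical.
type (
	Tool           struct{ Name string }
	ChatMessage    struct{ Role, Content string }
	ToolChoiceType string
)

// ToolChoiceAuto is an assumed constant name and value.
const ToolChoiceAuto ToolChoiceType = "auto"

type ChatCompletionRequest struct {
	Model      string
	Messages   []ChatMessage
	Tools      []Tool
	ToolChoice ToolChoiceType
	Stream     bool
}

// ChatCompletionRequestOption is assumed to be a function that mutates the request.
type ChatCompletionRequestOption func(*ChatCompletionRequest)

// WithTools attaches tools and, as documented, sets ToolChoice to Auto.
func WithTools(tools []Tool) ChatCompletionRequestOption {
	return func(r *ChatCompletionRequest) {
		r.Tools = tools
		r.ToolChoice = ToolChoiceAuto
	}
}

// WithToolChoice sets ToolChoice explicitly.
func WithToolChoice(c ToolChoiceType) ChatCompletionRequestOption {
	return func(r *ChatCompletionRequest) { r.ToolChoice = c }
}

// WithStreaming turns on the Stream flag.
func WithStreaming() ChatCompletionRequestOption {
	return func(r *ChatCompletionRequest) { r.Stream = true }
}

// NewChatCompletionRequest builds the request, then applies options in order.
func NewChatCompletionRequest(model string, messages []ChatMessage, opts ...ChatCompletionRequestOption) *ChatCompletionRequest {
	req := &ChatCompletionRequest{Model: model, Messages: messages}
	for _, opt := range opts {
		opt(req)
	}
	return req
}

func main() {
	req := NewChatCompletionRequest(
		"example-model", // placeholder model ID
		[]ChatMessage{{Role: "user", Content: "What is the weather in Paris?"}},
		WithTools([]Tool{{Name: "get_weather"}}),
		WithStreaming(),
	)
	fmt.Println(req.ToolChoice, req.Stream, len(req.Tools))
}
```

In this sketch options are applied in the order given, so passing WithToolChoice after WithTools overrides the Auto default. Whether the real library applies options in the same order is not confirmed here.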

Response

ChatCompletionResponse

Fields:

  • Choices ([]ChatCompletionChoice): List of completions.
  • Created (time.Time): Creation timestamp.
  • Id (string): Unique ID for the completion.
  • Model (string): Model used for the completion.
  • Object (string): Object type.
  • Usage (UsageInfo): Token usage information.
  • Latency (time.Duration): Request latency.

AssistantMessage

Method on ChatCompletionResponse that returns the first assistant message in the choices, or nil if there are no assistant messages.
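Because the method can return nil, callers should check the result before dereferencing it. The sketch below uses local stand-in types; the Content field and the method body are assumptions written to match the behavior described above, not the library's source. replyText is a hypothetical helper.

```go
package main

import "fmt"

// Local stand-ins. Content is a hypothetical field name.
type AssistantMessage struct{ Content string }

type ChatCompletionChoice struct{ Message *AssistantMessage }

type ChatCompletionResponse struct{ Choices []ChatCompletionChoice }

// AssistantMessage returns the first non-nil assistant message, or nil.
// This body is a guess at the documented behavior.
func (r *ChatCompletionResponse) AssistantMessage() *AssistantMessage {
	for _, c := range r.Choices {
		if c.Message != nil {
			return c.Message
		}
	}
	return nil
}

// replyText extracts the reply text and reports whether a message was present.
func replyText(r *ChatCompletionResponse) (string, bool) {
	msg := r.AssistantMessage()
	if msg == nil {
		return "", false
	}
	return msg.Content, true
}

func main() {
	resp := &ChatCompletionResponse{
		Choices: []ChatCompletionChoice{{Message: &AssistantMessage{Content: "Bonjour"}}},
	}
	if text, ok := replyText(resp); ok {
		fmt.Println(text)
	}
}
```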

ChatCompletionChoice

Fields:

  • FinishReason (FinishReason): The reason why the model stopped generating tokens.
  • Index (int): Index of the choice.
  • Message (*AssistantMessage): The generated assistant message.

FinishReason

Possible values:

  • FinishReasonStop: Model reached a natural stop point or a provided stop sequence.
  • FinishReasonLength: Model reached the maximum number of tokens.
  • FinishReasonModelLength: Model reached its maximum context length.
  • FinishReasonError: An error occurred during generation.
  • FinishReasonToolCalls: Model is calling a tool.
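A caller will usually branch on the finish reason before using a choice. The sketch below assumes FinishReason is a string type; the constant names come from the list above, but the string values and the nextStep helper are placeholders for illustration.

```go
package main

import "fmt"

// FinishReason is assumed to be a string type; the values below are
// placeholders, not confirmed wire values.
type FinishReason string

const (
	FinishReasonStop        FinishReason = "stop"
	FinishReasonLength      FinishReason = "length"
	FinishReasonModelLength FinishReason = "model_length"
	FinishReasonError       FinishReason = "error"
	FinishReasonToolCalls   FinishReason = "tool_calls"
)

// nextStep maps a finish reason to what a caller would typically do next.
func nextStep(r FinishReason) string {
	switch r {
	case FinishReasonStop:
		return "use the message as-is"
	case FinishReasonLength, FinishReasonModelLength:
		return "treat the output as truncated"
	case FinishReasonToolCalls:
		return "run the requested tools and send back their results"
	case FinishReasonError:
		return "surface the failure"
	default:
		return "unknown finish reason"
	}
}

func main() {
	fmt.Println(nextStep(FinishReasonToolCalls))
}
```

Grouping FinishReasonLength with FinishReasonModelLength is a design choice of this sketch: both mean generation stopped before a natural end, though the remedy differs (raise MaxTokens in the first case, shorten the prompt in the second).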

Streaming

CompletionChunk

Represents a chunk of data received during streaming.

Fields:

  • Choices ([]CompletionResponseStreamChoice): List of choices in this chunk.
  • Created (time.Time): Creation timestamp.
  • Id (string): Unique ID.
  • Model (string): Model used.
  • Object (string): Object type.
  • Usage (UsageInfo): Token usage (usually only present in the last chunk).
  • IsLastChunk (bool): Indicates if this is the last chunk.
  • ChunkLatency (time.Duration): Latency of this specific chunk.
  • TotalLatency (time.Duration): Total latency of the request.
  • Error (error): Error if any occurred during streaming.

DeltaMessage

Method on CompletionChunk that returns the AssistantMessage delta for the first choice.