Chat completion
Methods
ChatCompletion
Calls the /v1/chat/completions endpoint.
Arguments:
ctx:context.Contextreq:*ChatCompletionRequest
Returns: (*ChatCompletionResponse, error)
ChatCompletionStream
Calls the /v1/chat/completions endpoint with streaming enabled.
Arguments:
ctx:context.Contextreq:*ChatCompletionRequest
Returns: (<-chan *CompletionChunk, error)
Request
ChatCompletionRequest
The request object for chat completion. It embeds CompletionConfig.
Fields:
Model(string): ID of the model to use.Messages([]ChatMessage): The list of messages for the conversation.Tools([]Tool): A list of tools the model may call.MaxTokens(int): Maximum number of tokens to generate. The token count of your prompt plusmax_tokenscannot exceed the model's context length.Temperature(float64): Sampling temperature. We recommend between 0.0 and 0.7.TopP(float64): Nucleus sampling. Default to 1.0.ResponseFormat(*ResponseFormat): Specifies the format of the output (Text, JSON Object, or JSON Schema).ToolChoice(ToolChoiceType): Controls which (if any) tool is called by the model.ParallelToolCalls(bool): Whether to enable parallel function calling. Default totrue.FrequencyPenalty(float64): Penalizes repetition based on frequency.PresencePenalty(float64): Penalizes repetition based on presence.N(int): Number of completions to return for each request.PromptMode(string): Toggles between the reasoning mode and no system prompt.RandomSeed(int): Seed for deterministic results.SafePrompt(bool): Whether to inject a safety prompt before all conversations.Stop([]string): Stop generation if these tokens are detected.Stream(bool): Whether to stream back partial progress.
NewChatCompletionRequest
Creates a new ChatCompletionRequest.
Arguments:
model:stringmessages:[]ChatMessage...opts:ChatCompletionRequestOption
NewChatCompletionStreamRequest
Creates a new ChatCompletionRequest with streaming enabled.
Arguments:
model:stringmessages:[]ChatMessage...opts:ChatCompletionRequestOption
Options
WithResponseTextFormat
Ensures the response is formatted as text. This is the default format.
Arguments: none
WithResponseJsonSchema
Ensures the response follows the specified JSON schema.
Arguments: PropertyDefinition
WithResponseJsonObjectFormat
Ensures the response is a JSON object.
Arguments: none
WithTools
Enables the model to call the specified tools. Sets ToolChoice to Auto.
Arguments: []Tool
WithToolChoice
Controls which (if any) tool is called by the model.
Arguments: ToolChoiceType
WithStreaming
Enables streaming back partial progress.
Arguments: none
Response
ChatCompletionResponse
Fields:
Choices([]ChatCompletionChoice): List of completions.Created(time.Time): Creation timestamp.Id(string): Unique ID for the completion.Model(string): Model used for the completion.Object(string): Object type.Usage(UsageInfo): Token usage information.Latency(time.Duration): Request latency.
AssistantMessage
Method on ChatCompletionResponse that returns the first assistant message in the choices, or nil if there are no assistant messages.
ChatCompletionChoice
Fields:
FinishReason(FinishReason): The reason why the model stopped generating tokens.Index(int): Index of the choice.Message(*AssistantMessage): The generated assistant message.
FinishReason
Possible values:
FinishReasonStop: Model reached a natural stop point or a provided stop sequence.FinishReasonLength: Model reached the maximum number of tokens.FinishReasonModelLength: Model reached its maximum context length.FinishReasonError: An error occurred during generation.FinishReasonToolCalls: Model is calling a tool.
Streaming
CompletionChunk
Represents a chunk of data received during streaming.
Fields:
Choices([]CompletionResponseStreamChoice): List of choices in this chunk.Created(time.Time): Creation timestamp.Id(string): Unique ID.Model(string): Model used.Object(string): Object type.Usage(UsageInfo): Token usage (usually only present in the last chunk).IsLastChunk(bool): Indicates if this is the last chunk.ChunkLatency(time.Duration): Latency of this specific chunk.TotalLatency(time.Duration): Total latency of the request.Error(error): Error if any occurred during streaming.
DeltaMessage
Method on CompletionChunk that returns the AssistantMessage delta for the first choice.