Documentation ¶
Overview ¶
Package llm provides a minimal OpenAI-compatible client with chat, completion, and embedding helpers.
Index ¶
- Variables
- func IsRetryableError(err error) bool
- func StripThinking(s string) string
- type APIError
- type ApproxTokenCounter
- type ChatCompletionRequest
- type ChatMessage
- type ChatResponse
- type ChatResponseIterator
- type ChatSession
- func (s *ChatSession) ContextUsed() ContextUsage
- func (s *ChatSession) NewChat(opts ...SessionOpt) *ChatSession
- func (s *ChatSession) Send(ctx context.Context, req ChatCompletionRequest) (*ChatResponse, error)
- func (s *ChatSession) SendStreaming(ctx context.Context, req ChatCompletionRequest) (ChatResponseIterator, error)
- type Client
- func (*Client) Close() error
- func (c *Client) Embed(ctx context.Context, req EmbedRequest) (*EmbedResponse, error)
- func (c *Client) EmbedBatch(ctx context.Context, req EmbedBatchRequest) (*EmbedBatchResponse, error)
- func (c *Client) GenerateCompletion(ctx context.Context, req CompletionRequest) (string, error)
- func (c *Client) ListModels(ctx context.Context) ([]string, error)
- type CompletionRequest
- type ContextUsage
- type EmbedBatchRequest
- type EmbedBatchResponse
- type EmbedRequest
- type EmbedResponse
- type Option
- type SessionOpt
- type TokenCounter
Constants ¶
This section is empty.
Variables ¶
Functions ¶
func IsRetryableError ¶
IsRetryableError reports whether the error is retryable. It handles common HTTP status codes and network timeouts.
func StripThinking ¶
func StripThinking(s string) string
Types ¶
type ApproxTokenCounter ¶
type ApproxTokenCounter struct{}
ApproxTokenCounter estimates token usage by assuming roughly one token corresponds to four runes.
func (ApproxTokenCounter) Count ¶
func (ApproxTokenCounter) Count(msgs ...openai.ChatCompletionMessageParamUnion) int
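The four-runes-per-token heuristic can be sketched on a plain string without the openai message types; the real Count operates on ChatCompletionMessageParamUnion values, and rounding up here is an assumption, not necessarily what the package does.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// approxTokens mirrors the four-runes-per-token estimate on a plain string,
// rounding up so short non-empty strings still count as one token.
func approxTokens(s string) int {
	return (utf8.RuneCountInString(s) + 3) / 4
}

func main() {
	fmt.Println(approxTokens("Hello, how are you today?")) // 7
}
```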
type ChatCompletionRequest ¶
type ChatMessage ¶
type ChatMessage = openai.ChatCompletionMessageParamUnion
func TruncateHistory ¶
func TruncateHistory(tc TokenCounter, msgs []ChatMessage, limit int) []ChatMessage
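A plausible shape for tail-keeping truncation, sketched over plain strings: keep the longest suffix of messages whose total token cost fits the limit. This assumes TruncateHistory favors the most recent messages; the actual function works on []ChatMessage with a TokenCounter and may differ in edge cases (for example, how it treats a system prompt).

```go
package main

import "fmt"

// truncateTail keeps the longest suffix of msgs whose total cost fits limit.
// count stands in for a TokenCounter over a single message.
func truncateTail(count func(string) int, msgs []string, limit int) []string {
	total := 0
	for i := len(msgs) - 1; i >= 0; i-- {
		total += count(msgs[i])
		if total > limit {
			return msgs[i+1:] // drop everything before the first message that fits
		}
	}
	return msgs
}

func main() {
	count := func(s string) int { return len(s) } // toy counter: 1 token per byte
	fmt.Println(truncateTail(count, []string{"aaaa", "bb", "cc"}, 5)) // [bb cc]
}
```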
type ChatResponse ¶
ChatResponse is a non-streaming chat response.
type ChatResponseIterator ¶
type ChatResponseIterator iter.Seq2[ChatResponse, error]
ChatResponseIterator is a streaming sequence of chat responses.
type ChatSession ¶
type ChatSession struct {
// contains filtered or unexported fields
}
ChatSession represents a single conversational context. It is not safe for concurrent use; create a separate ChatSession per goroutine or protect calls with a mutex.
func NewChat ¶
func NewChat(c *Client, systemPrompt string, opts ...SessionOpt) *ChatSession
NewChat creates a new chat session with optional system prompt.
func (*ChatSession) ContextUsed ¶
func (s *ChatSession) ContextUsed() ContextUsage
ContextUsed reports the session's current context usage: the number of tokens used and the maximum available.
func (*ChatSession) NewChat ¶
func (s *ChatSession) NewChat(opts ...SessionOpt) *ChatSession
func (*ChatSession) Send ¶
func (s *ChatSession) Send(ctx context.Context, req ChatCompletionRequest) (*ChatResponse, error)
Send sends user messages and returns a response. The assistant's reply is appended to the internal history.
func (*ChatSession) SendStreaming ¶
func (s *ChatSession) SendStreaming(ctx context.Context, req ChatCompletionRequest) (ChatResponseIterator, error)
SendStreaming sends user messages and returns a streaming response iterator. The assistant's full reply is added to history after streaming completes.
type Client ¶
type Client struct {
// contains filtered or unexported fields
}
Client implements an OpenAI API-compatible client.
func (*Client) Embed ¶
func (c *Client) Embed(ctx context.Context, req EmbedRequest) (*EmbedResponse, error)
Embed returns the embedding for a single input.
func (*Client) EmbedBatch ¶
func (c *Client) EmbedBatch(ctx context.Context, req EmbedBatchRequest) (*EmbedBatchResponse, error)
EmbedBatch returns embeddings for multiple inputs.
func (*Client) GenerateCompletion ¶
func (c *Client) GenerateCompletion(ctx context.Context, req CompletionRequest) (string, error)
GenerateCompletion creates a single-turn completion from a prompt.
type CompletionRequest ¶
type ContextUsage ¶
type ContextUsage struct{ Used, Max int }
type EmbedBatchRequest ¶
EmbedBatchRequest contains multiple inputs to embed with a model.
type EmbedBatchResponse ¶
type EmbedBatchResponse struct {
Vectors [][]float64
Usage *openai.CreateEmbeddingResponseUsage
}
type EmbedRequest ¶
EmbedRequest specifies a model and input string for embedding.
type EmbedResponse ¶
type EmbedResponse struct {
Vector []float64
Usage *openai.CreateEmbeddingResponseUsage
}
type Option ¶
type Option func(*config)
Option configures the OpenAI client.
func WithTemperature ¶
WithTemperature sets the LLM completion temperature.
type SessionOpt ¶
type SessionOpt func(*ChatSession)
func WithDefaultContextLength ¶
func WithDefaultContextLength(l int) SessionOpt
WithDefaultContextLength sets the maximum context length (in tokens) for a session.
func WithSessionLogger ¶
func WithSessionLogger(logger *slog.Logger) SessionOpt
WithSessionLogger sets a custom slog.Logger for the session.
func WithSessionTemperature ¶
func WithSessionTemperature(t *float64) SessionOpt
WithSessionTemperature sets the LLM completion temperature for the session.
func WithTokenCounter ¶
func WithTokenCounter(tc TokenCounter) SessionOpt
WithTokenCounter sets a custom TokenCounter for estimating token usage.
type TokenCounter ¶
type TokenCounter interface {
Count(msgs ...openai.ChatCompletionMessageParamUnion) int
}
TokenCounter reports the number of tokens in a set of messages.