LLM Providers

Each native LLM vendor is published as its own Go module under llm/. The client returned from a vendor's NewLLM(...) satisfies the llm.LLM interface, so once you've constructed it the rest of your code is vendor-agnostic.

Creating a client

import (
    llmopenai "github.com/joakimcarlsson/ai/llm/openai"
    "github.com/joakimcarlsson/ai/model"
)

client := llmopenai.NewLLM(
    llmopenai.WithAPIKey("your-api-key"),
    llmopenai.WithModel(model.OpenAIModels[model.GPT4o]),
    llmopenai.WithMaxTokens(1000),
)

For Anthropic instead:

import llmanthropic "github.com/joakimcarlsson/ai/llm/anthropic"

client := llmanthropic.NewLLM(
    llmanthropic.WithAPIKey("..."),
    llmanthropic.WithModel(model.AnthropicModels[model.Claude45Sonnet]),
    llmanthropic.WithMaxTokens(1000),
)

Sending messages

import "github.com/joakimcarlsson/ai/message"

response, err := client.SendMessages(ctx, []message.Message{
    message.NewUserMessage("Hello, how are you?"),
}, nil)
if err != nil {
    log.Fatal(err)
}
fmt.Println(response.Content)

Streaming

import "github.com/joakimcarlsson/ai/types"

stream := client.StreamResponse(ctx, messages, nil)

for event := range stream {
    switch event.Type {
    case types.EventContentDelta:
        fmt.Print(event.Content)
    case types.EventComplete:
        fmt.Printf("\nTokens: %d in / %d out\n",
            event.Response.Usage.InputTokens,
            event.Response.Usage.OutputTokens)
    case types.EventError:
        log.Fatal(event.Error)
    }
}
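Callers often want the full text once the stream ends, and want an error returned rather than the process terminated. The sketch below shows that pattern with local stand-in types: Event, EventType, and collect are hypothetical names, not the library's API, and the stream is assumed to be a channel. In real code the library's event type and the types.Event* constants would take their place.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// EventType and Event are local stand-ins for the library's stream events.
type EventType int

const (
	EventContentDelta EventType = iota
	EventComplete
	EventError
)

type Event struct {
	Type    EventType
	Content string
	Err     error
}

// collect drains the stream and concatenates content deltas. It returns the
// text received so far together with the first error event, if any.
func collect(stream <-chan Event) (string, error) {
	var sb strings.Builder
	for ev := range stream {
		switch ev.Type {
		case EventContentDelta:
			sb.WriteString(ev.Content)
		case EventError:
			return sb.String(), ev.Err
		case EventComplete:
			return sb.String(), nil
		}
	}
	// The channel closed without a completion or error event.
	return sb.String(), errors.New("stream closed before completion")
}

func main() {
	stream := make(chan Event, 3)
	stream <- Event{Type: EventContentDelta, Content: "Hello, "}
	stream <- Event{Type: EventContentDelta, Content: "world"}
	stream <- Event{Type: EventComplete}
	close(stream)

	text, err := collect(stream)
	fmt.Println(text, err)
}
```

Returning the partial text alongside the error lets the caller decide whether to show or discard what arrived before the failure.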

Multimodal (images)

imageData, err := os.ReadFile("image.png")
if err != nil {
    log.Fatal(err)
}

msg := message.NewUserMessage("What's in this image?")
msg.AddAttachment(message.Attachment{
    MIMEType: "image/png",
    Data:     imageData,
})

response, err := client.SendMessages(ctx, []message.Message{msg}, nil)
if err != nil {
    log.Fatal(err)
}
fmt.Println(response.Content)

Common options

Every vendor exports the standard set:

llmopenai.WithAPIKey("...")
llmopenai.WithModel(model.OpenAIModels[model.GPT4o])
llmopenai.WithMaxTokens(2000)
llmopenai.WithTemperature(0.7)
llmopenai.WithTopP(0.9)
llmopenai.WithTopK(40)
llmopenai.WithStopSequences("STOP", "END")
llmopenai.WithTimeout(30 * time.Second)
llmopenai.WithToolChoice(llm.ToolChoice{Mode: llm.ToolChoiceRequired})

WithToolChoice

WithToolChoice is shared by the OpenAI, Anthropic, and Gemini modules (OpenAI-compatible providers inherit it through llm/openai). It takes the vendor-neutral llm.ToolChoice type: Mode is ToolChoiceAuto (default), ToolChoiceNone, ToolChoiceRequired, or ToolChoiceSpecific with a Name. It maps to each provider's native field (tool_choice for OpenAI/Anthropic, toolConfig.functionCallingConfig for Gemini) and is emitted only when tools are supplied. ToolChoiceSpecific with an empty Name is rejected before the request is sent.
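To make the mapping concrete, here is a minimal sketch of how a vendor-neutral tool choice could be translated to the tool_choice value of OpenAI's Chat Completions API. The types and the openAIToolChoice function are local stand-ins written for illustration, not the library's implementation. The order of the checks (validate first, then skip when no tools are supplied) is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// Mode and ToolChoice are local stand-ins mirroring the shape described above.
type Mode int

const (
	ToolChoiceAuto Mode = iota
	ToolChoiceNone
	ToolChoiceRequired
	ToolChoiceSpecific
)

type ToolChoice struct {
	Mode Mode
	Name string
}

// openAIToolChoice returns the value for the tool_choice field. ok is false
// when the field should be omitted because no tools were supplied.
func openAIToolChoice(tc ToolChoice, toolCount int) (value any, ok bool, err error) {
	// Reject an unnamed specific choice before anything is sent.
	if tc.Mode == ToolChoiceSpecific && tc.Name == "" {
		return nil, false, errors.New("tool choice: specific mode requires a name")
	}
	if toolCount == 0 {
		return nil, false, nil
	}
	switch tc.Mode {
	case ToolChoiceNone:
		return "none", true, nil
	case ToolChoiceRequired:
		return "required", true, nil
	case ToolChoiceSpecific:
		return map[string]any{
			"type":     "function",
			"function": map[string]any{"name": tc.Name},
		}, true, nil
	default:
		return "auto", true, nil
	}
}

func main() {
	for _, tc := range []ToolChoice{
		{Mode: ToolChoiceAuto},
		{Mode: ToolChoiceSpecific, Name: "get_weather"},
		{Mode: ToolChoiceSpecific}, // rejected: no name
	} {
		v, ok, err := openAIToolChoice(tc, 1)
		fmt.Println(v, ok, err)
	}
}
```

Anthropic and Gemini use different wire shapes for the same four modes, so a real implementation would need one such translation per provider.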

WithTopK on the OpenAI client

OpenAI's and Azure's own APIs reject top_k (HTTP 400), so llmopenai.WithTopK is sent only when a custom base URL points at an OpenAI-compatible provider that accepts it (Together, OpenRouter, Fireworks, ...); against OpenAI or Azure proper it has no effect. Native providers (Anthropic, Gemini, Bedrock) honor WithTopK directly. WithStopSequences sends every sequence provided (the OpenAI client caps at the API's limit of 4).
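The gating described above can be sketched as a small request-building step. Everything here is hypothetical: params and buildParams are illustrative names, a non-empty base URL is assumed to mean "custom OpenAI-compatible endpoint", and the cap is assumed to keep the first four stop sequences. The library's actual rules may differ.

```go
package main

import "fmt"

// maxOpenAIStops is the stop-sequence limit of the OpenAI API noted above.
const maxOpenAIStops = 4

// params is a stand-in for the request fields discussed in this section.
type params struct {
	TopK *int     // nil means top_k is left out of the request
	Stop []string // stop sequences actually sent
}

// buildParams omits top_k unless a custom base URL is configured and
// truncates the stop list to the first maxOpenAIStops entries.
func buildParams(baseURL string, topK int, stops []string) params {
	var p params
	if baseURL != "" && topK > 0 {
		k := topK
		p.TopK = &k
	}
	if len(stops) > maxOpenAIStops {
		stops = stops[:maxOpenAIStops]
	}
	// Copy so the caller's slice is never aliased by the request.
	p.Stop = append([]string(nil), stops...)
	return p
}

func main() {
	p := buildParams("https://openrouter.ai/api/v1", 40, []string{"A", "B", "C", "D", "E"})
	fmt.Println(*p.TopK, p.Stop)

	q := buildParams("", 40, []string{"STOP"})
	fmt.Println(q.TopK == nil, q.Stop)
}
```

Using a pointer for TopK keeps "not set" distinct from zero, which is one common way to emit a field only when it applies.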

Vendor-specific options

OpenAI:

llmopenai.WithBaseURL("https://custom-endpoint")
llmopenai.WithExtraHeaders(map[string]string{"X-My-Header": "value"})
llmopenai.WithReasoningEffort(llmopenai.ReasoningEffortHigh)
llmopenai.WithFrequencyPenalty(0.5)
llmopenai.WithPresencePenalty(0.5)
llmopenai.WithSeed(42)
llmopenai.WithParallelToolCalls(false)
llmopenai.WithLogitBias(map[string]int{"1212": 5, "50256": -100})  // bias/ban tokens by id
llmopenai.WithLogprobs(3)                                          // logprobs:true + top_logprobs:3
llmopenai.WithN(3)                                                 // n completions per request

Sampling knobs that change the response shape

WithLogitBias, WithLogprobs, and WithN live on the OpenAI client and so also cover every OpenAI-compatible provider (Groq, OpenRouter, xAI, Together, Fireworks, DeepSeek, Mistral, Ollama). They are emitted only when set and exist only on the OpenAI wire format: Anthropic supports none of them, and Gemini's candidateCount (the n equivalent) is out of scope, so those providers never receive the fields.

  • WithLogitBias maps token IDs (tokenizer ids, OpenAI's wire shape) to a bias from -100 (ban) to 100 (force).
  • WithLogprobs(n) requests per-token log probabilities with up to n alternatives per position; the result lands on Response.LogProbs ([]llm.TokenLogProb), nil when not requested.
  • WithN(n) requests n completions; all land on Response.Choices ([]llm.Choice). The top-level Content / FinishReason / ToolCalls / LogProbs mirror choice 0, so single-completion callers are unaffected (Choices is empty when n is unset or 1). Streaming with n > 1 is not supported — use the non-streaming SendMessages path.
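Because Choices is empty when n is unset or 1, code that wants to treat both cases uniformly needs a fallback to the top-level fields. A minimal sketch of that fallback follows, using local stand-in types; Response, Choice, and completions are illustrative, and the Content field on Choice is an assumption about the shape of llm.Choice.

```go
package main

import "fmt"

// Choice and Response are local stand-ins for the fields described above.
type Choice struct {
	Content string
}

type Response struct {
	Content string   // mirrors choice 0
	Choices []Choice // empty when n is unset or 1
}

// completions returns the text of every completion, falling back to the
// top-level Content when Choices is empty.
func completions(r Response) []string {
	if len(r.Choices) == 0 {
		return []string{r.Content}
	}
	out := make([]string, 0, len(r.Choices))
	for _, c := range r.Choices {
		out = append(out, c.Content)
	}
	return out
}

func main() {
	single := Response{Content: "only answer"}
	multi := Response{
		Content: "first",
		Choices: []Choice{{Content: "first"}, {Content: "second"}, {Content: "third"}},
	}
	fmt.Println(len(completions(single)), len(completions(multi)))
}
```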

logit_bias is rejected by reasoning-tier models (the gpt-5 family) with an HTTP 400; use a classic chat model such as gpt-4o-mini when you need it.

Anthropic:

llmanthropic.WithBedrock(true)              // route through AWS Bedrock
llmanthropic.WithDisableCache()
llmanthropic.WithReasoningEffort(llmanthropic.ReasoningEffortHigh)

Gemini:

import llmgemini "github.com/joakimcarlsson/ai/llm/gemini"

llmgemini.WithThinkingLevel(llmgemini.ThinkingLevelHigh)
llmgemini.WithFrequencyPenalty(0.5)
llmgemini.WithSeed(42)

Provider built-in tools

Server-side built-in tools (web search, code execution, file search) run inside the provider's infrastructure. They're opt-in per-client; results land inline in Response.Content, with structured metadata under Response.ProviderMetadata. See Tool Calling for the full picture; below is the per-provider surface.

Anthropic — web_search:

llmanthropic.WithWebSearch(llmanthropic.WebSearchConfig{
    MaxUses:        5,
    AllowedDomains: []string{"go.dev"},
    BlockedDomains: nil,
    UserLocation: &llmanthropic.WebSearchUserLocation{
        City: "Stockholm", Country: "SE",
    },
})

Gemini — google_search, code_execution, url_context:

llmgemini.WithGoogleSearch()
llmgemini.WithCodeExecution()
llmgemini.WithURLContext()

OpenAI (Responses API) — web_search, file_search, code_interpreter. The Responses API is a separate surface from Chat Completions; use NewResponsesLLM instead of NewLLM:

client := llmopenai.NewResponsesLLM(
    llmopenai.WithResponsesAPIKey(os.Getenv("OPENAI_API_KEY")),
    llmopenai.WithResponsesModel(model.OpenAIModels[model.GPT5]),
    llmopenai.WithResponsesMaxTokens(1024),
    llmopenai.WithWebSearch(llmopenai.WebSearchOpts{
        SearchContextSize: llmopenai.SearchContextMedium,
    }),
    llmopenai.WithFileSearch("vs_abc123"),
    llmopenai.WithCodeInterpreter(),
)

WithWebSearchPreview is also available for models that don't yet support the newer web_search tool.

Groq — browser_search, code_execution, visit_website (requires a groq/compound* model via the dedicated NewCompoundLLM):

import llmgroq "github.com/joakimcarlsson/ai/llm/groq"

client := llmgroq.NewCompoundLLM(
    llmgroq.WithCompoundAPIKey(os.Getenv("GROQ_API_KEY")),
    llmgroq.WithCompoundModel(model.Model{APIModel: "groq/compound"}),
    llmgroq.WithBrowserSearch(llmgroq.BrowserSearchOpts{
        Country:       "us",
        IncludeImages: true,
    }),
    llmgroq.WithCodeExecution(),
    llmgroq.WithVisitWebsite(),
)

The regular llmgroq.NewLLM wrapper stays available for OpenAI-compatible chat without built-ins.

xAI — web_search, x_search, code_execution via the Responses API (use NewResponsesLLM instead of NewLLM):

import llmxai "github.com/joakimcarlsson/ai/llm/xai"

client := llmxai.NewResponsesLLM(
    llmxai.WithResponsesAPIKey(os.Getenv("XAI_API_KEY")),
    llmxai.WithResponsesModel(model.XAIModels[model.XAIGrok4]),
    llmxai.WithWebSearch(llmxai.WebSearchOpts{
        SearchContextSize: llmxai.SearchContextMedium,
    }),
    llmxai.WithXSearch(llmxai.XSearchOpts{
        AllowedXHandles: []string{"xai"},
        FromDate:        "2026-01-01",
    }),
    llmxai.WithCodeExecution(),
)

The thin llmxai.NewLLM wrapper remains available for OpenAI-compatible chat without built-ins.

Cross-vendor wrappers

llm/azure (Azure OpenAI), llm/vertexai (Gemini on Vertex), and llm/bedrock (Claude on Bedrock) are thin wrappers that delegate to their underlying vendor module:

import llmazure "github.com/joakimcarlsson/ai/llm/azure"

client := llmazure.NewLLM(
    llmazure.WithAPIKey(os.Getenv("AZURE_OPENAI_KEY")),
    llmazure.WithEndpoint("https://my-resource.openai.azure.com"),
    llmazure.WithDeployment("my-chat-deployment"),
)

import llmbedrock "github.com/joakimcarlsson/ai/llm/bedrock"

// Region is read from $AWS_REGION (or $AWS_DEFAULT_REGION).
client := llmbedrock.NewLLM(
    llmbedrock.WithModel(model.AnthropicModels[model.Claude45Sonnet]),
    llmbedrock.WithMaxTokens(2000),
)

Prompt caching is on by default on Bedrock: the underlying Anthropic client's cache_control breakpoints reach Bedrock and populate CacheReadTokens / CacheCreationTokens in the response usage. Pass llmbedrock.WithDisableCache() to opt out. (Newer Claude models require at least 4096 cached tokens per checkpoint before a cache hit is recorded.)

import llmvertex "github.com/joakimcarlsson/ai/llm/vertexai"

client := llmvertex.NewLLM(
    llmvertex.WithProject(os.Getenv("VERTEXAI_PROJECT")),
    llmvertex.WithLocation(os.Getenv("VERTEXAI_LOCATION")),
    llmvertex.WithModel(model.GeminiModels[model.Gemini25Pro]),
)

OpenAI-compatible providers (BYOM)

For OpenRouter, Mistral, Ollama, LocalAI, and similar providers, point llm/openai at the provider's base URL:

openrouter := llmopenai.NewLLM(
    llmopenai.WithAPIKey(os.Getenv("OPENROUTER_API_KEY")),
    llmopenai.WithBaseURL("https://openrouter.ai/api/v1"),
    llmopenai.WithModel(model.OpenAIModels[model.GPT5]),
)

Groq and xAI are published as their own modules (llm/groq, llm/xai) so they can expose vendor-specific built-in tools on top of the OpenAI-compatible surface. Use the thin NewLLM constructor in each for plain chat, or the dedicated NewCompoundLLM / NewResponsesLLM for built-in tools.

Berget AI (Swedish, EU-hosted; open-weight models) ships as llm/berget, a thin wrapper pinned to https://api.berget.ai/v1. Pricing in its model catalog entries (BergetModels, BergetEmbeddingModels, BergetRerankerModels, BergetTranscriptionModels) is in EUR.

Like the other OpenAI-compatible wrappers, llm/berget aliases Option but does not re-export the option constructors; pass the standard ones from llm/openai:

import (
    llmberget "github.com/joakimcarlsson/ai/llm/berget"
    llmopenai "github.com/joakimcarlsson/ai/llm/openai"
    "github.com/joakimcarlsson/ai/model"
)

client := llmberget.NewLLM(
    llmopenai.WithAPIKey(os.Getenv("BERGET_API_KEY")),
    llmopenai.WithModel(model.BergetModels[model.BergetGPTOSS120B]),
    llmopenai.WithMaxTokens(1000),
)

For a managed registry of these, see BYOM.

Tracing

Every vendor's NewLLM(...) returns a tracing-wrapped client. Spans and metrics are emitted automatically via OpenTelemetry. See Tracing for setup.