# Go AI Client Library - Complete Documentation

## Table of Contents

- [Home](#home)
- [Getting Started > Installation](#getting-started-installation)
- [Getting Started > Quick Start](#getting-started-quick-start)
- [Providers > Overview](#providers-overview)
- [Providers > LLM](#providers-llm)
- [Providers > Embeddings](#providers-embeddings)
- [Providers > Image Generation](#providers-image-generation)
- [Providers > Audio](#providers-audio)
- [Providers > Speech-to-Text](#providers-speech-to-text)
- [Providers > Rerankers](#providers-rerankers)
- [Providers > Fill-in-the-Middle](#providers-fill-in-the-middle)
- [Providers > Vision](#providers-vision)
- [Agent Framework > Overview](#agent-framework-overview)
- [Agent Framework > Session Management](#agent-framework-session-management)
- [Agent Framework > Persistent Memory](#agent-framework-persistent-memory)
- [Agent Framework > Streaming](#agent-framework-streaming)
- [Agent Framework > Hooks](#agent-framework-hooks)
- [Agent Framework > Tool Confirmation](#agent-framework-tool-confirmation)
- [Agent Framework > Sub-Agents](#agent-framework-sub-agents)
- [Agent Framework > Background Agents](#agent-framework-background-agents)
- [Agent Framework > Handoffs](#agent-framework-handoffs)
- [Agent Framework > Fan-Out](#agent-framework-fan-out)
- [Agent Framework > Continue/Resume](#agent-framework-continueresume)
- [Agent Framework > Context Strategies](#agent-framework-context-strategies)
- [Agent Framework > Toolsets](#agent-framework-toolsets)
- [Agent Framework > Instruction Templates](#agent-framework-instruction-templates)
- [Integrations > PostgreSQL](#integrations-postgresql)
- [Integrations > SQLite](#integrations-sqlite)
- [Integrations > pgvector](#integrations-pgvector)
- [Advanced > Batch Processing](#advanced-batch-processing)
- [Advanced > BYOM](#advanced-byom)
- [Advanced > MCP Integration](#advanced-mcp-integration)
- [Advanced > Tool Calling](#advanced-tool-calling)
- [Advanced > Structured
Output](#advanced-structured-output)
- [Advanced > Cost Tracking](#advanced-cost-tracking)
- [Advanced > Prompt Templates](#advanced-prompt-templates)
- [Advanced > OpenTelemetry Tracing](#advanced-opentelemetry-tracing)
- [Advanced > Configuration](#advanced-configuration)

---

## Home

> Source: index.md

# Go AI Client Library

[![Go Reference](https://pkg.go.dev/badge/github.com/joakimcarlsson/ai.svg)](https://pkg.go.dev/github.com/joakimcarlsson/ai) [![Go Report Card](https://goreportcard.com/badge/github.com/joakimcarlsson/ai)](https://goreportcard.com/report/github.com/joakimcarlsson/ai)

A comprehensive, multi-provider Go library for interacting with various AI models through unified interfaces. This library supports Large Language Models (LLMs), embedding models, image generation models, audio generation (text-to-speech), and rerankers from multiple providers including Anthropic, OpenAI, Google, AWS, Voyage AI, xAI, ElevenLabs, and more.

## Features

- **Multi-Provider Support** — Unified interface for 10+ AI providers
- **LLM Support** — Chat completions, streaming, tool calling, structured output
- **Agent Framework** — Multi-agent orchestration with sub-agents, handoffs, fan-out, session management, persistent memory, and context strategies
- **Embedding Models** — Text, multimodal, and contextualized embeddings
- **Image Generation** — Text-to-image generation with multiple quality and size options
- **Audio Generation** — Text-to-speech with voice selection and streaming support
- **Speech-to-Text** — Audio transcription and translation with timestamp support
- **Rerankers** — Document reranking for improved search relevance
- **Streaming Responses** — Real-time response streaming via Go channels
- **Tool Calling** — Native function calling with struct-tag schema generation
- **Structured Output** — Constrained generation with JSON schemas
- **MCP Integration** — Model Context Protocol support for advanced tooling
- **Multimodal Support** — Text and image
inputs across compatible providers
- **Cost Tracking** — Built-in token and character usage with cost calculation
- **Retry Logic** — Exponential backoff with configurable retry policies
- **Type Safety** — Full Go generics support for compile-time safety

## Quick Install

```bash
go get github.com/joakimcarlsson/ai
```

## Quick Example

```go
package main

import (
    "context"
    "fmt"
    "log"

    "github.com/joakimcarlsson/ai/message"
    "github.com/joakimcarlsson/ai/model"
    llm "github.com/joakimcarlsson/ai/providers"
)

func main() {
    ctx := context.Background()

    client, err := llm.NewLLM(
        model.ProviderOpenAI,
        llm.WithAPIKey("your-api-key"),
        llm.WithModel(model.OpenAIModels[model.GPT4o]),
        llm.WithMaxTokens(1000),
    )
    if err != nil {
        log.Fatal(err)
    }

    messages := []message.Message{
        message.NewUserMessage("Hello, how are you?"),
    }

    response, err := client.SendMessages(ctx, messages, nil)
    if err != nil {
        log.Fatal(err)
    }

    fmt.Println(response.Content)
}
```

## Next Steps

- [Installation & Quick Start](getting-started/installation.md) — Get up and running
- [Provider Overview](providers/overview.md) — See all supported providers
- [Agent Framework](agent/overview.md) — Build multi-agent systems
- [Advanced Features](advanced/byom.md) — BYOM, MCP, cost tracking

---

## Getting Started > Installation

> Source: getting-started/installation.md

# Installation

## Requirements

- Go 1.25 or later

## Install

```bash
go get github.com/joakimcarlsson/ai
```

## Import

```go
import (
    "github.com/joakimcarlsson/ai/message"
    "github.com/joakimcarlsson/ai/model"
    llm "github.com/joakimcarlsson/ai/providers"
)
```

## Provider API Keys

Each provider requires its own API key. Set them as environment variables:

```bash
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export GOOGLE_API_KEY="..."
export VOYAGE_API_KEY="..."
export XAI_API_KEY="..."
export ELEVENLABS_API_KEY="..."
```

Or pass them directly when creating a client:

```go
client, err := llm.NewLLM(
    model.ProviderOpenAI,
    llm.WithAPIKey("your-api-key"),
    llm.WithModel(model.OpenAIModels[model.GPT4o]),
)
```

---

## Getting Started > Quick Start

> Source: getting-started/quick-start.md

# Quick Start

## Basic Usage

```go
package main

import (
    "context"
    "fmt"
    "log"

    "github.com/joakimcarlsson/ai/message"
    "github.com/joakimcarlsson/ai/model"
    llm "github.com/joakimcarlsson/ai/providers"
)

func main() {
    ctx := context.Background()

    client, err := llm.NewLLM(
        model.ProviderOpenAI,
        llm.WithAPIKey("your-api-key"),
        llm.WithModel(model.OpenAIModels[model.GPT4o]),
        llm.WithMaxTokens(1000),
    )
    if err != nil {
        log.Fatal(err)
    }

    messages := []message.Message{
        message.NewUserMessage("Hello, how are you?"),
    }

    response, err := client.SendMessages(ctx, messages, nil)
    if err != nil {
        log.Fatal(err)
    }

    fmt.Println(response.Content)
}
```

## Streaming Responses

```go
stream := client.StreamResponse(ctx, messages, nil)

for event := range stream {
    switch event.Type {
    case types.EventContentDelta:
        fmt.Print(event.Content)
    case types.EventComplete:
        fmt.Printf("\nTokens used: %d\n", event.Response.Usage.InputTokens)
    case types.EventError:
        log.Fatal(event.Error)
    }
}
```

## Multimodal (Images)

```go
imageData, err := os.ReadFile("image.png")
if err != nil {
    log.Fatal(err)
}

msg := message.NewUserMessage("What's in this image?")
msg.AddAttachment(message.Attachment{
    MIMEType: "image/png",
    Data:     imageData,
})

messages := []message.Message{msg}
response, err := client.SendMessages(ctx, messages, nil)
```

## Your First Agent

```go
import (
    "github.com/joakimcarlsson/ai/agent"
    "github.com/joakimcarlsson/ai/agent/session"
)

myAgent := agent.New(llmClient,
    agent.WithSystemPrompt("You are a helpful assistant."),
    agent.WithTools(&weatherTool{}),
    agent.WithSession("user-123", session.FileStore("./sessions")),
)

response, _ := myAgent.Chat(ctx, "What's the weather in Tokyo?")
fmt.Println(response.Content)
```

See the [Agent
Framework](../agent/overview.md) section for the full guide.

---

## Providers > Overview

> Source: providers/overview.md

# Supported Providers

## LLM Providers

| Provider | Streaming | Tools | Structured Output | Attachments |
|----------|-----------|-------|-------------------|-------------|
| Anthropic (Claude) | ✅ | ✅ | ❌ | ✅ |
| OpenAI (GPT) | ✅ | ✅ | ✅ | ✅ |
| Google Gemini | ✅ | ✅ | ✅ | ✅ |
| AWS Bedrock | ✅ | ✅ | ❌ | ✅ |
| Azure OpenAI | ✅ | ✅ | ✅ | ✅ |
| Google Vertex AI | ✅ | ✅ | ✅ | ✅ |
| Groq | ✅ | ✅ | ✅ | ✅ |
| OpenRouter | ✅ | ✅ | ✅ | ✅ |
| xAI (Grok) | ✅ | ✅ | ✅ | ✅ |

## Embedding & Reranker Providers

| Provider | Text Embeddings | Multimodal Embeddings | Contextualized Embeddings | Rerankers |
|----------|-----------------|-----------------------|---------------------------|-----------|
| Voyage AI | ✅ | ✅ | ✅ | ✅ |
| OpenAI | ✅ | ❌ | ❌ | ❌ |

## Image Generation Providers

| Provider | Models | Quality Options | Size Options |
|----------|--------|-----------------|--------------|
| OpenAI | DALL-E 2, DALL-E 3, GPT Image 1 | standard, hd, low, medium, high | 256x256 to 1792x1024 |
| xAI (Grok) | Grok 2 Image | default | default |
| Google Gemini | Gemini 2.5 Flash Image, Imagen 3, Imagen 4, Imagen 4 Ultra, Imagen 4 Fast | default | Aspect ratios: 1:1, 3:4, 4:3, 9:16, 16:9 |

## Audio Generation Providers (Text-to-Speech)

| Provider | Models | Streaming | Voice Selection | Max Characters |
|----------|--------|-----------|-----------------|----------------|
| ElevenLabs | Multilingual v2, Turbo v2.5, Flash v2.5 | ✅ | ✅ | 10,000 - 40,000 |

## Speech-to-Text Providers (Transcription)

| Provider | Models | Streaming | Translation | Timestamps | Diarization |
|----------|--------|-----------|-------------|------------|-------------|
| OpenAI | Whisper-1, GPT-4o Transcribe, GPT-4o Mini Transcribe | ✅ | ✅ | ✅ | ✅ |

---

## Providers > LLM

> Source: providers/llm.md

# LLM Providers

## Creating a Client

```go
import (
    "github.com/joakimcarlsson/ai/model"
    llm "github.com/joakimcarlsson/ai/providers"
)

client, err := llm.NewLLM(
    model.ProviderOpenAI,
    llm.WithAPIKey("your-api-key"),
    llm.WithModel(model.OpenAIModels[model.GPT4o]),
    llm.WithMaxTokens(1000),
)
```

## Sending Messages

```go
messages := []message.Message{
    message.NewUserMessage("Hello, how are you?"),
}

response, err := client.SendMessages(ctx, messages, nil)
fmt.Println(response.Content)
```

## Streaming

```go
stream := client.StreamResponse(ctx, messages, nil)

for event := range stream {
    switch event.Type {
    case types.EventContentDelta:
        fmt.Print(event.Content)
    case types.EventComplete:
        fmt.Printf("\nTokens used: %d\n", event.Response.Usage.InputTokens)
    case types.EventError:
        log.Fatal(event.Error)
    }
}
```

## Multimodal (Images)

```go
imageData, err := os.ReadFile("image.png")
if err != nil {
    log.Fatal(err)
}

msg := message.NewUserMessage("What's in this image?")
msg.AddAttachment(message.Attachment{
    MIMEType: "image/png",
    Data:     imageData,
})

messages := []message.Message{msg}
response, err := client.SendMessages(ctx, messages, nil)
```

## Client Options

```go
client, err := llm.NewLLM(
    model.ProviderOpenAI,
    llm.WithAPIKey("your-key"),
    llm.WithModel(model.OpenAIModels[model.GPT4o]),
    llm.WithMaxTokens(2000),
    llm.WithTemperature(0.7),
    llm.WithTopP(0.9),
    llm.WithTimeout(30*time.Second),
    llm.WithStopSequences("STOP", "END"),
)
```

## Provider-Specific Options

```go
// Anthropic
llm.WithAnthropicOptions(
    llm.WithAnthropicBeta("beta-feature"),
)

// OpenAI
llm.WithOpenAIOptions(
    llm.WithOpenAIBaseURL("custom-endpoint"),
    llm.WithOpenAIExtraHeaders(map[string]string{
        "Custom-Header": "value",
    }),
)
```

---

## Providers > Embeddings

> Source: providers/embeddings.md

# Embeddings

## Text Embeddings

```go
import (
    "github.com/joakimcarlsson/ai/embeddings"
    "github.com/joakimcarlsson/ai/model"
)

embedder, err := embeddings.NewEmbedding(model.ProviderVoyage,
    embeddings.WithAPIKey(""),
    embeddings.WithModel(model.VoyageEmbeddingModels[model.Voyage35]),
)
if err != nil {
    log.Fatal(err)
}

texts := []string{
    "Hello, world!",
    "This is a test document.",
}

response, err := embedder.GenerateEmbeddings(context.Background(), texts)
if err != nil {
    log.Fatal(err)
}

for i, embedding := range response.Embeddings {
    fmt.Printf("Text: %s\n", texts[i])
    fmt.Printf("Dimensions: %d\n", len(embedding))
    fmt.Printf("First 5 values: %v\n", embedding[:5])
}
```

## Multimodal Embeddings

```go
embedder, err := embeddings.NewEmbedding(model.ProviderVoyage,
    embeddings.WithAPIKey(""),
    embeddings.WithModel(model.VoyageEmbeddingModels[model.VoyageMulti3]),
)

multimodalInputs := []embeddings.MultimodalInput{
    {
        Content: []embeddings.MultimodalContent{
            {Type: "text", Text: "This is a banana."},
            {Type: "image_url", ImageURL: "https://example.com/banana.jpg"},
        },
    },
}

response, err := embedder.GenerateMultimodalEmbeddings(context.Background(), multimodalInputs)
```

## Contextualized Embeddings

Embed document chunks with awareness of their surrounding context. Each chunk embedding incorporates information from the full document, improving retrieval for chunks that lack standalone meaning.

```go
documentChunks := [][]string{
    { // Document 1
        "Introduction to quantum computing...",
        "Qubits differ from classical bits...",
        "Quantum entanglement enables...",
    },
    { // Document 2
        "Machine learning overview...",
        "Neural networks consist of...",
    },
}

response, err := embedder.GenerateContextualizedEmbeddings(context.Background(), documentChunks)

// response.DocumentEmbeddings[0][1] = embedding for "Qubits differ..." with context from Document 1
```

## Client Options

```go
embedder, err := embeddings.NewEmbedding(
    model.ProviderVoyage,
    embeddings.WithAPIKey(""),
    embeddings.WithModel(model.VoyageEmbeddingModels[model.Voyage35]),
    embeddings.WithBatchSize(100),
    embeddings.WithDimensions(1024),
    embeddings.WithTimeout(30*time.Second),
    embeddings.WithVoyageOptions(
        embeddings.WithInputType("document"),
        embeddings.WithOutputDimension(1024),
        embeddings.WithOutputDtype("float"),
    ),
)
```

## Embedding Interface

```go
type Embedding interface {
    GenerateEmbeddings(ctx, texts, inputType...) (*EmbeddingResponse, error)
    GenerateMultimodalEmbeddings(ctx, inputs, inputType...) (*EmbeddingResponse, error)
    GenerateContextualizedEmbeddings(ctx, documentChunks, inputType...) (*ContextualizedEmbeddingResponse, error)
    Model() model.EmbeddingModel
}
```

---

## Providers > Image Generation

> Source: providers/image-generation.md

# Image Generation

## OpenAI DALL-E 3

```go
import (
    "github.com/joakimcarlsson/ai/image_generation"
    "github.com/joakimcarlsson/ai/model"
)

client, err := image_generation.NewImageGeneration(
    model.ProviderOpenAI,
    image_generation.WithAPIKey("your-api-key"),
    image_generation.WithModel(model.OpenAIImageGenerationModels[model.DALLE3]),
)
if err != nil {
    log.Fatal(err)
}

response, err := client.GenerateImage(
    context.Background(),
    "A serene mountain landscape at sunset with vibrant colors",
    image_generation.WithSize("1024x1024"),
    image_generation.WithQuality("hd"),
    image_generation.WithResponseFormat("b64_json"),
)
if err != nil {
    log.Fatal(err)
}

imageData, _ := image_generation.DecodeBase64Image(response.Images[0].ImageBase64)
os.WriteFile("image.png", imageData, 0644)
```

## Google Gemini Imagen 4

```go
client, err := image_generation.NewImageGeneration(
    model.ProviderGemini,
    image_generation.WithAPIKey("your-api-key"),
    image_generation.WithModel(model.GeminiImageGenerationModels[model.Imagen4]),
)

response, err := client.GenerateImage(
    context.Background(),
    "A futuristic cityscape at night",
    image_generation.WithSize("16:9"),
    image_generation.WithN(4),
)
```

## xAI Grok 2 Image

```go
client, err := image_generation.NewImageGeneration(
    model.ProviderXAI,
    image_generation.WithAPIKey("your-api-key"),
    image_generation.WithModel(model.XAIImageGenerationModels[model.XAIGrok2Image]),
)

response, err := client.GenerateImage(
    context.Background(),
    "A robot playing chess",
    image_generation.WithResponseFormat("b64_json"),
)
```

## Client Options

```go
// OpenAI/xAI
client, err := image_generation.NewImageGeneration(
    model.ProviderOpenAI,
    image_generation.WithAPIKey("your-key"),
    image_generation.WithModel(model.OpenAIImageGenerationModels[model.DALLE3]),
    image_generation.WithTimeout(60*time.Second),
    image_generation.WithOpenAIOptions(
        image_generation.WithOpenAIBaseURL("custom-endpoint"),
    ),
)

// Gemini
client, err := image_generation.NewImageGeneration(
    model.ProviderGemini,
    image_generation.WithAPIKey("your-key"),
    image_generation.WithModel(model.GeminiImageGenerationModels[model.Imagen4]),
    image_generation.WithTimeout(60*time.Second),
    image_generation.WithGeminiOptions(
        image_generation.WithGeminiBackend(genai.BackendVertexAI),
    ),
)
```

---

## Providers > Audio

> Source: providers/audio.md

# Audio Generation (Text-to-Speech)

## Basic Usage

```go
import (
    "github.com/joakimcarlsson/ai/audio"
    "github.com/joakimcarlsson/ai/model"
)

client, err := audio.NewAudioGeneration(
    model.ProviderElevenLabs,
    audio.WithAPIKey("your-api-key"),
    audio.WithModel(model.ElevenLabsAudioModels[model.ElevenTurboV2_5]),
)
if err != nil {
    log.Fatal(err)
}

response, err := client.GenerateAudio(
    context.Background(),
    "Hello! This is a demonstration of text-to-speech.",
    audio.WithVoiceID("EXAVITQu4vr4xnSDxMaL"),
)
if err != nil {
    log.Fatal(err)
}

os.WriteFile("output.mp3", response.AudioData, 0644)
fmt.Printf("Characters used: %d\n", response.Usage.Characters)
```

## Custom Voice Settings

```go
response, err := client.GenerateAudio(
    context.Background(),
    "This uses custom voice settings for enhanced expressiveness.",
    audio.WithVoiceID("EXAVITQu4vr4xnSDxMaL"),
    audio.WithStability(0.75),       // 0.0-1.0, higher = more consistent
    audio.WithSimilarityBoost(0.85), // 0.0-1.0, higher = more similar to original
    audio.WithStyle(0.5),            // 0.0-1.0, higher = more expressive
    audio.WithSpeakerBoost(true),    // Enhanced speaker similarity
)
```

## Streaming Audio

```go
chunkChan, err := client.StreamAudio(
    context.Background(),
    "This is a streaming audio example.",
    audio.WithVoiceID("EXAVITQu4vr4xnSDxMaL"),
    audio.WithOptimizeStreamingLatency(3), // 0-4, higher = lower latency
)
if err != nil {
    log.Fatal(err)
}

file, _ := os.Create("output_stream.mp3")
defer file.Close()

for chunk := range chunkChan {
    if chunk.Error != nil {
        log.Fatal(chunk.Error)
    }
    if chunk.Done {
        break
    }
    file.Write(chunk.Data)
}
```

## List Available Voices

```go
voices, err := client.ListVoices(context.Background())
if err != nil {
    log.Fatal(err)
}

for _, voice := range voices {
    fmt.Printf("%s (%s) - %s\n", voice.Name, voice.VoiceID, voice.Category)
}
```

## Alignment Data

Enable character-level timing information for subtitles, word highlighting, or lip sync:

```go
response, err := client.GenerateAudio(
    context.Background(),
    "Hello, world!",
    audio.WithVoiceID("EXAVITQu4vr4xnSDxMaL"),
    audio.WithAlignmentEnabled(true),
)

// response.Alignment contains character-level timing
for i, char := range response.Alignment.Characters {
    fmt.Printf("%s: %.2fs - %.2fs\n", char,
        response.Alignment.CharacterStartTimesSeconds[i],
        response.Alignment.CharacterEndTimesSeconds[i],
    )
}
```

Alignment is also available per-chunk during streaming via
`chunk.Alignment`.

## Forced Alignment

Match existing audio with a transcript to produce word-level timing data. The provider must implement the `ForcedAlignmentProvider` interface:

```go
if aligner, ok := client.(audio.ForcedAlignmentProvider); ok {
    audioData, _ := os.ReadFile("speech.mp3")
    result, err := aligner.GenerateForcedAlignment(ctx, audioData, "Hello, world!")
    if err != nil {
        log.Fatal(err)
    }
    for _, word := range result.Words {
        fmt.Printf("%s: %.2fs - %.2fs\n", word.Text, word.Start, word.End)
    }
}
```

## Generation Options

| Option | Description |
|--------|-------------|
| `WithVoiceID(id)` | Voice to use for generation |
| `WithOutputFormat(fmt)` | Audio format (`mp3_44100_128`, `pcm_16000`, etc.) |
| `WithStability(f)` | Voice consistency, 0.0–1.0 |
| `WithSimilarityBoost(f)` | Match to original voice, 0.0–1.0 |
| `WithStyle(f)` | Style exaggeration, 0.0–1.0 |
| `WithSpeakerBoost(bool)` | Enhanced speaker similarity |
| `WithOptimizeStreamingLatency(n)` | Latency optimization level, 0–4 |
| `WithAlignmentEnabled(bool)` | Enable character-level timing data |

## Client Options

```go
client, err := audio.NewAudioGeneration(
    model.ProviderElevenLabs,
    audio.WithAPIKey("your-key"),
    audio.WithModel(model.ElevenLabsAudioModels[model.ElevenTurboV2_5]),
    audio.WithTimeout(30*time.Second),
    audio.WithElevenLabsOptions(
        audio.WithElevenLabsBaseURL("custom-endpoint"),
    ),
)
```

---

## Providers > Speech-to-Text

> Source: providers/speech-to-text.md

# Speech-to-Text (Transcription)

## Basic Transcription

```go
import (
    "github.com/joakimcarlsson/ai/transcription"
    "github.com/joakimcarlsson/ai/model"
)

client, err := transcription.NewSpeechToText(
    model.ProviderOpenAI,
    transcription.WithAPIKey("your-api-key"),
    transcription.WithModel(model.OpenAITranscriptionModels[model.Whisper1]),
)
if err != nil {
    log.Fatal(err)
}

audioData, err := os.ReadFile("audio.mp3")
if err != nil {
    log.Fatal(err)
}

response, err := client.Transcribe(context.Background(), audioData)
if err != nil {
    log.Fatal(err)
}
fmt.Println(response.Text)
```

## Transcription with Options

```go
response, err := client.Transcribe(ctx, audioData,
    transcription.WithLanguage("en"),
    transcription.WithResponseFormat("verbose_json"),
    transcription.WithTimestampGranularities("word", "segment"),
    transcription.WithTemperature(0.2),
)

for _, segment := range response.Segments {
    fmt.Printf("[%.2fs - %.2fs] %s\n", segment.Start, segment.End, segment.Text)
}

for _, word := range response.Words {
    fmt.Printf("%s (%.2fs) ", word.Word, word.Start)
}
```

## Translation (to English)

```go
response, err := client.Translate(ctx, audioData,
    transcription.WithPrompt("Translate this Swedish audio to English"),
)

fmt.Println(response.Text)
```

## Client Options

```go
client, err := transcription.NewSpeechToText(
    model.ProviderOpenAI,
    transcription.WithAPIKey("your-key"),
    transcription.WithModel(model.OpenAITranscriptionModels[model.GPT4oTranscribe]),
    transcription.WithTimeout(30*time.Second),
)
```

---

## Providers > Rerankers

> Source: providers/rerankers.md

# Document Reranking

## Basic Usage

```go
import (
    "github.com/joakimcarlsson/ai/rerankers"
    "github.com/joakimcarlsson/ai/model"
)

reranker, err := rerankers.NewReranker(model.ProviderVoyage,
    rerankers.WithAPIKey(""),
    rerankers.WithModel(model.VoyageRerankerModels[model.Rerank25Lite]),
    rerankers.WithReturnDocuments(true),
)
if err != nil {
    log.Fatal(err)
}

query := "What is machine learning?"
documents := []string{
    "Machine learning is a subset of artificial intelligence.",
    "The weather today is sunny.",
    "Deep learning uses neural networks.",
}

response, err := reranker.Rerank(context.Background(), query, documents)
if err != nil {
    log.Fatal(err)
}

for i, result := range response.Results {
    fmt.Printf("Rank %d (Score: %.4f): %s\n", i+1, result.RelevanceScore, result.Document)
}
```

## Client Options

```go
reranker, err := rerankers.NewReranker(
    model.ProviderVoyage,
    rerankers.WithAPIKey(""),
    rerankers.WithModel(model.VoyageRerankerModels[model.Rerank25Lite]),
    rerankers.WithTopK(10),
    rerankers.WithReturnDocuments(true),
    rerankers.WithTruncation(true),
    rerankers.WithTimeout(30*time.Second),
)
```

---

## Providers > Fill-in-the-Middle

> Source: providers/fim.md

# Fill-in-the-Middle (FIM)

Code completion by providing a prompt (code before the cursor) and an optional suffix (code after the cursor), with the model filling in the middle. Useful for code editors and IDE integrations.
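In an editor integration, the prompt/suffix pair typically comes from splitting the buffer at the cursor offset. A minimal sketch of that split (the `splitAtCursor` helper is hypothetical, not part of this library):

```go
package main

import "fmt"

// splitAtCursor divides an editor buffer into the FIM prompt
// (text before the cursor) and suffix (text after it), clamping
// the cursor to the buffer bounds. Byte offsets are assumed here
// for simplicity; a real editor may track rune or UTF-16 offsets.
func splitAtCursor(buf string, cursor int) (prompt, suffix string) {
    if cursor < 0 {
        cursor = 0
    }
    if cursor > len(buf) {
        cursor = len(buf)
    }
    return buf[:cursor], buf[cursor:]
}

func main() {
    src := "func Add(a, b int) int {\n    \n}"
    // Cursor sits inside the empty function body, after the indentation.
    prompt, suffix := splitAtCursor(src, 29)
    fmt.Printf("prompt=%q\nsuffix=%q\n", prompt, suffix)
}
```

The two returned strings map directly onto the `Prompt` and `Suffix` fields of the request shown below.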
## Supported Providers

| Provider | Model |
|----------|-------|
| Mistral | Codestral |
| DeepSeek | DeepSeek Coder |

## Setup

```go
import (
    "github.com/joakimcarlsson/ai/fim"
    "github.com/joakimcarlsson/ai/model"
)

client, err := fim.NewFIM(model.ProviderMistral,
    fim.WithAPIKey(os.Getenv("MISTRAL_API_KEY")),
    fim.WithModel(model.MistralModels[model.Codestral]),
)
if err != nil {
    log.Fatal(err)
}
```

## Basic Completion

```go
maxTokens := int64(100)

resp, err := client.Complete(ctx, fim.Request{
    Prompt:    "func Add(a, b int) int {\n ",
    Suffix:    "\n}",
    MaxTokens: &maxTokens,
})
if err != nil {
    log.Fatal(err)
}

fmt.Println(resp.Content) // "return a + b"
```

## Streaming

```go
events := client.CompleteStream(ctx, fim.Request{
    Prompt:    "func Max(numbers []int) int {\n ",
    Suffix:    "\n}",
    MaxTokens: &maxTokens,
})

for event := range events {
    switch event.Type {
    case fim.EventContentDelta:
        fmt.Print(event.Content)
    case fim.EventComplete:
        fmt.Printf("\nTokens: %d in, %d out\n",
            event.Response.Usage.InputTokens,
            event.Response.Usage.OutputTokens,
        )
    case fim.EventError:
        log.Fatal(event.Error)
    }
}
```

## Request

| Field | Type | Description |
|-------|------|-------------|
| `Prompt` | `string` | Code before the cursor (required) |
| `Suffix` | `string` | Code after the cursor (optional) |
| `MaxTokens` | `*int64` | Max tokens to generate |
| `Temperature` | `*float64` | Sampling temperature (0.0–1.0) |
| `TopP` | `*float64` | Nucleus sampling probability |
| `Stop` | `[]string` | Sequences that halt generation |
| `RandomSeed` | `*int64` | Seed for deterministic output |

## Client Options

| Option | Description |
|--------|-------------|
| `fim.WithAPIKey(key)` | API key for authentication |
| `fim.WithModel(m)` | Model to use |
| `fim.WithMaxTokens(n)` | Default max tokens |
| `fim.WithTemperature(t)` | Default temperature |
| `fim.WithTopP(p)` | Default top-p |
| `fim.WithTimeout(d)` | API request timeout |
| `fim.WithMistralOptions(...)` | Mistral-specific options |
| `fim.WithDeepSeekOptions(...)` | DeepSeek-specific options |

---

## Providers > Vision

> Source: providers/vision.md

# Vision (Multimodal Images)

Send images to LLMs for analysis using URL references or raw binary data. Works with any provider that supports multimodal input (Anthropic, OpenAI, Gemini).

## Image from URL

```go
import "github.com/joakimcarlsson/ai/message"

msg := message.NewUserMessage("What do you see in this image?")
msg.AddImageURL("https://example.com/photo.jpg", "")

response, err := client.SendMessages(ctx, []message.Message{msg}, nil)
fmt.Println(response.Content)
```

The second argument to `AddImageURL` is an optional detail level (`"low"`, `"high"`, or `""` for auto).

## Image from Binary Data

```go
imageData, _ := os.ReadFile("photo.jpg")

msg := message.NewUserMessage("Describe this image.")
msg.AddBinary("image/jpeg", imageData)

response, err := client.SendMessages(ctx, []message.Message{msg}, nil)
fmt.Println(response.Content)
```

## Multiple Images

```go
msg := message.NewUserMessage("Compare these two images.")
msg.AddImageURL("https://example.com/before.jpg", "")
msg.AddImageURL("https://example.com/after.jpg", "")

response, err := client.SendMessages(ctx, []message.Message{msg}, nil)
```

## MultiModalMessage

For full control, build messages with the `MultiModalMessage` type directly:

```go
msg := message.NewUserMultiModalMessage([]message.MultiModalContent{
    message.NewTextContent("What's in this image?"),
    message.NewImageURLContent("https://example.com/photo.jpg", "high"),
})

// Or with attachments
msg := message.NewUserMultiModalMessageWithAttachments(
    "Describe these files.",
    []message.Attachment{
        {MIMEType: "image/png", Data: pngData},
        {MIMEType: "image/jpeg", Data: jpegData},
    },
)
```

## Content Types

| Type | Constructor | Description |
|------|-------------|-------------|
| `text` | `NewTextContent(text)` | Text content |
| `image_url` | `NewImageURLContent(url, detail)` | Image from URL |
| `binary` | `NewBinaryContent(mimeType, data)` | Raw binary data (base64-encoded for the provider) |

## Supported Formats

Most providers accept JPEG, PNG, GIF, and WebP. Check your provider's documentation for size limits.

---

## Agent Framework > Overview

> Source: agent/overview.md

# Agent Framework

The agent package provides multi-agent orchestration with automatic tool execution, session management, persistent memory, sub-agents, handoffs, fan-out, and context strategies.

## Basic Agent

```go
import (
    "github.com/joakimcarlsson/ai/agent"
    "github.com/joakimcarlsson/ai/agent/session"
)

myAgent := agent.New(llmClient,
    agent.WithSystemPrompt("You are a helpful assistant."),
    agent.WithTools(&weatherTool{}),
    agent.WithSession("user-123", session.FileStore("./sessions")),
)

response, _ := myAgent.Chat(ctx, "What's the weather in Tokyo?")
fmt.Println(response.Content)
```

## How It Works

When you call `Chat()`, the agent:

1. Builds the message history (system prompt + session messages + user message)
2. Sends messages to the LLM
3. If the LLM requests tool calls, executes them automatically
4. Loops back to step 2 with tool results until the LLM responds with text
5. Persists the conversation to the session store

## Configuration Options

| Option | Description | Default |
|--------|-------------|---------|
| `WithSystemPrompt(prompt)` | Sets the agent's behavior | none |
| `WithTools(tools...)` | Adds tools the agent can use | none |
| `WithSession(id, store)` | Enables conversation persistence | none |
| `WithMemory(id, store, opts...)` | Enables long-term memory | none |
| `WithMaxIterations(n)` | Max tool execution loops | 10 |
| `WithAutoExecute(bool)` | Auto-execute tool calls | true |
| `WithContextStrategy(strategy, maxTokens)` | Context window management | none |
| `WithSequentialToolExecution()` | Disable parallel tool execution | parallel |
| `WithMaxParallelTools(n)` | Limit concurrent tool execution | unlimited |
| `WithState(map)` | Template variables for system prompt | none |
| `WithInstructionProvider(fn)` | Dynamic system prompt generation | none |
| `WithHooks(hooks...)` | Add hook interceptors for observation/interception | none |
| `WithConfirmationProvider(fn)` | Require human approval for sensitive tools | none |
| `WithSubAgents(configs...)` | Register child agents | none |
| `WithHandoffs(configs...)` | Register peer agents for transfer | none |
| `WithFanOut(configs...)` | Register parallel task distribution | none |

## ChatResponse

```go
type ChatResponse struct {
    Content      string
    ToolCalls    []message.ToolCall
    ToolResults  []ToolExecutionResult
    Usage        llm.TokenUsage
    FinishReason message.FinishReason
    AgentName    string // Set when a handoff occurred

    TotalToolCalls int
    TotalDuration  time.Duration
    TotalTurns     int
}
```

All metrics are aggregated across the full agent loop, not just the final LLM call:

| Field | Description |
|-------|-------------|
| `TotalTurns` | Number of LLM round-trips (API calls) made |
| `TotalDuration` | Wall-clock time from `Chat()` entry to return |
| `TotalToolCalls` | Total tool invocations across all iterations |
| `ToolResults` | Results of every tool execution during the conversation |

## Debug APIs

Inspect the messages that would be sent to the LLM after applying context strategies:

```go
// Non-destructive — does not modify the session
messages, err := myAgent.PeekContextMessages(ctx, "Hello")

// Modifying — adds the user message to the session
messages, err := myAgent.BuildContextMessages(ctx, "Hello")
```

Use `PeekContextMessages` to debug context window management without side effects.

---

## Agent Framework > Session Management

> Source: agent/sessions.md

# Session Management

Sessions persist conversation history across multiple `Chat()` calls.

## Setup

```go
import "github.com/joakimcarlsson/ai/agent/session"

myAgent := agent.New(llmClient,
    agent.WithSystemPrompt("You are a helpful assistant."),
    agent.WithSession("conversation-id", session.FileStore("./sessions")),
)
```

## Built-in Stores

```go
// Persistent JSON files
store := session.FileStore("./sessions")

// In-memory (ephemeral, lost on restart)
store := session.MemoryStore()
```

## Database Stores

Ready-to-use stores for production backends:

- [PostgreSQL](../integrations/postgres.md) — `postgres.SessionStore(ctx, connString)`
- [SQLite](../integrations/sqlite.md) — `sqlite.SessionStore(ctx, db)`

## Store Interface

Implement this interface to use any backend:

```go
type Store interface {
    Exists(ctx context.Context, id string) (bool, error)
    Create(ctx context.Context, id string) (Session, error)
    Load(ctx context.Context, id string) (Session, error)
    Delete(ctx context.Context, id string) error
}
```

## Session Interface

```go
type Session interface {
    ID() string
    GetMessages(ctx context.Context, limit *int) ([]message.Message, error)
    AddMessages(ctx context.Context, msgs []message.Message) error
    PopMessage(ctx context.Context) (*message.Message, error)
    Clear(ctx context.Context) error
}
```

---

## Agent Framework > Persistent Memory

> Source: agent/memory.md

# Persistent Memory

Memory enables cross-conversation fact storage and retrieval using vector-based semantic
search. ## Setup ```go import "github.com/joakimcarlsson/ai/agent/memory" store := memory.NewStore(embedder) myAgent := agent.New(llmClient, agent.WithSystemPrompt("You are a personal assistant."), agent.WithMemory("user-123", store, memory.AutoExtract(), // Auto-extract facts from conversations memory.AutoDedup(), // LLM-based memory deduplication ), ) response, _ := myAgent.Chat(ctx, "My name is Alice and I'm allergic to peanuts.") // Agent automatically stores this fact and recalls it in future conversations ``` ## Built-in Stores ```go // In-memory vector store store := memory.NewStore(embedder) // File-persisted vector store store := memory.FileStore("./memories", embedder) ``` ## Memory Options | Option | Description | |--------|-------------| | `memory.AutoExtract()` | Automatically extract facts from conversations after each response | | `memory.AutoDedup()` | Use LLM to deduplicate similar memories before storing | | `memory.LLM(l)` | Use a separate (cheaper) LLM for extraction and deduplication | ## Database Stores Ready-to-use stores for production backends: - [pgvector](../integrations/pgvector.md) — `pgvector.MemoryStore(ctx, connString, embedder)` — PostgreSQL with HNSW vector search ## Store Interface Implement for any vector database backend: ```go type Store interface { Store(ctx context.Context, id string, fact string, metadata map[string]any) error Search(ctx context.Context, id string, query string, limit int) ([]Entry, error) GetAll(ctx context.Context, id string, limit int) ([]Entry, error) Delete(ctx context.Context, memoryID string) error Update(ctx context.Context, memoryID string, fact string, metadata map[string]any) error } ``` ## How It Works When `AutoExtract` is enabled: 1. After the agent responds, it reviews the conversation 2. An LLM extracts factual information worth remembering 3. If `AutoDedup` is enabled, the LLM checks for existing similar memories 4. 
New facts are stored, duplicates are merged or skipped ## Manual Memory Tools When `AutoExtract` is disabled, the agent gets four memory tools that the LLM can call directly: ### store_memory Store a fact about the user for future conversations. | Parameter | Type | Required | Description | |-----------|------|----------|-------------| | `fact` | string | yes | The fact to remember | | `category` | string | no | One of: `preference`, `personal`, `health`, `professional`, `other` | ### recall_memories Search for relevant memories. Returns memory IDs for use with replace/delete. | Parameter | Type | Required | Description | |-----------|------|----------|-------------| | `query` | string | yes | What to search for | ### replace_memory Update an existing memory with corrected or updated information. | Parameter | Type | Required | Description | |-----------|------|----------|-------------| | `memory_id` | string | yes | ID from `recall_memories` results | | `fact` | string | yes | The updated fact | | `category` | string | no | One of: `preference`, `personal`, `health`, `professional`, `other` | ### delete_memory Remove a memory that is no longer accurate or relevant. | Parameter | Type | Required | Description | |-----------|------|----------|-------------| | `memory_id` | string | yes | ID from `recall_memories` results | | `reason` | string | no | Why the memory is being deleted | --- ## Agent Framework > Streaming > Source: agent/streaming.md # Streaming `ChatStream` returns a channel of events for real-time response handling. 
## Basic Usage ```go for event := range myAgent.ChatStream(ctx, "Tell me a story") { switch event.Type { case types.EventContentDelta: fmt.Print(event.Content) case types.EventThinkingDelta: // Extended thinking content (if supported) case types.EventToolUseStart: fmt.Printf("\nUsing tool: %s\n", event.ToolCall.Name) case types.EventToolUseStop: if event.ToolResult != nil { fmt.Printf("Tool result: %s\n", event.ToolResult.Output) } case types.EventHandoff: fmt.Printf("Handed off to: %s\n", event.AgentName) case types.EventComplete: fmt.Printf("\nDone! Tokens: %d\n", event.Response.Usage.InputTokens) case types.EventError: log.Fatal(event.Error) } } ``` ## ContinueStream The streaming variant of `Continue()`: ```go for event := range myAgent.ContinueStream(ctx, toolResults) { switch event.Type { case types.EventContentDelta: fmt.Print(event.Content) case types.EventComplete: fmt.Println("\nDone!") } } ``` ## Event Types | Event | Field | Description | |-------|-------|-------------| | `EventContentStart` | — | Content generation is beginning | | `EventContentDelta` | `Content` | Partial text token | | `EventContentStop` | — | Content generation finished | | `EventToolUseStart` | `ToolCall` | Tool invocation starting (name, ID) | | `EventToolUseDelta` | `ToolCall` | Partial tool input JSON | | `EventToolUseStop` | `ToolResult` | Tool execution completed with result | | `EventThinkingDelta` | `Thinking` | Chain-of-thought reasoning (if model supports it) | | `EventHandoff` | `AgentName` | Control transferred to another agent | | `EventConfirmationRequired` | `ConfirmationRequest` | Tool awaiting human approval ([details](confirmation.md)) | | `EventComplete` | `Response` | Streaming finished — contains the full `ChatResponse` | | `EventError` | `Error` | An error occurred during streaming | | `EventWarning` | `Error` | A non-fatal warning | ## ChatEvent ```go type ChatEvent struct { Type types.EventType Content string // EventContentDelta Thinking string // 
EventThinkingDelta ToolCall *message.ToolCall // EventToolUseStart/Delta ToolResult *ToolExecutionResult // EventToolUseStop Response *ChatResponse // EventComplete Error error // EventError, EventWarning AgentName string // EventHandoff ConfirmationRequest *tool.ConfirmationRequest // EventConfirmationRequired } ``` --- ## Agent Framework > Hooks > Source: agent/hooks.md # Hooks Hooks let you observe, modify, or block agent behavior at key points in the execution pipeline. They cover tool calls, model interactions, error recovery, agent lifecycle, input validation, and cross-cutting event observation. ## Setup ```go myAgent := agent.New(llmClient, agent.WithHooks(agent.Hooks{ PreToolUse: func(ctx context.Context, tc agent.ToolUseContext) (agent.PreToolUseResult, error) { log.Printf("Tool call: %s (branch: %s)", tc.ToolName, tc.Branch) return agent.PreToolUseResult{Action: agent.HookAllow}, nil }, }), ) ``` ## Hook Types | Hook | Fires | Can | |------|-------|-----| | `PreToolUse` | Before a tool executes | Allow, Deny, or Modify input | | `PostToolUse` | After a tool executes | Allow or Modify output | | `PreModelCall` | Before an LLM request | Allow or Modify messages/tools | | `PostModelCall` | After an LLM response | Allow or Modify response | | `OnSubagentStart` | When a background sub-agent launches | Observe only | | `OnSubagentStop` | When a background sub-agent finishes | Observe only | | `OnToolError` | When a tool returns an error | Allow (re-raise) or Modify (recover) | | `OnModelError` | When an LLM call fails | Allow (re-raise) or Modify (recover) | | `BeforeAgent` | Before an agent starts its run | Allow, Deny, or Modify (short-circuit) | | `AfterAgent` | After an agent completes its run | Allow or Modify response | | `BeforeRun` | At the start of Chat/ChatStream | Observe only | | `AfterRun` | At the end of Chat/ChatStream | Observe only | | `OnUserMessage` | When a user message arrives | Allow, Deny, or Modify message | | `OnEvent` | On every hook 
event emitted | Observe only | ## HookAction Every hook returns a `HookAction` that controls what happens next: | Action | Behavior | |--------|----------| | `HookAllow` | Continue normally (default) | | `HookDeny` | Block execution (PreToolUse, BeforeAgent, OnUserMessage) | | `HookModify` | Replace input, output, messages, response, or recover from errors | ## Denying a Tool Call Return `HookDeny` from `PreToolUse` to block a tool before it runs: ```go agent.Hooks{ PreToolUse: func(_ context.Context, tc agent.ToolUseContext) (agent.PreToolUseResult, error) { if tc.ToolName == "dangerous_tool" { return agent.PreToolUseResult{ Action: agent.HookDeny, DenyReason: "this tool is not allowed", }, nil } return agent.PreToolUseResult{Action: agent.HookAllow}, nil }, } ``` The agent receives a tool error result with the deny reason. ## Modifying Tool Input Return `HookModify` from `PreToolUse` to rewrite the input before execution: ```go agent.Hooks{ PreToolUse: func(_ context.Context, tc agent.ToolUseContext) (agent.PreToolUseResult, error) { modified := strings.ReplaceAll(tc.Input, "SECRET", "[REDACTED]") return agent.PreToolUseResult{ Action: agent.HookModify, Input: modified, }, nil }, } ``` ## Modifying Model Messages Return `HookModify` from `PreModelCall` to inject or filter messages before they reach the LLM: ```go agent.Hooks{ PreModelCall: func(_ context.Context, mc agent.ModelCallContext) (agent.ModelCallResult, error) { extra := message.NewUserMessage("Remember: always respond in JSON.") return agent.ModelCallResult{ Action: agent.HookModify, Messages: append(mc.Messages, extra), Tools: mc.Tools, }, nil }, } ``` ## Error Recovery ### Tool Error Recovery `OnToolError` fires when a tool returns an error, before the error reaches `PostToolUse`. 
Return `HookModify` with replacement output to recover: ```go agent.Hooks{ OnToolError: func(_ context.Context, tc agent.ToolErrorContext) (agent.ToolErrorResult, error) { if tc.ToolName == "flaky_api" { return agent.ToolErrorResult{ Action: agent.HookModify, Output: "API temporarily unavailable, using cached data", }, nil } return agent.ToolErrorResult{Action: agent.HookAllow}, nil }, } ``` When recovery succeeds, the error flag is cleared and `PostToolUse` sees a non-error result. Multiple error callbacks chain — the first recovery wins. ### Model Error Recovery `OnModelError` fires when an LLM call fails. Return `HookModify` with a replacement response to recover: ```go agent.Hooks{ OnModelError: func(_ context.Context, mc agent.ModelErrorContext) (agent.ModelErrorResult, error) { return agent.ModelErrorResult{ Action: agent.HookModify, Response: &llm.Response{ Content: "Service temporarily unavailable. Please try again.", }, }, nil }, } ``` This works in both `Chat()` and `ChatStream()` paths. ## Agent Lifecycle ### Short-Circuiting with BeforeAgent `BeforeAgent` fires before an agent starts its run. Return `HookModify` with a response to skip the agent entirely: ```go agent.Hooks{ BeforeAgent: func(_ context.Context, ac agent.LifecycleContext) (agent.LifecycleResult, error) { if cached, ok := cache.Get(ac.Input); ok { return agent.LifecycleResult{ Action: agent.HookModify, Response: &agent.ChatResponse{Content: cached}, }, nil } return agent.LifecycleResult{Action: agent.HookAllow}, nil }, } ``` Return `HookDeny` to block the agent run with a nil response. ### Modifying with AfterAgent `AfterAgent` fires after an agent completes. 
Modify the response before it reaches the caller: ```go agent.Hooks{ AfterAgent: func(_ context.Context, ac agent.LifecycleContext) (agent.LifecycleResult, error) { modified := *ac.Response modified.Content = sanitize(modified.Content) return agent.LifecycleResult{ Action: agent.HookModify, Response: &modified, }, nil }, } ``` ## Run Lifecycle `BeforeRun` and `AfterRun` are observation-only hooks that fire at the very start and end of `Chat()`/`ChatStream()`: ```go agent.Hooks{ BeforeRun: func(_ context.Context, rc agent.RunContext) { metrics.StartTimer(rc.AgentName) }, AfterRun: func(_ context.Context, rc agent.RunContext) { metrics.RecordDuration(rc.AgentName, rc.Duration) if rc.Error != nil { metrics.RecordError(rc.AgentName, rc.Error) } }, } ``` `AfterRun` receives the final response, any error, and the total duration. ## Input Validation `OnUserMessage` fires when a user message arrives, before it reaches any agent logic. Use it to preprocess, validate, or reject messages: ```go agent.Hooks{ OnUserMessage: func(_ context.Context, uc agent.UserMessageContext) (agent.UserMessageResult, error) { if containsPII(uc.Message) { return agent.UserMessageResult{ Action: agent.HookDeny, DenyReason: "message contains PII", }, nil } return agent.UserMessageResult{ Action: agent.HookModify, Message: sanitizeInput(uc.Message), }, nil }, } ``` `OnUserMessage` does not fire for `Continue()`/`ContinueStream()` since those resume with tool results, not user messages. ## Cross-Cutting Event Observation `OnEvent` fires on every hook event emitted during execution. Use it for logging, analytics, or event transformation: ```go agent.Hooks{ OnEvent: func(_ context.Context, evt agent.HookEvent) { log.Printf("[%s] agent=%s tool=%s", evt.Type, evt.AgentName, evt.ToolName) }, } ``` `OnEvent` fires once per hook-point invocation (after all hooks in the chain have run), not once per registered hook. It covers all event types except itself. 
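Because `OnEvent` sees every hook-point invocation exactly once, it is a convenient place for lightweight aggregation. A library-free sketch of a thread-safe counter you might call from the callback (the `eventCounter` type is illustrative, not part of the library):

```go
package main

import (
	"fmt"
	"sync"
)

// eventCounter tallies hook events by type; the mutex makes it safe
// for callbacks fired from parallel tool execution.
type eventCounter struct {
	mu     sync.Mutex
	counts map[string]int
}

func newEventCounter() *eventCounter {
	return &eventCounter{counts: make(map[string]int)}
}

// observe records one occurrence of the given event type.
func (c *eventCounter) observe(eventType string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[eventType]++
}

// count returns how many times eventType has fired so far.
func (c *eventCounter) count(eventType string) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[eventType]
}

func main() {
	c := newEventCounter()
	// Inside agent.Hooks{OnEvent: ...} you would call
	// c.observe(string(evt.Type)) for each event.
	c.observe("pre_tool_use")
	c.observe("pre_tool_use")
	c.observe("post_model_call")
	fmt.Println(c.count("pre_tool_use"), c.count("post_model_call")) // 2 1
}
```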
## Chaining Multiple Hooks Pass multiple `Hooks` to `WithHooks`, or call `WithHooks` multiple times. Hooks run in registration order. ```go myAgent := agent.New(llmClient, agent.WithHooks(loggingHooks, guardRailHooks, metricsHooks), ) ``` Chain rules: - **Deny wins immediately** — if any hook returns `HookDeny`, later hooks are skipped - **Last Modify wins** — if multiple hooks return `HookModify`, the last one's value is used - **First recovery wins** — for error callbacks (`OnToolError`, `OnModelError`), the first `HookModify` response is used - **nil fields are skipped** — you only need to set the hooks you care about ## Observation with NewObservingHooks For pure observation (logging, metrics, streaming to a UI), use the `NewObservingHooks` helper. It wires all hooks to emit structured `HookEvent` values to a single callback: ```go myAgent := agent.New(llmClient, agent.WithHooks(agent.NewObservingHooks(func(evt agent.HookEvent) { log.Printf("[%s] agent=%s branch=%s tool=%s", evt.Type, evt.AgentName, evt.Branch, evt.ToolName) })), ) ``` All observing hooks return `HookAllow` — they never block or modify execution. `OnEvent` is left nil in observing hooks to avoid double-emission. ### HookEvent | Field | Type | Description | |-------|------|-------------| | `Type` | `HookEventType` | Event type (see below) | | `Timestamp` | `time.Time` | When the event fired | | `AgentName` | `string` | Name of the agent | | `TaskID` | `string` | Background task ID (if applicable) | | `Branch` | `string` | Agent hierarchy path (e.g. 
`"orchestrator/researcher"`) | | `ToolCallID` | `string` | Tool call ID (tool events only) | | `ToolName` | `string` | Tool name (tool events only) | | `Input` | `string` | Tool input, sub-agent task, or user message | | `Output` | `string` | Tool output or sub-agent result | | `IsError` | `bool` | Whether an error occurred | | `Duration` | `time.Duration` | Execution duration (post-events only) | | `Usage` | `llm.TokenUsage` | Token usage (post model call only) | | `Error` | `string` | Error message (if `IsError` is true) | ### Event Types | Constant | Value | When | |----------|-------|------| | `HookEventPreToolUse` | `"pre_tool_use"` | Before tool execution | | `HookEventPostToolUse` | `"post_tool_use"` | After tool execution | | `HookEventPreModelCall` | `"pre_model_call"` | Before LLM request | | `HookEventPostModelCall` | `"post_model_call"` | After LLM response | | `HookEventSubagentStart` | `"subagent_start"` | Background sub-agent launched | | `HookEventSubagentStop` | `"subagent_stop"` | Background sub-agent finished | | `HookEventToolError` | `"tool_error"` | Tool returned an error | | `HookEventModelError` | `"model_error"` | LLM call failed | | `HookEventBeforeAgent` | `"before_agent"` | Before agent starts | | `HookEventAfterAgent` | `"after_agent"` | After agent completes | | `HookEventBeforeRun` | `"before_run"` | Start of Chat/ChatStream | | `HookEventAfterRun` | `"after_run"` | End of Chat/ChatStream | | `HookEventUserMessage` | `"user_message"` | User message received | ## Branch The `Branch` field on all hook contexts gives you the agent hierarchy as a `/`-separated path. For a nested setup where an orchestrator delegates to a researcher which delegates to a scraper: ``` Branch: "orchestrator/researcher/scraper" ``` This lets you immediately see which agent in the hierarchy produced an event, without cross-referencing task IDs. 
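Since `Branch` is a plain `/`-separated path, scoping an observer to one subtree is simple string work. A library-free helper sketch (the function name is illustrative):

```go
package main

import (
	"fmt"
	"strings"
)

// underBranch reports whether branch equals root or lies beneath it in
// the agent hierarchy. The "/" is appended before the prefix check so
// that "orchestrator/reserved" does not match root "orchestrator/res".
func underBranch(branch, root string) bool {
	return branch == root || strings.HasPrefix(branch, root+"/")
}

func main() {
	fmt.Println(underBranch("orchestrator/researcher/scraper", "orchestrator/researcher")) // true
	fmt.Println(underBranch("orchestrator/reserved", "orchestrator/res"))                  // false
}
```

Inside a `NewObservingHooks` callback you could then drop any event where `!underBranch(evt.Branch, "orchestrator/researcher")` to watch a single subtree.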
## Hook Propagation Hooks set on a parent agent automatically propagate to sub-agents that don't have their own hooks: ```go orchestrator := agent.New(llmClient, agent.WithHooks(myHooks), agent.WithSubAgents( agent.SubAgentConfig{Name: "worker", Agent: worker}, ), ) // worker inherits myHooks since it has none of its own ``` If a sub-agent already has hooks configured, the parent's hooks are not applied. ## Context Structs ### ToolUseContext Passed to `PreToolUse` and embedded in `PostToolUseContext` and `ToolErrorContext`: ```go type ToolUseContext struct { ToolCallID string ToolName string Input string AgentName string TaskID string Branch string } ``` ### PostToolUseContext Passed to `PostToolUse`: ```go type PostToolUseContext struct { ToolUseContext // Embeds all fields from ToolUseContext Output string IsError bool Duration time.Duration } ``` ### ToolErrorContext Passed to `OnToolError`: ```go type ToolErrorContext struct { ToolUseContext // Embeds all fields from ToolUseContext Error error Output string Duration time.Duration } ``` ### ModelCallContext Passed to `PreModelCall`: ```go type ModelCallContext struct { Messages []message.Message Tools []tool.BaseTool AgentName string TaskID string Branch string } ``` ### ModelResponseContext Passed to `PostModelCall`: ```go type ModelResponseContext struct { Response *llm.Response Duration time.Duration AgentName string TaskID string Branch string Error error } ``` ### ModelErrorContext Passed to `OnModelError`: ```go type ModelErrorContext struct { Messages []message.Message Tools []tool.BaseTool Error error AgentName string TaskID string Branch string } ``` ### SubagentEventContext Passed to `OnSubagentStart` and `OnSubagentStop`: ```go type SubagentEventContext struct { TaskID string AgentName string Task string Branch string Result string Error error Duration time.Duration } ``` ### LifecycleContext Passed to `BeforeAgent` and `AfterAgent`: ```go type LifecycleContext struct { AgentName string TaskID string 
Branch string Input string Response *ChatResponse // nil for BeforeAgent, set for AfterAgent } ``` ### RunContext Passed to `BeforeRun` and `AfterRun`: ```go type RunContext struct { AgentName string TaskID string Branch string Input string Response *ChatResponse // nil for BeforeRun, set for AfterRun Error error // nil for BeforeRun, set for AfterRun if failed Duration time.Duration // zero for BeforeRun } ``` ### UserMessageContext Passed to `OnUserMessage`: ```go type UserMessageContext struct { Message string AgentName string TaskID string Branch string } ``` ## Streaming to a UI A common use case is forwarding hook events to a frontend over WebSocket or SSE: ```go agent.NewObservingHooks(func(evt agent.HookEvent) { data, _ := json.Marshal(evt) websocket.Send(data) }) ``` This gives the UI real-time visibility into tool calls, model interactions, error recovery, and agent lifecycle — including nested agent hierarchies via `Branch`. --- ## Agent Framework > Tool Confirmation > Source: agent/confirmation.md # Tool Confirmation The confirmation protocol lets tools require human approval before executing. The framework provides the mechanism — consumers provide the UI/interaction layer. ## Setup Register a `ConfirmationProvider` on the agent. The provider is called whenever a tool requires confirmation and blocks until the consumer provides a decision. ```go myAgent := agent.New(llmClient, agent.WithTools(&DeleteTool{}), agent.WithConfirmationProvider( func(ctx context.Context, req tool.ConfirmationRequest) (bool, error) { // Present req to the user, wait for their decision return askUser(req.ToolName, req.Input, req.Hint), nil }, ), ) ``` Return `true` to approve, `false` to reject. If the provider returns an error, the tool call fails with that error. 
## Declarative Confirmation Set `RequireConfirmation` on a tool's `Info` to require approval before `Run()` is called: ```go func (t *DeleteTool) Info() tool.Info { info := tool.NewInfo("delete_records", "Delete database records", DeleteParams{}) info.RequireConfirmation = true return info } ``` When the agent encounters this tool, it calls the `ConfirmationProvider` before executing. If no provider is configured, the tool runs normally — confirmation is opt-in. ## Dynamic Confirmation Tools can request confirmation from within `Run()` for conditional approval: ```go func (t *TransferTool) Run(ctx context.Context, params tool.Call) (tool.Response, error) { var input TransferParams json.Unmarshal([]byte(params.Input), &input) if input.Amount > 10000 { err := tool.RequestConfirmation(ctx, "Large transfer exceeding $10,000", input) if err != nil { return tool.Response{}, err } } // Proceed with transfer return tool.NewTextResponse("Transfer complete"), nil } ``` `RequestConfirmation` blocks until the consumer decides. If rejected, it returns `tool.ErrConfirmationRejected` — propagate this error to halt execution. If no `ConfirmationProvider` is configured, `RequestConfirmation` is a no-op (auto-approve). ## ConfirmationRequest The provider receives a `ConfirmationRequest` with context about the tool call: ```go type ConfirmationRequest struct { ToolCallID string // Unique ID of this tool call ToolName string // Name of the tool Input string // JSON-encoded arguments Hint string // Human-readable description (dynamic confirmation only) Payload any // Arbitrary structured data (dynamic confirmation only) } ``` For declarative confirmation (`RequireConfirmation` flag), `Hint` and `Payload` are empty. For dynamic confirmation (`RequestConfirmation`), they carry the values passed by the tool. 
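A provider can therefore use the empty `Hint` to tell the two flavors apart when rendering a prompt. A library-free formatting sketch (the helper takes the request fields as plain strings for illustration):

```go
package main

import "fmt"

// describeRequest builds a one-line prompt for a pending tool call.
// Declarative confirmations carry no hint; dynamic ones explain why
// the tool paused, so surface that reason when it is present.
func describeRequest(toolName, input, hint string) string {
	if hint == "" {
		return fmt.Sprintf("%s requires approval (input: %s)", toolName, input)
	}
	return fmt.Sprintf("%s paused: %s (input: %s)", toolName, hint, input)
}

func main() {
	fmt.Println(describeRequest("delete_records", `{"ids":[1,2]}`, ""))
	fmt.Println(describeRequest("transfer_funds", `{"amount":25000}`, "Large transfer exceeding $10,000"))
}
```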
## Toolset-Level Confirmation Use `tool.WithConfirmation` to mark all tools in a toolset as requiring confirmation: ```go dangerousTools := tool.NewToolset("dangerous", &DeleteTool{}, &DropTableTool{}, &FormatDiskTool{}, ) confirmed := tool.WithConfirmation(dangerousTools) myAgent := agent.New(llmClient, agent.WithToolsets(confirmed), agent.WithConfirmationProvider(myProvider), ) ``` This sets `RequireConfirmation = true` on every tool in the toolset without modifying the originals. ## Streaming In the streaming path (`ChatStream`), an `EventConfirmationRequired` event is emitted before the provider blocks. This allows the consumer to present a UI and then unblock the provider: ```go for event := range myAgent.ChatStream(ctx, "Delete old records") { switch event.Type { case types.EventConfirmationRequired: req := event.ConfirmationRequest fmt.Printf("Tool %q wants to run with input: %s\n", req.ToolName, req.Input) // The provider is blocking — respond via whatever mechanism it uses case types.EventContentDelta: fmt.Print(event.Content) case types.EventComplete: fmt.Println("\nDone!") } } ``` A common pattern is to use a channel-based provider that the streaming consumer unblocks: ```go type approval struct { approved bool ch chan struct{} } pending := make(map[string]*approval) var mu sync.Mutex provider := func(ctx context.Context, req tool.ConfirmationRequest) (bool, error) { a := &approval{ch: make(chan struct{})} mu.Lock() pending[req.ToolCallID] = a mu.Unlock() <-a.ch // Block until consumer decides return a.approved, nil } // In the stream consumer, when EventConfirmationRequired arrives: // mu.Lock() // a := pending[req.ToolCallID] // mu.Unlock() // a.approved = userClickedApprove // close(a.ch) ``` ## Interaction with Hooks `PreToolUse` hooks run before confirmation. 
If a hook denies the tool, the confirmation provider is never called: ``` PreToolUse hooks → Confirmation check → tool.Run() ``` This means hooks enforce policy (rate limits, blocklists), while confirmation handles human approval. ## Handoffs Each agent has its own `ConfirmationProvider`. When a handoff occurs, the new agent's provider is used. If the target agent has no provider, its tools run without confirmation. ## Auto-Approve Patterns The provider is a regular function — implement any approval logic: ```go // Always approve (useful for testing) agent.WithConfirmationProvider( func(_ context.Context, _ tool.ConfirmationRequest) (bool, error) { return true, nil }, ) // Check a database of pre-approved tools agent.WithConfirmationProvider( func(ctx context.Context, req tool.ConfirmationRequest) (bool, error) { return db.IsToolPreApproved(ctx, userID, req.ToolName) }, ) // Approve safe tools, prompt for dangerous ones agent.WithConfirmationProvider( func(ctx context.Context, req tool.ConfirmationRequest) (bool, error) { if req.ToolName == "read_file" { return true, nil } return promptUser(ctx, req) }, ) ``` --- ## Agent Framework > Sub-Agents > Source: agent/sub-agents.md # Sub-Agents Sub-agents let an orchestrator delegate tasks to specialized child agents. Each sub-agent becomes a callable tool. ## Setup ```go researcher := agent.New(llmClient, agent.WithSystemPrompt("You are a research specialist."), agent.WithTools(&webSearchTool{}), ) writer := agent.New(llmClient, agent.WithSystemPrompt("You are a content writer."), ) orchestrator := agent.New(llmClient, agent.WithSystemPrompt("You coordinate research and writing tasks."), agent.WithSubAgents( agent.SubAgentConfig{Name: "researcher", Description: "Researches topics", Agent: researcher}, agent.SubAgentConfig{Name: "writer", Description: "Writes content", Agent: writer}, ), ) response, _ := orchestrator.Chat(ctx, "Research and write about quantum computing") ``` ## How It Works 1. 
Each `SubAgentConfig` registers a tool named after the sub-agent 2. The orchestrator LLM decides when to delegate a task 3. The sub-agent runs to completion and returns its response 4. The orchestrator continues with the sub-agent's output ## SubAgentConfig ```go type SubAgentConfig struct { Name string // Tool name the orchestrator calls Description string // Describes when to use this sub-agent Agent *Agent // The sub-agent instance } ``` ## Background Execution Sub-agents can run asynchronously by passing `background: true`. The orchestrator gets a `task_id` immediately and can check status or wait for results later. ```go orchestrator := agent.New(llmClient, agent.WithSystemPrompt(`Launch background tasks, then collect results.`), agent.WithSubAgents( agent.SubAgentConfig{ Name: "researcher", Description: "Research a topic. Supports background: true for async execution.", Agent: researcher, }, ), ) ``` When the LLM calls the sub-agent with `background: true`: 1. The task launches in a goroutine and returns `{"task_id": "task-1", "status": "launched"}` 2. Three task management tools are automatically available: `get_task_result`, `stop_task`, `list_tasks` 3. The orchestrator uses `get_task_result` with `wait: true` to collect results See [Background Agents](background-agents.md) for the full tool reference and examples. --- ## Agent Framework > Background Agents > Source: agent/background-agents.md # Background Agents Background agents let the orchestrator launch sub-agents asynchronously. Tasks run in goroutines and the orchestrator can continue working, check status, or wait for results. ## Setup ```go researcher := agent.New(llmClient, agent.WithSystemPrompt("You are a concise research assistant."), ) orchestrator := agent.New(llmClient, agent.WithSystemPrompt(`You coordinate research tasks. 1. Launch background tasks with background: true 2. Collect results with get_task_result (wait: true) 3. 
Synthesize the results`), agent.WithSubAgents( agent.SubAgentConfig{ Name: "researcher", Description: "Research a topic. Supports background: true for async execution.", Agent: researcher, }, ), ) ``` ## How It Works 1. The orchestrator calls a sub-agent tool with `background: true` 2. The sub-agent launches in a goroutine and returns a `task_id` immediately 3. Three task management tools are auto-registered for the orchestrator: | Tool | Description | |------|-------------| | `get_task_result` | Check status or wait for a background task to complete | | `stop_task` | Cancel a running background task | | `list_tasks` | List all background tasks and their status | ## Task Lifecycle Tasks move through these states: | Status | Description | |--------|-------------| | `running` | Task is currently executing | | `completed` | Task finished successfully | | `failed` | Task encountered an error | | `cancelled` | Task was explicitly cancelled | ## Tool Reference ### get_task_result | Parameter | Type | Required | Description | |-----------|------|----------|-------------| | `task_id` | string | yes | The task ID returned when the task was launched | | `wait` | bool | no | If true, block until the task completes | | `timeout` | int | no | Max wait time in milliseconds. 0 means no timeout | ### stop_task | Parameter | Type | Required | Description | |-----------|------|----------|-------------| | `task_id` | string | yes | The task ID to cancel | ### list_tasks No parameters. Returns all tasks with their ID, agent name, and status. ## Streaming Example ```go for event := range orchestrator.ChatStream(ctx, "Compare Go and Rust. 
Research each in the background.") { switch event.Type { case types.EventContentDelta: fmt.Print(event.Content) case types.EventError: log.Fatal(event.Error) } if event.ToolResult != nil { fmt.Printf("\n[Tool: %s → %s]\n", event.ToolResult.ToolName, event.ToolResult.Output) } } ``` ## Sub-Agent Input When a sub-agent is called, it accepts these parameters: | Parameter | Type | Required | Description | |-----------|------|----------|-------------| | `task` | string | yes | The task or question to send to the sub-agent | | `background` | bool | no | If true, run in background and return a task ID | | `max_turns` | int | no | Maximum tool-execution turns. 0 uses the agent default | --- ## Agent Framework > Handoffs > Source: agent/handoffs.md # Handoffs Handoffs transfer full control from one agent to another. Unlike sub-agents (which return results to the orchestrator), handoffs permanently switch the active agent. ## Setup ```go billing := agent.New(llmClient, agent.WithSystemPrompt("You handle billing inquiries."), ) support := agent.New(llmClient, agent.WithSystemPrompt("You handle technical support."), ) triage := agent.New(llmClient, agent.WithSystemPrompt("Route the user to the right specialist."), agent.WithHandoffs( agent.HandoffConfig{Name: "billing", Description: "Billing questions", Agent: billing}, agent.HandoffConfig{Name: "support", Description: "Technical issues", Agent: support}, ), ) response, _ := triage.Chat(ctx, "I was charged twice on my last invoice") fmt.Println(response.AgentName) // "billing" ``` ## How It Works 1. Each `HandoffConfig` auto-generates a `transfer_to_` tool 2. When the triage agent calls `transfer_to_billing`, control transfers permanently 3. The billing agent's system prompt replaces the triage agent's 4. The conversation history carries over 5. 
`ChatResponse.AgentName` indicates which agent produced the final response ## HandoffConfig ```go type HandoffConfig struct { Name string // Used to generate transfer_to_ tool Description string // Tells the LLM when to transfer Agent *Agent // The target agent } ``` ## Handoffs vs Sub-Agents | | Sub-Agents | Handoffs | |---|---|---| | Control flow | Returns to orchestrator | Permanent transfer | | System prompt | Sub-agent uses its own | Replaces current | | Use case | Task delegation | Routing/triage | --- ## Agent Framework > Fan-Out > Source: agent/fan-out.md # Fan-Out Fan-out distributes multiple tasks to worker agents in parallel and collects results. ## Setup ```go researcher := agent.New(llmClient, agent.WithSystemPrompt("Research the given topic thoroughly."), ) coordinator := agent.New(llmClient, agent.WithSystemPrompt("You coordinate parallel research tasks."), agent.WithFanOut(agent.FanOutConfig{ Name: "research", Description: "Research multiple topics in parallel", Agent: researcher, MaxConcurrency: 3, }), ) response, _ := coordinator.Chat(ctx, "Compare AI, blockchain, and quantum computing") ``` ## How It Works 1. The `FanOutConfig` registers a tool that accepts multiple tasks 2. When the coordinator calls the fan-out tool, all tasks run concurrently 3. `MaxConcurrency` limits how many worker agents run at the same time 4. Results are collected and returned to the coordinator ## FanOutConfig ```go type FanOutConfig struct { Name string // Tool name Description string // Describes when to use fan-out Agent *Agent // Worker agent (cloned per task) MaxConcurrency int // Max parallel workers (0 = unlimited) } ``` --- ## Agent Framework > Continue/Resume > Source: agent/continue.md # Continue/Resume `Continue()` lets you manually execute tool calls and feed results back into the agent loop. This is useful when tools require human approval, external API calls, or custom execution logic. 
## Setup ```go myAgent := agent.New(llmClient, agent.WithAutoExecute(false), // Don't auto-execute tools agent.WithSession("conv-1", session.MemoryStore()), ) ``` ## Usage ```go // First call returns pending tool calls instead of executing them response, _ := myAgent.Chat(ctx, "Search for flights to Tokyo") // Inspect what tools the LLM wants to call for _, tc := range response.ToolCalls { fmt.Printf("Tool: %s, Input: %s\n", tc.Name, tc.Input) } // Execute tools externally with your own logic results := []message.ToolResult{ { ToolCallID: response.ToolCalls[0].ID, Name: "search_flights", Content: `{"flights": [{"airline": "JAL", "price": 850}]}`, }, } // Resume the agent loop with results response, _ = myAgent.Continue(ctx, results) fmt.Println(response.Content) ``` ## Streaming Variant ```go for event := range myAgent.ContinueStream(ctx, results) { switch event.Type { case types.EventContentDelta: fmt.Print(event.Content) case types.EventComplete: fmt.Println("\nDone!") } } ``` > **Note:** > `Continue()` requires a session to be configured, since it needs to restore conversation state from the previous `Chat()` call. > --- ## Agent Framework > Context Strategies > Source: agent/context-strategies.md # Context Strategies Context strategies automatically manage the context window when conversations grow beyond token limits. 
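Conceptually, every one of these strategies reduces a message slice before it is sent to the model. A stdlib-only sketch of the simplest case — what a "keep the last N messages" sliding window does — using a hypothetical `Message` stand-in rather than the library's real types:

```go
package main

import "fmt"

// Message is an illustrative stand-in for the library's message type.
type Message struct {
	Role, Content string
}

// keepLast mimics what a sliding-window strategy does conceptually:
// drop everything except the most recent n messages.
func keepLast(msgs []Message, n int) []Message {
	if len(msgs) <= n {
		return msgs
	}
	return msgs[len(msgs)-n:]
}

func main() {
	history := []Message{
		{"user", "msg 1"}, {"assistant", "msg 2"},
		{"user", "msg 3"}, {"assistant", "msg 4"},
	}
	trimmed := keepLast(history, 2)
	fmt.Println(len(trimmed), trimmed[0].Content) // 2 msg 3
}
```

The real strategies also count tokens and can emit session updates (e.g. a summary message); this only shows the trimming idea.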
## Available Strategies ### Sliding Window Keep only the last N messages: ```go import "github.com/joakimcarlsson/ai/tokens/sliding" myAgent := agent.New(llmClient, agent.WithSystemPrompt("You are a helpful assistant."), agent.WithSession("conv-1", store), agent.WithContextStrategy(sliding.Strategy(sliding.KeepLast(10)), 0), ) ``` ### Truncate Remove oldest messages to fit the token budget: ```go import "github.com/joakimcarlsson/ai/tokens/truncate" myAgent := agent.New(llmClient, agent.WithContextStrategy(truncate.Strategy(), 0), ) ``` ### Summarize Use an LLM to compress older messages into a summary: ```go import "github.com/joakimcarlsson/ai/tokens/summarize" myAgent := agent.New(llmClient, agent.WithContextStrategy(summarize.Strategy(llmClient), 0), ) ``` ## How It Works Before each LLM call, the agent: 1. Counts tokens for all messages + system prompt + tools 2. If total exceeds the limit, applies the strategy 3. The strategy reduces messages while preserving recent context 4. The session is updated if the strategy produces a session update (e.g., summary message) ## Custom Max Tokens The second argument to `WithContextStrategy` sets a custom max token limit. Pass `0` to auto-calculate from the model's context window minus a 4096-token reserve. ```go // Custom limit: 50k tokens agent.WithContextStrategy(sliding.Strategy(sliding.KeepLast(20)), 50000) ``` ## Custom Strategy Implement the `tokens.Strategy` interface: ```go type Strategy interface { Fit(ctx context.Context, input StrategyInput) (*StrategyResult, error) } ``` --- ## Agent Framework > Toolsets > Source: agent/toolsets.md # Toolsets Toolsets group multiple tools under a name with optional dynamic filtering. Unlike static tool lists, toolsets are resolved **per-call** — the predicate runs on every `Chat()` turn, so you can enable or disable tools based on runtime context. 
## Creating a Toolset A basic toolset is a named collection of tools: ```go recon := tool.NewToolset("recon", &NmapTool{}, &DnsLookupTool{}, &WhoisTool{}, ) a := agent.New(llmClient, agent.WithToolsets(recon), ) ``` You can mix toolsets with individual tools: ```go a := agent.New(llmClient, agent.WithTools(&AlwaysAvailableTool{}), agent.WithToolsets(recon, exploitation), ) ``` ## Filtered Toolsets `NewFilterToolset` wraps a toolset with a predicate that controls which tools are available. The predicate receives the `context.Context` and each tool, and returns whether that tool should be included. ```go type phaseKey struct{} allTools := tool.NewToolset("pentest", &NmapTool{}, &SqlInjectionTool{}, &BruteForcePasswordTool{}, ) filtered := tool.NewFilterToolset("phase-aware", allTools, func(ctx context.Context, t tool.BaseTool) bool { phase, _ := ctx.Value(phaseKey{}).(string) switch t.Info().Name { case "sql_injection", "brute_force_password": return phase == "exploitation" default: return true } }, ) a := agent.New(llmClient, agent.WithToolsets(filtered), ) // During recon phase, only NmapTool is available ctx := context.WithValue(ctx, phaseKey{}, "recon") resp, _ := a.Chat(ctx, "Start scanning the target") // During exploitation phase, all tools are available ctx = context.WithValue(ctx, phaseKey{}, "exploitation") resp, _ = a.Chat(ctx, "Try exploiting the SQL injection") ``` ### Filtering by Configuration Predicates can also read from engagement configuration or any other source: ```go type EngagementConfig struct { AllowBruteForce bool AllowExploits bool } configKey := struct{}{} filtered := tool.NewFilterToolset("engagement", allTools, func(ctx context.Context, t tool.BaseTool) bool { cfg, _ := ctx.Value(configKey).(*EngagementConfig) if cfg == nil { return false } switch t.Info().Name { case "brute_force": return cfg.AllowBruteForce case "sql_injection", "xss_scanner": return cfg.AllowExploits default: return true } }, ) ``` ## Composing Toolsets Toolsets 
compose — use `NewCompositeToolset` to merge multiple toolsets into one: ```go recon := tool.NewToolset("recon", &NmapTool{}, &DnsLookupTool{}) exploit := tool.NewToolset("exploit", &SqlInjectionTool{}) reporting := tool.NewToolset("reporting", &ReportTool{}) all := tool.NewCompositeToolset("full-suite", recon, exploit, reporting) ``` Composite toolsets work with filtered toolsets too — you can filter individual groups and then compose them: ```go filteredExploit := tool.NewFilterToolset("filtered-exploit", exploit, exploitPredicate) combined := tool.NewCompositeToolset("suite", recon, filteredExploit, reporting) ``` ## MCP Toolsets Wrap MCP server tools as a toolset: ```go mcpTools := tool.MCPToolset("external", map[string]tool.MCPServer{ "filesystem": { Command: "npx", Args: []string{"-y", "@modelcontextprotocol/server-filesystem", "/tmp"}, Type: tool.MCPStdio, }, }) a := agent.New(llmClient, agent.WithToolsets(mcpTools), ) ``` ## Confirmation Wrapper `tool.WithConfirmation` wraps a toolset so every tool in it requires human approval before execution. Pair it with `WithConfirmationProvider` on the agent: ```go dangerous := tool.NewToolset("exploits", &SqlInjectionTool{}, &BruteForcePasswordTool{}, ) a := agent.New(llmClient, agent.WithToolsets(tool.WithConfirmation(dangerous)), agent.WithConfirmationProvider(myApprovalHandler), ) ``` The original toolset is not modified. See [Tool Confirmation](confirmation.md) for the full protocol. 
## Toolsets and Hooks Since toolsets resolve to `[]tool.BaseTool`, [hooks](hooks.md) apply to individual tools regardless of how they were grouped: ```go a := agent.New(llmClient, agent.WithToolsets(exploitToolset), agent.WithHooks(agent.Hooks{ PreToolUse: func(ctx context.Context, tc agent.ToolUseContext) (agent.PreToolUseResult, error) { if tc.ToolName == "sql_injection" { return agent.PreToolUseResult{ Action: agent.HookDeny, DenyReason: "SQL injection blocked by policy", }, nil } return agent.PreToolUseResult{Action: agent.HookAllow}, nil }, }), ) ``` ## Custom Toolset Implementations The `Toolset` interface is simple — implement it for custom resolution logic: ```go type Toolset interface { Name() string Tools(ctx context.Context) []tool.BaseTool } ``` For example, a toolset that loads tools from a database: ```go type DBToolset struct { db *sql.DB } func (d *DBToolset) Name() string { return "db-tools" } func (d *DBToolset) Tools(ctx context.Context) []tool.BaseTool { // Query available tools from database based on user permissions rows, _ := d.db.QueryContext(ctx, "SELECT name, config FROM tools WHERE enabled = true") // ... build and return tools } ``` --- ## Agent Framework > Instruction Templates > Source: agent/instruction-templates.md # Instruction Templates Dynamic system prompts using template variables or runtime-generated instructions. ## Static Templates Use Go template syntax (`{{.var}}`) with `WithState`: ```go myAgent := agent.New(llmClient, agent.WithSystemPrompt("You are {{.role}}. Help {{.user_name}} with their tasks."), agent.WithState(map[string]any{ "role": "a coding assistant", "user_name": "Alice", }), ) ``` ## Conditional Templates ```go myAgent := agent.New(llmClient, agent.WithSystemPrompt(`You are a helpful assistant. 
{{if .extra_context}} Additional context: {{.extra_context}} {{end}}`), agent.WithState(map[string]any{ "extra_context": "The user prefers concise answers.", }), ) ``` ## Dynamic Provider For fully dynamic prompts generated at runtime: ```go myAgent := agent.New(llmClient, agent.WithInstructionProvider(func(ctx context.Context, state map[string]any) (string, error) { return fmt.Sprintf( "Current time: %s\nYou are a helpful assistant.", time.Now().Format(time.RFC3339), ), nil }), ) ``` The instruction provider receives the state map and can use it alongside any other runtime data (database lookups, feature flags, etc.). --- ## Integrations > PostgreSQL > Source: integrations/postgres.md # PostgreSQL PostgreSQL-backed session store for persistent conversation history. No extensions required. ## Installation ```bash go get github.com/joakimcarlsson/ai/integrations/postgres ``` ## Setup ```go import "github.com/joakimcarlsson/ai/integrations/postgres" sessionStore, err := postgres.SessionStore(ctx, "postgres://user:pass@localhost:5432/mydb?sslmode=disable") if err != nil { log.Fatal(err) } myAgent := agent.New(llmClient, agent.WithSession("conv-1", sessionStore), ) ``` Tables and indexes are created automatically on first use. ## Schema ```sql CREATE TABLE sessions ( id TEXT PRIMARY KEY, created_at TIMESTAMPTZ DEFAULT NOW() ); CREATE TABLE messages ( id TEXT PRIMARY KEY, session_id TEXT NOT NULL REFERENCES sessions(id) ON DELETE CASCADE, role TEXT NOT NULL, parts JSONB NOT NULL, model TEXT, created_at BIGINT NOT NULL ); CREATE INDEX messages_session_idx ON messages(session_id, created_at); ``` ## Options | Option | Description | |--------|-------------| | `postgres.WithIDGenerator(fn)` | Custom ID generator for message records. 
Default: UUID v4 | ```go store, err := postgres.SessionStore(ctx, connString, postgres.WithIDGenerator(func() string { return myCustomID() }), ) ``` ## Full Example ```go package main import ( "context" "fmt" "log" "os" "github.com/joakimcarlsson/ai/agent" "github.com/joakimcarlsson/ai/agent/memory" "github.com/joakimcarlsson/ai/embeddings" "github.com/joakimcarlsson/ai/integrations/pgvector" "github.com/joakimcarlsson/ai/integrations/postgres" "github.com/joakimcarlsson/ai/model" llm "github.com/joakimcarlsson/ai/providers" ) func main() { ctx := context.Background() connString := "postgres://postgres:password@localhost:5432/example?sslmode=disable" embedder, err := embeddings.NewEmbedding( model.ProviderOpenAI, embeddings.WithAPIKey(os.Getenv("OPENAI_API_KEY")), embeddings.WithModel(model.OpenAIEmbeddingModels[model.TextEmbedding3Small]), ) if err != nil { log.Fatal(err) } llmClient, err := llm.NewLLM( model.ProviderOpenAI, llm.WithAPIKey(os.Getenv("OPENAI_API_KEY")), llm.WithModel(model.OpenAIModels[model.GPT4o]), ) if err != nil { log.Fatal(err) } sessionStore, err := postgres.SessionStore(ctx, connString) if err != nil { log.Fatal(err) } memoryStore, err := pgvector.MemoryStore(ctx, connString, embedder) if err != nil { log.Fatal(err) } myAgent := agent.New(llmClient, agent.WithSystemPrompt("You are a personal assistant with memory."), agent.WithSession("conv-1", sessionStore), agent.WithMemory("alice", memoryStore, memory.AutoExtract(), memory.AutoDedup(), ), ) response, err := myAgent.Chat(ctx, "Hi! My name is Alice and I love Italian food.") if err != nil { log.Fatal(err) } fmt.Println(response.Content) } ``` --- ## Integrations > SQLite > Source: integrations/sqlite.md # SQLite SQLite-backed session store for lightweight persistent conversation history. Bring your own `*sql.DB` connection with any SQLite driver. 
## Installation ```bash go get github.com/joakimcarlsson/ai/integrations/sqlite ``` ## Setup ```go import ( "database/sql" _ "modernc.org/sqlite" // or any SQLite driver "github.com/joakimcarlsson/ai/integrations/sqlite" ) db, err := sql.Open("sqlite", "./chat.db") if err != nil { log.Fatal(err) } sessionStore, err := sqlite.SessionStore(ctx, db) if err != nil { log.Fatal(err) } myAgent := agent.New(llmClient, agent.WithSession("conv-1", sessionStore), ) ``` Tables and indexes are created automatically on first use. ## Schema ```sql CREATE TABLE sessions ( id TEXT PRIMARY KEY, created_at INTEGER NOT NULL ); CREATE TABLE messages ( id INTEGER PRIMARY KEY AUTOINCREMENT, session_id TEXT NOT NULL REFERENCES sessions(id) ON DELETE CASCADE, role TEXT NOT NULL, parts TEXT NOT NULL, model TEXT, created_at INTEGER NOT NULL ); CREATE INDEX idx_messages_session ON messages(session_id, id); ``` ## Options | Option | Description | |--------|-------------| | `sqlite.WithTablePrefix(prefix)` | Prefix for all table names. 
Useful for multi-tenant or multiple stores in one database | ```go store, err := sqlite.SessionStore(ctx, db, sqlite.WithTablePrefix("chat_"), ) // Creates "chat_sessions" and "chat_messages" instead of "sessions" and "messages" ``` ## Full Example ```go package main import ( "context" "database/sql" "fmt" "log" "os" _ "modernc.org/sqlite" "github.com/joakimcarlsson/ai/agent" "github.com/joakimcarlsson/ai/integrations/sqlite" "github.com/joakimcarlsson/ai/model" llm "github.com/joakimcarlsson/ai/providers" ) func main() { ctx := context.Background() db, err := sql.Open("sqlite", "./chat.db") if err != nil { log.Fatal(err) } defer db.Close() llmClient, err := llm.NewLLM( model.ProviderOpenAI, llm.WithAPIKey(os.Getenv("OPENAI_API_KEY")), llm.WithModel(model.OpenAIModels[model.GPT4o]), ) if err != nil { log.Fatal(err) } sessionStore, err := sqlite.SessionStore(ctx, db) if err != nil { log.Fatal(err) } myAgent := agent.New(llmClient, agent.WithSystemPrompt("You are a helpful assistant."), agent.WithSession("conv-1", sessionStore), ) response, err := myAgent.Chat(ctx, "Hello!") if err != nil { log.Fatal(err) } fmt.Println(response.Content) } ``` --- ## Integrations > pgvector > Source: integrations/pgvector.md # pgvector PostgreSQL-backed memory store using [pgvector](https://github.com/pgvector/pgvector) for semantic vector search. Stores facts as embeddings and retrieves them using cosine similarity with HNSW indexing. ## Prerequisites pgvector extension must be available in your PostgreSQL instance. The extension is enabled automatically on first use. 
## Installation ```bash go get github.com/joakimcarlsson/ai/integrations/pgvector ``` ## Setup ```go import ( "github.com/joakimcarlsson/ai/integrations/pgvector" "github.com/joakimcarlsson/ai/agent/memory" ) memoryStore, err := pgvector.MemoryStore(ctx, "postgres://user:pass@localhost:5432/mydb?sslmode=disable", embedder) if err != nil { log.Fatal(err) } myAgent := agent.New(llmClient, agent.WithMemory("user-123", memoryStore, memory.AutoExtract(), memory.AutoDedup(), ), ) ``` The table, pgvector extension, and HNSW index are created automatically on first use. The vector dimension is auto-detected from the embedder's model configuration. ## Schema ```sql CREATE EXTENSION IF NOT EXISTS vector; CREATE TABLE memories ( id TEXT PRIMARY KEY, owner_id TEXT NOT NULL, content TEXT NOT NULL, vector vector(1536), -- dimension from embedder metadata JSONB, created_at TIMESTAMPTZ DEFAULT NOW() ); CREATE INDEX memories_owner_idx ON memories(owner_id); CREATE INDEX memories_vector_idx ON memories USING hnsw (vector vector_cosine_ops); ``` ## Options | Option | Description | |--------|-------------| | `pgvector.WithIDGenerator(fn)` | Custom ID generator for memory records. 
Default: UUID v4 | ```go store, err := pgvector.MemoryStore(ctx, connString, embedder, pgvector.WithIDGenerator(func() string { return myCustomID() }), ) ``` ## Full Example ```go package main import ( "context" "fmt" "log" "os" "github.com/joakimcarlsson/ai/agent" "github.com/joakimcarlsson/ai/agent/memory" "github.com/joakimcarlsson/ai/embeddings" "github.com/joakimcarlsson/ai/integrations/pgvector" "github.com/joakimcarlsson/ai/integrations/postgres" "github.com/joakimcarlsson/ai/model" llm "github.com/joakimcarlsson/ai/providers" ) func main() { ctx := context.Background() connString := "postgres://postgres:password@localhost:5432/example?sslmode=disable" embedder, err := embeddings.NewEmbedding( model.ProviderOpenAI, embeddings.WithAPIKey(os.Getenv("OPENAI_API_KEY")), embeddings.WithModel(model.OpenAIEmbeddingModels[model.TextEmbedding3Small]), ) if err != nil { log.Fatal(err) } llmClient, err := llm.NewLLM( model.ProviderOpenAI, llm.WithAPIKey(os.Getenv("OPENAI_API_KEY")), llm.WithModel(model.OpenAIModels[model.GPT4o]), ) if err != nil { log.Fatal(err) } // PostgreSQL sessions + pgvector memory sessionStore, err := postgres.SessionStore(ctx, connString) if err != nil { log.Fatal(err) } memoryStore, err := pgvector.MemoryStore(ctx, connString, embedder) if err != nil { log.Fatal(err) } myAgent := agent.New(llmClient, agent.WithSystemPrompt("You are a personal assistant with memory."), agent.WithSession("conv-1", sessionStore), agent.WithMemory("alice", memoryStore, memory.AutoExtract(), memory.AutoDedup(), ), ) // First conversation — agent learns facts response, err := myAgent.Chat(ctx, "Hi! 
My name is Alice and I love Italian food.") if err != nil { log.Fatal(err) } fmt.Println(response.Content) // New conversation — agent recalls memories via vector search agent2 := agent.New(llmClient, agent.WithSystemPrompt("You are a personal assistant with memory."), agent.WithSession("conv-2", sessionStore), agent.WithMemory("alice", memoryStore, memory.AutoExtract(), memory.AutoDedup(), ), ) response, err = agent2.Chat(ctx, "Can you recommend a restaurant for me?") if err != nil { log.Fatal(err) } fmt.Println(response.Content) } ``` --- ## Advanced > Batch Processing > Source: advanced/batch-processing.md # Batch Processing Process bulk LLM and embedding requests efficiently using provider-native batch APIs or bounded concurrent execution. ## Native Batch APIs Native batch APIs submit all requests as a single job that processes asynchronously on the provider side. Providers may offer reduced pricing for batch workloads (see the provider support table below for details). Results are typically returned within 24 hours, often much faster. 
### OpenAI

```go
import (
	"log"
	"time"

	"github.com/joakimcarlsson/ai/batch"
	"github.com/joakimcarlsson/ai/message"
	"github.com/joakimcarlsson/ai/model"
)

proc, _ := batch.New(
	model.ProviderOpenAI,
	batch.WithAPIKey("your-api-key"),
	batch.WithModel(model.OpenAIModels[model.GPT4o]),
	batch.WithPollInterval(30*time.Second),
)

requests := []batch.Request{
	{
		ID:   "q1",
		Type: batch.RequestTypeChat,
		Messages: []message.Message{
			message.NewUserMessage("What is the capital of France?"),
		},
	},
	{
		ID:   "q2",
		Type: batch.RequestTypeChat,
		Messages: []message.Message{
			message.NewUserMessage("What is the capital of Japan?"),
		},
	},
}

resp, err := proc.Process(ctx, requests)
if err != nil {
	log.Fatal(err)
}
for _, r := range resp.Results {
	if r.Err != nil {
		fmt.Printf("[%s] Error: %v\n", r.ID, r.Err)
		continue
	}
	fmt.Printf("[%s] %s\n", r.ID, r.ChatResponse.Content)
}
```

### Anthropic

```go
proc, _ := batch.New(
	model.ProviderAnthropic,
	batch.WithAPIKey("your-api-key"),
	batch.WithModel(model.AnthropicModels[model.Claude4Sonnet]),
	batch.WithMaxTokens(1024),
	batch.WithPollInterval(30*time.Second),
)
```

### Gemini / Vertex AI

```go
proc, _ := batch.New(
	model.ProviderGemini,
	batch.WithAPIKey("your-api-key"),
	batch.WithModel(model.GeminiModels[model.Gemini25Flash]),
	batch.WithPollInterval(30*time.Second),
)
```

## Concurrent Fallback

For providers without native batch APIs, pass an existing LLM client. Requests run concurrently with a configurable concurrency limit.
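The fallback is in essence a bounded worker pool. A stdlib-only sketch of the pattern (illustrative only, not the library's internals) — the counting semaphore plays the role of `WithMaxConcurrency`:

```go
package main

import (
	"fmt"
	"sync"
)

// processBounded runs work for every input, but never more than
// limit goroutines at once.
func processBounded(inputs []string, limit int, work func(string) string) []string {
	results := make([]string, len(inputs))
	sem := make(chan struct{}, limit) // counting semaphore
	var wg sync.WaitGroup
	for i, in := range inputs {
		wg.Add(1)
		go func(i int, in string) {
			defer wg.Done()
			sem <- struct{}{}        // acquire a slot
			defer func() { <-sem }() // release it
			results[i] = work(in)    // each goroutine writes its own index
		}(i, in)
	}
	wg.Wait()
	return results
}

func main() {
	out := processBounded([]string{"a", "b", "c"}, 2, func(s string) string {
		return s + "!"
	})
	fmt.Println(out) // [a! b! c!]
}
```

Note that each result is written to its own slice index, so no mutex is needed and per-request failures stay isolated — the same property the batch processor relies on.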
```go client, _ := llm.NewLLM(model.ProviderGroq, llm.WithAPIKey("your-api-key"), llm.WithModel(model.GroqModels[model.Llama4Scout]), ) proc, _ := batch.New( model.ProviderGroq, batch.WithLLM(client), batch.WithMaxConcurrency(10), ) resp, _ := proc.Process(ctx, requests) ``` ## Batch Embeddings ```go embedder, _ := embeddings.NewEmbedding(model.ProviderVoyage, embeddings.WithAPIKey("your-api-key"), embeddings.WithModel(model.VoyageEmbeddingModels[model.Voyage35]), ) proc, _ := batch.New( model.ProviderVoyage, batch.WithEmbedding(embedder), batch.WithMaxConcurrency(5), ) requests := []batch.Request{ {ID: "doc1", Type: batch.RequestTypeEmbedding, Texts: []string{"first document"}}, {ID: "doc2", Type: batch.RequestTypeEmbedding, Texts: []string{"second document"}}, } resp, _ := proc.Process(ctx, requests) ``` ## Provider Support | Provider | Native Batch | Discount (as of writing) | Supported Endpoints | |----------|-------------|--------------------------|---------------------| | OpenAI | ✅ | 50% | Chat, Embeddings | | Anthropic | ✅ | 50% | Messages | | Gemini | ✅ | 50% | Content, Embeddings | | Vertex AI | ✅ | ~50% | Content, Embeddings | | All others | Concurrent fallback | — | Chat, Embeddings | ## Progress Tracking ### Callback ```go proc, _ := batch.New( model.ProviderOpenAI, batch.WithAPIKey("your-api-key"), batch.WithModel(model.OpenAIModels[model.GPT4o]), batch.WithProgressCallback(func(p batch.Progress) { fmt.Printf("%d/%d completed, %d failed [%s]\n", p.Completed, p.Total, p.Failed, p.Status) }), ) ``` ### Async Channel ```go ch, err := proc.ProcessAsync(ctx, requests) for event := range ch { switch event.Type { case batch.EventItem: fmt.Printf("[%s] done\n", event.Result.ID) case batch.EventProgress: fmt.Printf("%d/%d\n", event.Progress.Completed, event.Progress.Total) case batch.EventComplete: fmt.Println("all done") case batch.EventError: fmt.Printf("batch error: %v\n", event.Err) } } ``` ## Error Handling Individual request failures never fail the 
batch. Each result carries its own error. ```go resp, err := proc.Process(ctx, requests) for _, r := range resp.Results { if r.Err != nil { continue } // use r.ChatResponse or r.EmbedResponse } fmt.Printf("Completed: %d, Failed: %d\n", resp.Completed, resp.Failed) ``` ## Options | Option | Description | Default | |--------|-------------|---------| | `WithAPIKey(key)` | API key for native batch providers | — | | `WithModel(model)` | LLM model for chat batch requests | — | | `WithEmbeddingModel(model)` | Embedding model for embedding batch requests | — | | `WithMaxTokens(n)` | Max tokens per request | 4096 | | `WithLLM(client)` | Existing LLM client for concurrent fallback | — | | `WithEmbedding(client)` | Existing embedding client for concurrent fallback | — | | `WithMaxConcurrency(n)` | Max parallel requests in concurrent mode | 10 | | `WithProgressCallback(fn)` | Progress update callback | — | | `WithPollInterval(d)` | Polling interval for native batch APIs | 30s | | `WithTimeout(d)` | Request timeout | — | | `WithOpenAIOptions(...)` | OpenAI-specific options (base URL, headers) | — | | `WithGeminiOptions(...)` | Gemini-specific options (backend) | — | --- ## Advanced > BYOM > Source: advanced/byom.md # BYOM (Bring Your Own Model) Use Ollama, LocalAI, vLLM, LM Studio, or any OpenAI-compatible inference server. ## Setup ```go // 1. Create model llamaModel := model.NewCustomModel( model.WithModelID("llama3.2"), model.WithAPIModel("llama3.2:latest"), ) // 2. Register provider ollama := llm.RegisterCustomProvider("ollama", llm.CustomProviderConfig{ BaseURL: "http://localhost:11434/v1", DefaultModel: llamaModel, }) // 3. 
Use it client, _ := llm.NewLLM(ollama) response, _ := client.SendMessages(ctx, messages, nil) ``` ## Supported Servers Any server that implements the OpenAI-compatible API: - **Ollama** — `http://localhost:11434/v1` - **LocalAI** — `http://localhost:8080/v1` - **vLLM** — `http://localhost:8000/v1` - **LM Studio** — `http://localhost:1234/v1` See `example/byom/main.go` for a complete example. --- ## Advanced > MCP Integration > Source: advanced/mcp.md # MCP (Model Context Protocol) Integration This library integrates with the official [Model Context Protocol Go SDK](https://github.com/modelcontextprotocol/go-sdk) to provide seamless access to MCP servers and their tools. ## Stdio Connection (subprocess) ```go import "github.com/joakimcarlsson/ai/tool" mcpServers := map[string]tool.MCPServer{ "filesystem": { Type: tool.MCPStdio, Command: "npx", Args: []string{"-y", "@modelcontextprotocol/server-filesystem", "/path/to/directory"}, Env: []string{"NODE_ENV=production"}, }, } mcpTools, err := tool.GetMcpTools(ctx, mcpServers) if err != nil { log.Fatal(err) } response, err := client.SendMessages(ctx, messages, mcpTools) defer tool.CloseMCPPool() ``` ## SSE Connection (HTTP) ```go mcpServers := map[string]tool.MCPServer{ "remote": { Type: tool.MCPSse, URL: "https://your-mcp-server.com/mcp", Headers: map[string]string{ "Authorization": "Bearer your-token", }, }, } mcpTools, err := tool.GetMcpTools(ctx, mcpServers) if err != nil { log.Fatal(err) } defer tool.CloseMCPPool() ``` ## Complete Example ```go package main import ( "context" "fmt" "log" "os" "github.com/joakimcarlsson/ai/message" "github.com/joakimcarlsson/ai/model" llm "github.com/joakimcarlsson/ai/providers" "github.com/joakimcarlsson/ai/tool" ) func main() { ctx := context.Background() mcpServers := map[string]tool.MCPServer{ "context7": { Type: tool.MCPStdio, Command: "npx", Args: []string{ "-y", "@upstash/context7-mcp", "--api-key", os.Getenv("CONTEXT7_API_KEY"), }, }, } mcpTools, err := tool.GetMcpTools(ctx, 
mcpServers) if err != nil { log.Fatal(err) } defer tool.CloseMCPPool() client, err := llm.NewLLM( model.ProviderOpenAI, llm.WithAPIKey(os.Getenv("OPENAI_API_KEY")), llm.WithModel(model.OpenAIModels[model.GPT4oMini]), ) if err != nil { log.Fatal(err) } messages := []message.Message{ message.NewUserMessage("Explain React hooks using Context7 to fetch the latest documentation"), } response, err := client.SendMessages(ctx, messages, mcpTools) if err != nil { log.Fatal(err) } fmt.Println(response.Content) } ``` ## StreamableHTTP Connection The newer MCP transport for HTTP-based servers: ```go mcpServers := map[string]tool.MCPServer{ "remote": { Type: tool.MCPStreamableHTTP, URL: "https://your-mcp-server.com/mcp", Headers: map[string]string{ "Authorization": "Bearer your-token", }, }, } mcpTools, err := tool.GetMcpTools(ctx, mcpServers) defer tool.CloseMCPPool() ``` ## Transport Types | Type | Constant | Use Case | |------|----------|----------| | Stdio | `tool.MCPStdio` | Local subprocess (e.g., `npx` commands) | | SSE | `tool.MCPSse` | HTTP server with Server-Sent Events | | StreamableHTTP | `tool.MCPStreamableHTTP` | HTTP server with streamable responses | ## MCPServer Config ```go type MCPServer struct { Command string // Stdio: command to run Args []string // Stdio: command arguments Env []string // Stdio: environment variables Type MCPType // Transport type URL string // SSE/StreamableHTTP: server URL Headers map[string]string // SSE/StreamableHTTP: custom HTTP headers } ``` ## Features - Supports stdio, SSE, and StreamableHTTP transports - Connection pooling for efficient reuse of MCP server connections - Custom HTTP headers for authentication on remote servers - Automatic tool discovery and registration - Compatible with all official MCP servers - Tools are namespaced with server name (e.g., `context7_search`) - Graceful cleanup with `CloseMCPPool()` --- ## Advanced > Tool Calling > Source: advanced/tools.md # Tool Calling ## Defining a Tool ```go import 
"github.com/joakimcarlsson/ai/tool" type WeatherParams struct { Location string `json:"location" desc:"City name"` Units string `json:"units" desc:"Temperature units" enum:"celsius,fahrenheit" required:"false"` } type WeatherTool struct{} func (w *WeatherTool) Info() tool.Info { return tool.NewInfo("get_weather", "Get current weather for a location", WeatherParams{}) } func (w *WeatherTool) Run(ctx context.Context, params tool.Call) (tool.Response, error) { var input WeatherParams json.Unmarshal([]byte(params.Input), &input) return tool.NewTextResponse("Sunny, 22°C"), nil } ``` ## Function Tools For simple tools that are just a function, use `functiontool.New` to skip the struct boilerplate: ```go import "github.com/joakimcarlsson/ai/tool/functiontool" type WeatherParams struct { Location string `json:"location" desc:"City name"` Units string `json:"units" desc:"Temperature units" enum:"celsius,fahrenheit" required:"false"` } weatherTool := functiontool.New("get_weather", "Get current weather for a location", func(ctx context.Context, p WeatherParams) (string, error) { return fmt.Sprintf("Sunny, 22°C in %s", p.Location), nil }, ) ``` The JSON schema is inferred from the parameter struct using the same struct tags as `tool.NewInfo`. The result is a standard `BaseTool` that works with the registry, toolsets, hooks, and agent system. ### Supported Signatures The function's first parameter can optionally be `context.Context`, and the second can be a struct for input parameters. Both are optional: ```go // With context and params functiontool.New("name", "desc", func(ctx context.Context, p Params) (string, error) { ... }) // Params only (no context) functiontool.New("name", "desc", func(p Params) (string, error) { ... }) // Context only (no input schema) functiontool.New("name", "desc", func(ctx context.Context) (string, error) { ... }) // No inputs at all functiontool.New("name", "desc", func() (string, error) { ... 
}) ``` ### Return Types The first return value determines the response type: ```go // String → tool.NewTextResponse func(p Params) (string, error) // tool.Response → passed through directly func(p Params) (tool.Response, error) // Any other type → tool.NewJSONResponse (auto-marshaled) func(p Params) (MyStruct, error) ``` ### Options ```go // Require human confirmation before execution functiontool.New("delete", "Delete records", deleteFn, functiontool.WithConfirmation()) ``` ## Using Tools with LLM ```go weatherTool := &WeatherTool{} tools := []tool.BaseTool{weatherTool} response, err := client.SendMessages(ctx, messages, tools) ``` ## Struct Tag Schema Generation Generate JSON schemas automatically from Go structs: ```go type SearchParams struct { Query string `json:"query" desc:"Search query"` Limit int `json:"limit" desc:"Max results" required:"false"` Filters []string `json:"filters" desc:"Filter tags" required:"false"` } info := tool.NewInfo("search", "Search documents", SearchParams{}) ``` Supported tags: | Tag | Description | |-----|-------------| | `json` | Parameter name | | `desc` | Parameter description | | `required` | `"true"` or `"false"` (non-pointer fields default to required) | | `enum` | Comma-separated allowed values | ## Rich Tool Responses ```go // Text response tool.NewTextResponse("Result text") // JSON response (auto-marshals any value) tool.NewJSONResponse(map[string]any{"status": "ok", "count": 42}) // File/binary response tool.NewFileResponse(pdfBytes, "application/pdf") // Image response (base64) tool.NewImageResponse(base64ImageData) // Error response tool.NewTextErrorResponse("Something went wrong") ``` ## Parsing Tool Input The agent package provides a generic helper: ```go input, err := agent.ParseToolInput[WeatherParams](params.Input) ``` ## Requiring Confirmation Set `RequireConfirmation` on a tool's `Info` to require human approval before execution: ```go func (t *DeleteTool) Info() tool.Info { info := 
tool.NewInfo("delete_records", "Delete database records", DeleteParams{}) info.RequireConfirmation = true return info } ``` Tools can also request confirmation dynamically from within `Run()`: ```go func (t *TransferTool) Run(ctx context.Context, params tool.Call) (tool.Response, error) { if amount > 10000 { if err := tool.RequestConfirmation(ctx, "Large transfer", params); err != nil { return tool.Response{}, err } } // ... } ``` Both require a `ConfirmationProvider` on the agent. See [Tool Confirmation](../agent/confirmation.md) for the full protocol. ## Toolsets For grouping, filtering, and dynamically controlling which tools are available at runtime, see [Toolsets](../agent/toolsets.md). --- ## Advanced > Structured Output > Source: advanced/structured-output.md # Structured Output Constrained generation that forces the LLM to return valid JSON matching a schema. ## Usage ```go type CodeAnalysis struct { Language string `json:"language"` Functions []string `json:"functions"` Complexity string `json:"complexity"` } schema := &schema.StructuredOutputInfo{ Name: "code_analysis", Description: "Analyze code structure", Parameters: map[string]any{ "language": map[string]any{ "type": "string", "description": "Programming language", }, "functions": map[string]any{ "type": "array", "items": map[string]any{"type": "string"}, "description": "List of function names", }, "complexity": map[string]any{ "type": "string", "enum": []string{"low", "medium", "high"}, }, }, Required: []string{"language", "functions", "complexity"}, } response, err := client.SendMessagesWithStructuredOutput(ctx, messages, nil, schema) if err != nil { log.Fatal(err) } var analysis CodeAnalysis json.Unmarshal([]byte(*response.StructuredOutput), &analysis) ``` > **Note:** > Structured output is supported by OpenAI, Gemini, Azure OpenAI, Vertex AI, Groq, OpenRouter, and xAI. Anthropic and AWS Bedrock do not currently support it. 
---

## Advanced > Cost Tracking

> Source: advanced/cost-tracking.md

# Cost Tracking

All models include built-in pricing information for cost calculation.

## LLM Models

```go
model := model.OpenAIModels[model.GPT4o]
fmt.Printf("Input cost: $%.2f per 1M tokens\n", model.CostPer1MIn)
fmt.Printf("Output cost: $%.2f per 1M tokens\n", model.CostPer1MOut)

response, err := client.SendMessages(ctx, messages, nil)
inputCost := float64(response.Usage.InputTokens) * model.CostPer1MIn / 1_000_000
outputCost := float64(response.Usage.OutputTokens) * model.CostPer1MOut / 1_000_000
```

## Image Generation Models

```go
model := model.OpenAIImageGenerationModels[model.DALLE3]

// Pricing structure: size -> quality -> cost
standardCost := model.Pricing["1024x1024"]["standard"] // $0.04
hdCost := model.Pricing["1024x1024"]["hd"]             // $0.08

// GPT Image 1 with multiple quality tiers
gptImageModel := model.OpenAIImageGenerationModels[model.GPTImage1]
lowCost := gptImageModel.Pricing["1024x1024"]["low"]       // $0.011
mediumCost := gptImageModel.Pricing["1024x1024"]["medium"] // $0.042
highCost := gptImageModel.Pricing["1024x1024"]["high"]     // $0.167
```

## Audio Generation Models

```go
model := model.ElevenLabsAudioModels[model.ElevenTurboV2_5]
fmt.Printf("Cost per 1M chars: $%.2f\n", model.CostPer1MChars)
fmt.Printf("Max characters per request: %d\n", model.MaxCharacters)
fmt.Printf("Supports streaming: %v\n", model.SupportsStreaming)

response, err := client.GenerateAudio(ctx, text, audio.WithVoiceID("voice-id"))
cost := float64(response.Usage.Characters) * model.CostPer1MChars / 1_000_000
fmt.Printf("Cost: $%.4f\n", cost)
```

---

## Advanced > Prompt Templates

> Source: advanced/prompt-templates.md

# Prompt Templates

A template engine for building dynamic prompts with variable substitution, built-in functions, caching, and validation.
## Basic Usage

```go
import "github.com/joakimcarlsson/ai/prompt"

result, err := prompt.Process("Hello, {{.name}}!", map[string]any{
	"name": "World",
})
// "Hello, World!"
```

## Reusable Templates

```go
tmpl, err := prompt.New("You are {{.role}}. Help with {{.task}}.")
if err != nil {
	log.Fatal(err)
}

result, err := tmpl.Process(map[string]any{
	"role": "a coding assistant",
	"task": "debugging",
})
// "You are a coding assistant. Help with debugging."
```

## Caching

Thread-safe template caching avoids re-parsing the same template repeatedly.

```go
cache := prompt.NewCache()

tmpl, err := prompt.New("You are {{.role}}.",
	prompt.WithCache(cache),
	prompt.WithName("system"), // cache key
)
```

When using a cache without `WithName`, the template source is hashed automatically as the cache key.

## Validation

Require specific variables to be present in the data map:

```go
_, err := prompt.Process("Hello, {{.name}}!", map[string]any{},
	prompt.WithRequired("name"),
)
// error: missing required variables: name
```

## Strict Mode

Error on any missing variable instead of using zero values:

```go
tmpl, err := prompt.New("{{.name}} is {{.age}} years old.",
	prompt.WithStrictMode(),
)

_, err = tmpl.Process(map[string]any{"name": "Alice"})
// error: template execution fails because .age is missing
```

## Built-in Functions

### String

| Function | Description | Example |
|----------|-------------|---------|
| `upper` | Uppercase | `{{upper .name}}` |
| `lower` | Lowercase | `{{lower .name}}` |
| `title` | Title case | `{{title .name}}` |
| `trim` | Trim whitespace | `{{trim .text}}` |
| `trimPrefix` | Remove prefix | `{{trimPrefix "Mr. " .name}}` |
| `trimSuffix` | Remove suffix | `{{trimSuffix "." .text}}` |
| `replace` | Replace all | `{{replace "old" "new" .text}}` |
| `contains` | Check substring | `{{if contains .text "error"}}...{{end}}` |
| `hasPrefix` | Check prefix | `{{if hasPrefix .name "Dr."}}...{{end}}` |
| `hasSuffix` | Check suffix | `{{if hasSuffix .file ".go"}}...{{end}}` |

### Collections

| Function | Description | Example |
|----------|-------------|---------|
| `join` | Join slice | `{{join ", " .items}}` |
| `split` | Split string | `{{split "," .csv}}` |
| `first` | First element | `{{first .items}}` |
| `last` | Last element | `{{last .items}}` |
| `list` | Create slice | `{{list "a" "b" "c"}}` |

### Comparison

| Function | Description | Example |
|----------|-------------|---------|
| `eq` | Equal | `{{if eq .role "admin"}}...{{end}}` |
| `ne` / `neq` | Not equal | `{{if ne .status "done"}}...{{end}}` |
| `lt` | Less than | `{{if lt .count 10}}...{{end}}` |
| `le` | Less or equal | `{{if le .count 10}}...{{end}}` |
| `gt` | Greater than | `{{if gt .count 0}}...{{end}}` |
| `ge` | Greater or equal | `{{if ge .count 1}}...{{end}}` |

### Defaults

| Function | Description | Example |
|----------|-------------|---------|
| `default` | Default value | `{{default "anonymous" .name}}` |
| `coalesce` | First non-empty | `{{coalesce .nickname .name "unknown"}}` |
| `empty` | Check if empty | `{{if empty .list}}...{{end}}` |
| `ternary` | Conditional | `{{ternary .admin "admin" "user"}}` |

### Formatting

| Function | Description | Example |
|----------|-------------|---------|
| `indent` | Indent text | `{{indent 4 .code}}` |
| `nindent` | Newline + indent | `{{nindent 4 .code}}` |
| `quote` | Double quote | `{{quote .name}}` |
| `squote` | Single quote | `{{squote .name}}` |

## Custom Functions

Add your own template functions:

```go
import "text/template"

result, err := prompt.Process("{{shout .name}}", data,
	prompt.WithFuncs(template.FuncMap{
		"shout": func(s string) string {
			return strings.ToUpper(s) + "!!!"
		},
	}),
)
```

## Options

| Option | Description |
|--------|-------------|
| `prompt.WithCache(c)` | Enable template caching |
| `prompt.WithName(name)` | Set template name (used as cache key) |
| `prompt.WithRequired(vars...)` | Require specific variables |
| `prompt.WithStrictMode()` | Error on missing variables |
| `prompt.WithFuncs(funcs)` | Add custom template functions |

## With Agent Instruction Templates

The prompt package powers the agent's [instruction templates](../agent/instruction-templates.md) feature:

```go
myAgent := agent.New(llmClient,
	agent.WithSystemPrompt("You are {{.role}}. The user's name is {{.userName}}."),
	agent.WithState(map[string]any{
		"role":     "a helpful assistant",
		"userName": "Alice",
	}),
)
```

---

## Advanced > OpenTelemetry Tracing

> Source: advanced/tracing.md

# OpenTelemetry Tracing

Built-in OpenTelemetry instrumentation for all provider calls and agent execution. Includes traces, metrics, and structured log records following [GenAI semantic conventions](https://opentelemetry.io/docs/specs/semconv/gen-ai/). When no providers are configured, everything is a zero-cost no-op.

## Setup

Use the built-in setup helper to initialize traces, metrics, and logs in one call:

```go
import (
	"github.com/joakimcarlsson/ai/tracing"

	"go.opentelemetry.io/otel/sdk/resource"
	semconv "go.opentelemetry.io/otel/semconv/v1.36.0"
)

res, _ := resource.New(ctx, resource.WithAttributes(
	semconv.ServiceNameKey.String("my-ai-service"),
	semconv.ServiceVersionKey.String("1.0.0"),
))

providers, _ := tracing.New(ctx,
	tracing.WithResource(res),
	tracing.WithOTLPEndpoint("localhost:4318"),
)
defer providers.Shutdown(ctx)
```

This creates and globally registers a `TracerProvider`, `MeterProvider`, and `LoggerProvider` — all configured with OTLP HTTP exporters pointing at the given endpoint.
### Setup Options

| Option | Description |
|--------|-------------|
| `WithResource(r)` | Set service resource (name, version, environment) |
| `WithOTLPEndpoint(url)` | Configure OTLP HTTP exporters for all signals |
| `WithSpanProcessors(p...)` | Register custom span processors |
| `WithMetricReaders(r...)` | Register custom metric readers |
| `WithLogProcessors(p...)` | Register custom log processors |

If no `WithOTLPEndpoint` is provided, the helper checks the `OTEL_EXPORTER_OTLP_ENDPOINT` environment variable.

### Manual Setup

You can also configure providers manually using the standard OpenTelemetry SDK:

```go
import (
	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/exporters/stdout/stdouttrace"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
)

exporter, _ := stdouttrace.New(stdouttrace.WithPrettyPrint())
tp := sdktrace.NewTracerProvider(sdktrace.WithBatcher(exporter))
defer tp.Shutdown(ctx)

otel.SetTracerProvider(tp)
```

All subsequent LLM calls, tool executions, and agent runs will produce spans and metrics.

## Span Hierarchy

When using the agent framework, spans form a parent-child tree:

```
invoke_agent {agent_name}
├── generate_content {model}   (LLM turn 1)
├── execute_tool {tool_name}   (single tool call)
├── generate_content {model}   (LLM turn 2)
└── ...
```

When the LLM requests multiple tool calls at once, they are grouped under a parent span:

```
invoke_agent {agent_name}
├── generate_content {model}
├── execute_tools              (merged parent for 2+ tools)
│   ├── execute_tool {tool_a}
│   └── execute_tool {tool_b}
└── generate_content {model}
```

When using providers standalone (no agent), each call produces a root span:

```
generate_content {model}
generate_embeddings {model}
rerank {model}
generate_audio {model}
generate_image {model}
transcribe {model}
fim_complete {model}
```

## Instrumented Operations

Every provider package is instrumented at the public API level — one span per call, covering all underlying providers.

| Package | Span Name | Methods |
|---------|-----------|---------|
| `providers` (LLM) | `generate_content` | `SendMessages`, `StreamResponse`, and structured output variants |
| `embeddings` | `generate_embeddings` | `GenerateEmbeddings`, `GenerateMultimodalEmbeddings`, `GenerateContextualizedEmbeddings` |
| `rerankers` | `rerank` | `Rerank` |
| `audio` | `generate_audio` | `GenerateAudio`, `StreamAudio` |
| `image_generation` | `generate_image` | `GenerateImage`, `GenerateImageStreaming` |
| `transcription` | `transcribe` / `translate` | `Transcribe`, `Translate` |
| `fim` | `fim_complete` | `Complete`, `CompleteStream` |
| `agent` | `invoke_agent` | `Chat`, `ChatStream`, `Continue`, `ContinueStream` |
| `agent` (tools) | `execute_tool` | Each tool call during agent execution |

## Span Attributes

Spans carry GenAI semantic convention attributes.

### LLM (`generate_content`)

| Attribute | When |
|-----------|------|
| `gen_ai.system` | Always |
| `gen_ai.request.model` | Always |
| `gen_ai.request.max_tokens` | Always |
| `gen_ai.request.temperature` | If set |
| `gen_ai.request.top_p` | If set |
| `gen_ai.usage.input_tokens` | On completion |
| `gen_ai.usage.output_tokens` | On completion |
| `gen_ai.usage.cache_creation_tokens` | If non-zero |
| `gen_ai.usage.cache_read_tokens` | If non-zero |
| `gen_ai.response.finish_reason` | On completion |
| `gen_ai.response.tool_call_count` | If tool calls present |

### Agent (`invoke_agent`)

| Attribute | When |
|-----------|------|
| `gen_ai.agent.name` | Always |
| `gen_ai.usage.input_tokens` | On completion (aggregated) |
| `gen_ai.usage.output_tokens` | On completion (aggregated) |
| `gen_ai.agent.total_turns` | On completion |
| `gen_ai.agent.total_tool_calls` | On completion |

### Tool (`execute_tool`)

| Attribute | When |
|-----------|------|
| `gen_ai.tool.name` | Always |
| `gen_ai.tool.call_id` | Always |

## Streaming

Streaming calls (`StreamResponse`, `StreamAudio`, `CompleteStream`) are fully traced.
The span covers the entire stream lifetime — from the initial call until the channel closes. Response attributes (token usage, finish reason) are recorded when the final event arrives.

## Metrics

Every provider call records two metrics via the global `MeterProvider`:

| Metric | Type | Unit | Description |
|--------|------|------|-------------|
| `gen_ai.client.operation.duration` | Float64Histogram | `s` | Duration of each provider call |
| `gen_ai.client.token.usage` | Int64Counter | `{token}` | Token consumption per call |

Both metrics carry these attributes:

| Attribute | Description |
|-----------|-------------|
| `gen_ai.operation.name` | Operation type (`generate_content`, `generate_embeddings`, `rerank`, etc.) |
| `gen_ai.system` | Provider name (`openai`, `anthropic`, `voyage`, etc.) |
| `gen_ai.request.model` | Model identifier |
| `error.type` | Error message (only on failed calls) |

The token usage counter additionally carries `gen_ai.token.type` (`input` or `output`) to distinguish token direction. Token metrics are only recorded when the count is non-zero.

### Metrics Setup

Metrics work the same as traces — configure a global `MeterProvider`:

```go
import (
	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp"
	sdkmetric "go.opentelemetry.io/otel/sdk/metric"
)

exporter, _ := otlpmetrichttp.New(ctx)
mp := sdkmetric.NewMeterProvider(sdkmetric.WithReader(
	sdkmetric.NewPeriodicReader(exporter),
))
defer mp.Shutdown(ctx)

otel.SetMeterProvider(mp)
```

## Log Records

LLM calls emit OpenTelemetry log records tied to the active span. Log bodies are structured JSON following GenAI semantic conventions:

| Event Name | Body Structure |
|------------|----------------|
| `gen_ai.system.message` | `{"content": "..."}` |
| `gen_ai.user.message` | `{"content": "..."}` |
| `gen_ai.choice` | `{"index": 0, "content": "...", "finish_reason": "..."}` |

Log records require a global `LoggerProvider` to be configured. Without one, they are silently dropped.

### Content Capture

Message content is **elided by default** for privacy. To include actual message content in log records, set:

```bash
export OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT=true
```

When content capture is disabled, log bodies contain a placeholder instead of the actual content.

## Retry Visibility

When a provider call is retried (rate limits, transient errors), each retry attempt is recorded as a span event on the `generate_content` span:

```
Event: "retry"
  attempt = 1
  retry_after_ms = 2000
  error = "429 Too Many Requests"
```

This gives visibility into retries without creating additional spans, making it easy to diagnose latency spikes caused by rate limiting.

## OTLP Export

To export traces to Jaeger, Grafana Tempo, Datadog, or any OTLP-compatible backend:

```go
import (
	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp"
	"go.opentelemetry.io/otel/sdk/resource"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
	semconv "go.opentelemetry.io/otel/semconv/v1.36.0"
)

exporter, _ := otlptracehttp.New(ctx)

res, _ := resource.New(ctx,
	resource.WithAttributes(
		semconv.ServiceNameKey.String("my-ai-service"),
		semconv.ServiceVersionKey.String("1.0.0"),
	),
)

tp := sdktrace.NewTracerProvider(
	sdktrace.WithBatcher(exporter),
	sdktrace.WithResource(res),
)
defer tp.Shutdown(ctx)

otel.SetTracerProvider(tp)
```

Configure the OTLP endpoint via environment variable:

```bash
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
```

## Standalone Provider Tracing

Tracing works without the agent framework.
Any provider call creates spans and records metrics automatically:

```go
otel.SetTracerProvider(tp)

client, _ := llm.NewLLM(model.ProviderAnthropic,
	llm.WithAPIKey(os.Getenv("ANTHROPIC_API_KEY")),
	llm.WithModel(model.AnthropicModels[model.Claude4Sonnet]),
)

// This call produces a "generate_content claude-sonnet-4-6-20250514" span
// and records duration + token usage metrics
response, _ := client.SendMessages(ctx, messages, nil)
```

The same applies to embeddings, audio, image generation, transcription, rerankers, and FIM.

## Complete Example

```go
package main

import (
	"context"
	"fmt"
	"log"
	"os"

	"go.opentelemetry.io/otel/sdk/resource"
	semconv "go.opentelemetry.io/otel/semconv/v1.36.0"

	"github.com/joakimcarlsson/ai/agent"
	"github.com/joakimcarlsson/ai/model"
	llm "github.com/joakimcarlsson/ai/providers"
	"github.com/joakimcarlsson/ai/tool/functiontool"
	"github.com/joakimcarlsson/ai/tracing"
)

func main() {
	ctx := context.Background()

	res, _ := resource.New(ctx, resource.WithAttributes(
		semconv.ServiceNameKey.String("my-ai-service"),
	))

	providers, _ := tracing.New(ctx,
		tracing.WithResource(res),
		tracing.WithOTLPEndpoint("localhost:4318"),
	)
	defer func() { _ = providers.Shutdown(ctx) }()

	client, _ := llm.NewLLM(model.ProviderOpenAI,
		llm.WithAPIKey(os.Getenv("OPENAI_API_KEY")),
		llm.WithModel(model.OpenAIModels[model.GPT5Nano]),
	)

	timeTool := functiontool.New(
		"get_time",
		"Get the current time",
		func(_ context.Context, p struct{}) (string, error) {
			return "14:30 UTC", nil
		},
	)

	myAgent := agent.New(client,
		agent.WithTools(timeTool),
	)

	resp, err := myAgent.Chat(ctx, "What time is it?")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(resp.Content)
}
```

This produces spans:

```
invoke_agent
├── generate_content gpt-5-nano
├── execute_tool get_time
└── generate_content gpt-5-nano
```

---

## Advanced > Configuration

> Source: advanced/configuration.md

# Configuration

## LLM Client Options

```go
client, err := llm.NewLLM(
	model.ProviderOpenAI,
	llm.WithAPIKey("your-key"),
	llm.WithModel(model.OpenAIModels[model.GPT4o]),
	llm.WithMaxTokens(2000),
	llm.WithTemperature(0.7),
	llm.WithTopP(0.9),
	llm.WithTimeout(30*time.Second),
	llm.WithStopSequences("STOP", "END"),
)
```

## Embedding Client Options

```go
embedder, err := embeddings.NewEmbedding(
	model.ProviderVoyage,
	embeddings.WithAPIKey(""),
	embeddings.WithModel(model.VoyageEmbeddingModels[model.Voyage35]),
	embeddings.WithBatchSize(100),
	embeddings.WithTimeout(30*time.Second),
	embeddings.WithVoyageOptions(
		embeddings.WithInputType("document"),
		embeddings.WithOutputDimension(1024),
		embeddings.WithOutputDtype("float"),
	),
)
```

## Reranker Client Options

```go
reranker, err := rerankers.NewReranker(
	model.ProviderVoyage,
	rerankers.WithAPIKey(""),
	rerankers.WithModel(model.VoyageRerankerModels[model.Rerank25Lite]),
	rerankers.WithTopK(10),
	rerankers.WithReturnDocuments(true),
	rerankers.WithTruncation(true),
	rerankers.WithTimeout(30*time.Second),
)
```

## Image Generation Client Options

```go
// OpenAI/xAI
client, err := image_generation.NewImageGeneration(
	model.ProviderOpenAI,
	image_generation.WithAPIKey("your-key"),
	image_generation.WithModel(model.OpenAIImageGenerationModels[model.DALLE3]),
	image_generation.WithTimeout(60*time.Second),
	image_generation.WithOpenAIOptions(
		image_generation.WithOpenAIBaseURL("custom-endpoint"),
	),
)

// Gemini
client, err := image_generation.NewImageGeneration(
	model.ProviderGemini,
	image_generation.WithAPIKey("your-key"),
	image_generation.WithModel(model.GeminiImageGenerationModels[model.Imagen4]),
	image_generation.WithTimeout(60*time.Second),
	image_generation.WithGeminiOptions(
		image_generation.WithGeminiBackend(genai.BackendVertexAI),
	),
)
```

## Audio Generation Client Options

```go
client, err := audio.NewAudioGeneration(
	model.ProviderElevenLabs,
	audio.WithAPIKey("your-key"),
	audio.WithModel(model.ElevenLabsAudioModels[model.ElevenTurboV2_5]),
	audio.WithTimeout(30*time.Second),
	audio.WithElevenLabsOptions(
		audio.WithElevenLabsBaseURL("custom-endpoint"),
	),
)
```

## Speech-to-Text Client Options

```go
client, err := transcription.NewSpeechToText(
	model.ProviderOpenAI,
	transcription.WithAPIKey("your-key"),
	transcription.WithModel(model.OpenAITranscriptionModels[model.GPT4oTranscribe]),
	transcription.WithTimeout(30*time.Second),
)
```

## Provider-Specific Options

```go
// Anthropic
llm.WithAnthropicOptions(
	llm.WithAnthropicBeta("beta-feature"),
	llm.WithAnthropicBedrock(true),
	llm.WithAnthropicDisableCache(),
	llm.WithAnthropicShouldThinkFn(func(userMsg string) bool {
		return strings.Contains(userMsg, "think")
	}),
)

// OpenAI
llm.WithOpenAIOptions(
	llm.WithOpenAIBaseURL("custom-endpoint"),
	llm.WithOpenAIExtraHeaders(map[string]string{"Custom-Header": "value"}),
	llm.WithOpenAIDisableCache(),
	llm.WithReasoningEffort("high"), // "low", "medium", "high"
	llm.WithOpenAIFrequencyPenalty(0.5),
	llm.WithOpenAIPresencePenalty(0.3),
	llm.WithOpenAISeed(42),
	llm.WithOpenAIParallelToolCalls(false),
)

// Gemini
llm.WithGeminiOptions(
	llm.WithGeminiDisableCache(),
	llm.WithGeminiFrequencyPenalty(0.5),
	llm.WithGeminiPresencePenalty(0.3),
	llm.WithGeminiSeed(42),
)

// Azure OpenAI
llm.WithAzureOptions(
	llm.WithAzureEndpoint("https://your-resource.openai.azure.com"),
	llm.WithAzureAPIVersion("2024-02-15-preview"),
)

// Bedrock (via Anthropic)
llm.WithAnthropicOptions(
	llm.WithAnthropicBedrock(true),
)
llm.WithBedrockOptions(...)
```

## Retry Configuration

All LLM providers include automatic retry with exponential backoff and jitter. Each provider has optimized defaults:

```go
// Default retry config (used by most providers)
llm.DefaultRetryConfig() // retries: 429, 500, 502, 503, 504

// Provider-specific configs
llm.OpenAIRetryConfig()    // retries: 429, 500
llm.AnthropicRetryConfig() // retries: 429, 529
llm.GeminiRetryConfig()    // no Retry-After header support
llm.MistralRetryConfig()   // retries: 429, 500, 502, 503
```

| Setting | Default | Description |
|---------|---------|-------------|
| `MaxRetries` | 3 | Maximum retry attempts |
| `BaseBackoffMs` | 2000 | Initial backoff in milliseconds |
| `JitterPercent` | 0.2 | Jitter added to backoff (20%) |
| `RetryStatusCodes` | varies | HTTP status codes that trigger retries |
| `CheckRetryAfter` | true | Respect the `Retry-After` header |

Retries use exponential backoff: `base * 2^(attempt-1) + jitter`. When `CheckRetryAfter` is enabled and the server sends a `Retry-After` header, that value takes precedence.

## Agent Options

See the [Agent Framework Overview](../agent/overview.md) for a full table of agent configuration options.