Image Generation
The image modality. Vendors live under image/.
image.GenerateImage(ctx, prompt) takes only a prompt — every vendor knob
(size, aspect ratio, quality, response format, style, seed, safety, …) lives
on the vendor's Options and is set at construction. Image generation is
"configure once, prompt many" and vendor request bodies don't share enough
common shape to support a portable per-call surface.
OpenAI
import (
"github.com/joakimcarlsson/ai/image"
imageopenai "github.com/joakimcarlsson/ai/image/openai"
"github.com/joakimcarlsson/ai/model"
)
client := imageopenai.NewGeneration(
imageopenai.WithAPIKey(os.Getenv("OPENAI_API_KEY")),
imageopenai.WithModel(model.OpenAIImageGenerationModels[model.GPTImage15]),
imageopenai.WithSize(imageopenai.Size1024x1024),
imageopenai.WithQuality(imageopenai.QualityHigh),
imageopenai.WithBackground(imageopenai.BackgroundTransparent),
imageopenai.WithOutputFormat(imageopenai.OutputFormatPNG),
)
resp, err := client.GenerateImage(ctx, "A serene mountain landscape at sunset")
if err != nil {
log.Fatal(err)
}
data, _ := image.DecodeBase64Image(resp.Images[0].ImageBase64)
os.WriteFile("output.png", data, 0644)
Full option set (typed enums — see the package's exported Size, Quality,
Background, Moderation, OutputFormat types):
imageopenai.WithN(int) // 1–10
imageopenai.WithSize(imageopenai.Size1024x1024) // 1024x1024 | 1024x1536 | 1536x1024 | auto
imageopenai.WithQuality(imageopenai.QualityHigh) // low | medium | high | auto
imageopenai.WithBackground(imageopenai.BackgroundAuto) // transparent | opaque | auto — gpt-image-1.5 only (gpt-image-2 rejects)
imageopenai.WithModeration(imageopenai.ModerationAuto) // auto | low
imageopenai.WithOutputFormat(imageopenai.OutputFormatPNG) // png | jpeg | webp
imageopenai.WithOutputCompression(int) // 0–100 — jpeg/webp only
imageopenai.WithUser(string) // end-user identifier
imageopenai.WithStreamingOptions(...) // partial-image count for streaming
Supported models: gpt-image-1.5 and gpt-image-2. DALL-E 2/3 and gpt-image-1
(plus mini) are removed; pricing-registry entries dropped along with the
matching package code paths.
Gemini / Imagen
import imagegemini "github.com/joakimcarlsson/ai/image/gemini"
client := imagegemini.NewGeneration(
imagegemini.WithAPIKey(os.Getenv("GEMINI_API_KEY")),
imagegemini.WithModel(model.GeminiImageGenerationModels[model.Imagen4]),
imagegemini.WithAspectRatio(imagegemini.AspectRatio16x9),
imagegemini.WithN(2),
)
resp, err := client.GenerateImage(ctx, "A cyberpunk cityscape")
for i, img := range resp.Images {
data, _ := image.DecodeBase64Image(img.ImageBase64)
os.WriteFile(fmt.Sprintf("image_%d.png", i), data, 0644)
}
Full option set (Imagen-only fields are ignored when the active model is a Gemini Image variant):
import "google.golang.org/genai"
imagegemini.WithN(int32) // Imagen: 1–4
imagegemini.WithAspectRatio(imagegemini.AspectRatio16x9) // see imagegemini.AspectRatio*
imagegemini.WithNegativePrompt(string) // Imagen only
imagegemini.WithSeed(int32) // Imagen only (requires AddWatermark=false)
imagegemini.WithPersonGeneration(genai.PersonGenerationAllowAdult) // both paths
imagegemini.WithSafetyFilterLevel(genai.SafetyFilterLevelBlockOnlyHigh) // Imagen only
imagegemini.WithLanguage(genai.ImagePromptLanguageEn) // Imagen only
imagegemini.WithEnhancePrompt(bool) // Imagen only
imagegemini.WithImageSize(imagegemini.ImageSize2K) // 1K | 2K | 4K — model-dependent
imagegemini.WithIncludeRAIReason(bool) // Imagen only
imagegemini.WithOutputMIMEType(imagegemini.OutputMIMETypePNG) // image/png | image/jpeg — Imagen only
imagegemini.WithOutputCompressionQuality(int32) // 0–100 — Imagen jpeg only
xAI Grok Imagine
import imagexai "github.com/joakimcarlsson/ai/image/xai"
client := imagexai.NewGeneration(
imagexai.WithAPIKey(os.Getenv("XAI_API_KEY")),
imagexai.WithModel(model.XAIImageGenerationModels[model.XAIGrokImagineImage]),
imagexai.WithAspectRatio(imagexai.AspectRatio16x9),
imagexai.WithResolution(imagexai.Resolution2K),
imagexai.WithResponseFormat(imagexai.ResponseFormatBase64),
)
resp, err := client.GenerateImage(ctx, "A neon-lit street market")
Full option set:
imagexai.WithN(int) // 1–10
imagexai.WithAspectRatio(imagexai.AspectRatio16x9) // 14 values — see imagexai.AspectRatio*
imagexai.WithResolution(imagexai.Resolution2K) // 1K | 2K
imagexai.WithResponseFormat(imagexai.ResponseFormatBase64) // url | b64_json
imagexai.WithUser(string) // end-user identifier
Per-model capability data — including SupportedAspectRatios — lives on
model.ImageGenerationModel. Inspect it to know what a given model accepts:
m := model.GeminiImageGenerationModels[model.Imagen4]
fmt.Println(m.SupportedAspectRatios) // [1:1 3:4 4:3 9:16 16:9]
Streaming partial images (OpenAI gpt-image-*)
client := imageopenai.NewGeneration(
imageopenai.WithAPIKey(...),
imageopenai.WithModel(model.OpenAIImageGenerationModels[model.GPTImage15]),
imageopenai.WithStreamingOptions(imageopenai.StreamingOptions{PartialImages: 3}),
)
err := client.GenerateImageStreaming(ctx, prompt,
func(event image.StreamEvent) error {
switch event.Type {
case image.EventPartialImage:
data, _ := image.DecodeBase64Image(event.ImageBase64)
os.WriteFile(fmt.Sprintf("partial_%d.png", event.PartialImageIndex), data, 0644)
case image.EventCompleted:
data, _ := image.DecodeBase64Image(event.ImageBase64)
os.WriteFile("final.png", data, 0644)
}
return nil
},
)
Returns image.ErrStreamingNotSupported if the model can't stream.