Image Generation with Responses API
This guide demonstrates how to build an AI-powered application that combines web search and image generation capabilities using the Responses API.
Overview
The Responses API enables you to create multi-tool workflows that combine different capabilities. In this example, we'll:
- Use web search to find current information
- Generate an image based on that information
- Process and save the generated image
Prerequisites
Required Dependencies
Add these dependencies to your Cargo.toml:
[dependencies]
vllora_llm = "0.1.17"
tokio = { version = "1", features = ["full"] }
serde_json = "1.0"
base64 = "0.22"
Environment Setup
Set your API key as an environment variable:
export VLLORA_OPENAI_API_KEY="your-api-key-here"
Note: Make sure to keep your API key secure. Never commit it to version control or expose it in client-side code.
Building the Request
Creating the CreateResponse Structure
We'll create a request that uses both web search and image generation tools:
use vllora_llm::async_openai::types::responses::CreateResponse;
use vllora_llm::async_openai::types::responses::ImageGenTool;
use vllora_llm::async_openai::types::responses::InputParam;
use vllora_llm::async_openai::types::responses::Tool;
use vllora_llm::async_openai::types::responses::WebSearchTool;
let responses_req = CreateResponse {
    model: Some("gpt-4.1".to_string()),
    input: InputParam::Text(
        "Search for the latest news from today and generate an image about it".to_string(),
    ),
    tools: Some(vec![
        Tool::WebSearch(WebSearchTool::default()),
        Tool::ImageGeneration(ImageGenTool::default()),
    ]),
    ..Default::default()
};
Understanding the Components
Model Selection - We're using "gpt-4.1", which supports both the Responses API and tool calling. If you substitute another model, make sure it also supports these features.
Input Parameter - We use InputParam::Text to provide a simple text prompt. The model will:
- First use the web search tool to find current news
- Then use the image generation tool to create an image related to that news
Tool Configuration - We specify two tools:
- WebSearchTool::default() - Uses the default web search configuration
- ImageGenTool::default() - Uses the default image generation settings
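Because tools is an Option<Vec<Tool>>, the list can also be assembled dynamically. Here's a minimal sketch; the want_images flag is a hypothetical application setting, not part of the vllora_llm API:
// Build the tool list conditionally. `want_images` is a hypothetical
// application flag, not something vllora_llm defines.
let want_images = true;
let mut tools = vec![Tool::WebSearch(WebSearchTool::default())];
if want_images {
    tools.push(Tool::ImageGeneration(ImageGenTool::default()));
}
// Pass `tools: Some(tools)` when constructing the CreateResponse.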
Initializing the Client
Set up the Vllora LLM client with your credentials:
use vllora_llm::client::VlloraLLMClient;
use vllora_llm::types::credentials::ApiKeyCredentials;
use vllora_llm::types::credentials::Credentials;
let client = VlloraLLMClient::default()
    .with_credentials(Credentials::ApiKey(ApiKeyCredentials {
        api_key: std::env::var("VLLORA_OPENAI_API_KEY")
            .expect("VLLORA_OPENAI_API_KEY must be set"),
    }));
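The .expect() call panics if the variable is missing, which is acceptable for an example. For a friendlier startup failure, you can handle the missing variable yourself; a small sketch using only the standard library:
// Exit with a readable message instead of panicking when the key is absent.
let api_key = match std::env::var("VLLORA_OPENAI_API_KEY") {
    Ok(key) => key,
    Err(_) => {
        eprintln!("error: the VLLORA_OPENAI_API_KEY environment variable is not set");
        std::process::exit(1);
    }
};
let client = VlloraLLMClient::default()
    .with_credentials(Credentials::ApiKey(ApiKeyCredentials { api_key }));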
Processing Text Messages
Extract and display text content from the response:
use vllora_llm::async_openai::types::responses::OutputItem;
use vllora_llm::async_openai::types::responses::OutputMessageContent;
for output in &response.output {
    match output {
        OutputItem::Message(message) => {
            for content in &message.content {
                match content {
                    OutputMessageContent::OutputText(text_output) => {
                        // Print the text content
                        println!("\n{}", text_output.text);
                        // Print sources/annotations if available
                        if !text_output.annotations.is_empty() {
                            println!("Annotations: {:#?}", text_output.annotations);
                        }
                    }
                    _ => {
                        println!("Other content type: {:?}", content);
                    }
                }
            }
        }
        // Keep the match exhaustive; other output types are handled later
        _ => {}
    }
}
Annotations - Text outputs can include annotations which provide:
- Citations and sources (especially useful with web search)
- References to tool calls
- Additional metadata
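If you prefer one line per source over a single Debug dump, you can iterate the collection directly. A sketch that relies only on the annotations being iterable and Debug-printable (their concrete type isn't shown in this guide):
// Print each annotation on its own line via its Debug representation.
for (i, annotation) in text_output.annotations.iter().enumerate() {
    println!("  source {}: {:?}", i + 1, annotation);
}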
Handling Image Generation Results
When the model uses the image generation tool, the response includes OutputItem::ImageGenerationCall variants. Each call contains:
- A result field with the base64-encoded image data
- Metadata about the generation
Decoding and Saving Images
Here's a complete function to decode and save generated images:
use vllora_llm::async_openai::types::responses::ImageGenToolCall;
use base64::{engine::general_purpose::STANDARD, Engine as _};
use std::fs;
/// Decodes a base64-encoded image from an ImageGenerationCall and saves it to a file.
///
/// # Arguments
/// * `image_generation_call` - The image generation call containing the base64-encoded image
/// * `index` - The index to use in the filename
///
/// # Returns
/// * `Ok(filename)` - The filename where the image was saved
/// * `Err(e)` - An error if the call has no result, decoding fails, or file writing fails
fn decode_and_save_image(
    image_generation_call: &ImageGenToolCall,
    index: usize,
) -> Result<String, Box<dyn std::error::Error>> {
    // Extract base64 image from the call
    let base64_image = image_generation_call
        .result
        .as_ref()
        .ok_or("Image generation call has no result")?;
    // Decode base64 image
    let image_data = STANDARD.decode(base64_image)?;
    // Save to file
    let filename = format!("generated_image_{}.png", index);
    fs::write(&filename, image_data)?;
    Ok(filename)
}
Step-by-Step Breakdown
- Extract Base64 Data - We access the result field, which is an Option<String>, and use .ok_or() to convert None into an error if the result is missing.
- Decode Base64 - The base64 crate's STANDARD engine decodes the base64 string into raw bytes. This can fail if the string is malformed, so we use ? to propagate errors.
- Save to File - We use Rust's standard library fs::write() to save the decoded bytes to a file. We name it generated_image_{index}.png to avoid conflicts when multiple images are generated (see the optional sanity-check sketch below).
Complete Example
Here's the complete working example that puts it all together:
use vllora_llm::async_openai::types::responses::CreateResponse;
use vllora_llm::async_openai::types::responses::ImageGenTool;
use vllora_llm::async_openai::types::responses::ImageGenToolCall;
use vllora_llm::async_openai::types::responses::InputParam;
use vllora_llm::async_openai::types::responses::OutputItem;
use vllora_llm::async_openai::types::responses::OutputMessageContent;
use vllora_llm::async_openai::types::responses::Tool;
use vllora_llm::async_openai::types::responses::WebSearchTool;
use base64::{engine::general_purpose::STANDARD, Engine as _};
use std::fs;
use vllora_llm::client::VlloraLLMClient;
use vllora_llm::error::LLMResult;
use vllora_llm::types::credentials::ApiKeyCredentials;
use vllora_llm::types::credentials::Credentials;
fn decode_and_save_image(
    image_generation_call: &ImageGenToolCall,
    index: usize,
) -> Result<String, Box<dyn std::error::Error>> {
    let base64_image = image_generation_call
        .result
        .as_ref()
        .ok_or("Image generation call has no result")?;
    let image_data = STANDARD.decode(base64_image)?;
    let filename = format!("generated_image_{}.png", index);
    fs::write(&filename, image_data)?;
    Ok(filename)
}

#[tokio::main]
async fn main() -> LLMResult<()> {
    // 1) Build a Responses-style request using async-openai-compat types
    //    with tools for web_search_preview and image_generation
    let responses_req = CreateResponse {
        model: Some("gpt-4.1".to_string()),
        input: InputParam::Text(
            "Search for the latest news from today and generate an image about it".to_string(),
        ),
        tools: Some(vec![
            Tool::WebSearch(WebSearchTool::default()),
            Tool::ImageGeneration(ImageGenTool::default()),
        ]),
        ..Default::default()
    };

    // 2) Construct a VlloraLLMClient
    let client =
        VlloraLLMClient::default().with_credentials(Credentials::ApiKey(ApiKeyCredentials {
            api_key: std::env::var("VLLORA_OPENAI_API_KEY")
                .expect("VLLORA_OPENAI_API_KEY must be set"),
        }));

    // 3) Non-streaming: send the request and print the final reply
    println!("Sending request with tools: web_search_preview and image_generation");
    let response = client.responses().create(responses_req).await?;

    println!("\nNon-streaming reply:");
    println!("{}", "=".repeat(80));
    for (index, output) in response.output.iter().enumerate() {
        match output {
            OutputItem::ImageGenerationCall(image_generation_call) => {
                println!("\n[Image Generation Call {}]", index);
                match decode_and_save_image(image_generation_call, index) {
                    Ok(filename) => {
                        println!("✓ Successfully saved image to: {}", filename);
                    }
                    Err(e) => {
                        eprintln!("✗ Failed to decode/save image: {}", e);
                    }
                }
            }
            OutputItem::Message(message) => {
                println!("\n[Message {}]", index);
                println!("{}", "-".repeat(80));
                for content in &message.content {
                    match content {
                        OutputMessageContent::OutputText(text_output) => {
                            // Print the text content
                            println!("\n{}", text_output.text);
                            // Print sources/annotations if available
                            if !text_output.annotations.is_empty() {
                                println!("Annotations: {:#?}", text_output.annotations);
                            }
                        }
                        _ => {
                            println!("Other content type: {:?}", content);
                        }
                    }
                }
                println!("\n{}", "=".repeat(80));
            }
            _ => {
                println!("\n[Other Output {}]", index);
                println!("{:?}", output);
            }
        }
    }

    Ok(())
}
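To run the complete example (assuming it lives in src/main.rs of a binary crate):
export VLLORA_OPENAI_API_KEY="your-api-key-here"
cargo run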
Expected Output
When you run this example, you'll see output like:
Sending request with tools: web_search_preview and image_generation
Non-streaming reply:
================================================================================
[Message 0]
--------------------------------------------------------------------------------
Here's the latest news from today: [summary of current news]
Annotations: [citations and sources from web search]
================================================================================
[Image Generation Call 1]
✓ Successfully saved image to: generated_image_1.png
The actual news content and image will vary based on what's happening when you run it!
Execution Flow
- Request Construction - We build a CreateResponse with our prompt and tools
- Client Initialization - We create and configure the Vllora LLM client
- API Call - We send the request and await the response
- Response Processing - We iterate through output items:
  - Handle image generation calls by decoding and saving
  - Display text messages with annotations
  - Handle any other output types
- File Output - Generated images are saved to disk as PNG files
Summary
This example demonstrates how to use the Responses API to create multi-tool workflows that combine web search and image generation. The key steps are:
- Build a CreateResponse request with the desired tools (WebSearchTool and ImageGenTool)
- Initialize the VlloraLLMClient with your API credentials
- Send the request and receive structured outputs
- Process different output types: extract text from OutputItem::Message and decode base64 images from OutputItem::ImageGenerationCall
- Save decoded images to disk using standard Rust file I/O
The Responses API enables powerful, structured workflows that go beyond simple text completions, making it ideal for building applications that need to orchestrate multiple AI capabilities.