RubyLLM 1.9.0: Tool Schemas, Prompt Caching & Transcriptions ✨🎙️
Major release that makes tool definitions feel like Ruby, lets you lean on Anthropic prompt caching everywhere, and turns audio transcription into a one-liner—plus better Gemini structured output and Nano Banana image responses.
🧰 JSON Schema Tooling That Feels Native
The new RubyLLM::Schema params DSL supports full JSON Schema for tool parameter definitions, including nested objects, arrays, enums, and nullable fields.
```ruby
class Scheduler < RubyLLM::Tool
  description "Books a meeting"
  params do
    object :window, description: "Time window to reserve" do
      string :start, description: "ISO8601 start"
      string :finish, description: "ISO8601 finish"
    end
    array :participants, of: :string, description: "Email invitees"
    any_of :format, description: "Optional meeting format" do
      string enum: %w[virtual in_person]
      null
    end
  end
  def execute(window:, participants:, format: nil)
    Booking.reserve(window:, participants:, format:)
  end
end
```

- Powered by `RubyLLM::Schema`, the same awesome Ruby DSL we recommend for Structured Output's `chat.with_schema`.
- Already handles Anthropic/Gemini quirks like nullable unions and enums, so there's no need for ad-hoc translation layers.
- Prefer raw hashes? Pass `params schema: { ... }` to keep your existing JSON Schema verbatim; see the sketch below.
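If you'd rather see the raw-hash form, here's a minimal sketch (the tool and its fields are hypothetical):

```ruby
# Hypothetical tool passing a verbatim JSON Schema hash instead of the DSL.
class EchoTool < RubyLLM::Tool
  description "Echoes a message back"
  params schema: {
    type: "object",
    properties: {
      message: { type: "string", description: "Text to echo" }
    },
    required: ["message"]
  }

  def execute(message:)
    message
  end
end
```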
🧱 Raw Content Blocks & Anthropic Prompt Caching Everywhere
When you need to handcraft message envelopes:
chat = RubyLLM.chat(model: "claude-sonnet-4-5")
raw_request = RubyLLM::Content::Raw.new([
  { type: "text", text: File.read("prompt.md"), cache_control: { type: "ephemeral" } },
  { type: "text", text: "Summarize today’s work." }
])
chat.ask(raw_request)
```

We also provide a helper specifically for Anthropic prompt caching:

```ruby
system_block = RubyLLM::Providers::Anthropic::Content.new(
  "You are our release-notes assistant.",
  cache: true
)
chat.add_message(role: :system, content: system_block)
```

- `RubyLLM::Content::Raw` lets you ship provider-native payloads for content blocks.
- Anthropic helpers keep `cache_control` hints readable while still producing the right JSON structure.
- Every `RubyLLM::Message` now exposes `cached_tokens` and `cache_creation_tokens`, so you can see exactly what the provider pulled from cache versus what it had to recreate; a quick sketch follows below.
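Continuing the chat from the example above, here's a minimal sketch reading those counters off the response:

```ruby
# Ask again and inspect cache usage on the returned RubyLLM::Message.
response = chat.ask("Now summarize this week's work.")
puts "Read from cache:  #{response.cached_tokens}"
puts "Written to cache: #{response.cache_creation_tokens}"
```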
If you're coming from 1.8.x, please run `rails generate ruby_llm:upgrade_to_v1_9` in your Rails app.
⚙️ Tool.with_params Plays Nice with Anthropic Caching
As with raw content blocks, `.with_params` lets you set arbitrary params in tool definitions. Perfect for Anthropic's `cache_control` hints.
```ruby
class ChangelogTool < RubyLLM::Tool
  description "Formats commits into release notes"
  params do
    array :commits, of: :string
  end
  with_params cache_control: { type: "ephemeral" }
  def execute(commits:)
    ReleaseNotes.format(commits)
  end
end
```

🎙️ RubyLLM.transcribe Turns Audio into Text (With Diarization)
One method call gives you transcripts, diarized segments, and consistent token tallies across providers.
```ruby
transcription = RubyLLM.transcribe(
  "all-hands.m4a",
  model: "gpt-4o-transcribe-diarize",
  language: "en",
  prompt: "Focus on action items."
)
transcription.segments.each do |segment|
  puts "#{segment['speaker']}: #{segment['text']} (#{segment['start']}s – #{segment['end']}s)"
end
```

- Supports OpenAI (`whisper-1`, `gpt-4o-transcribe`, diarization variants), Gemini 2.5 Flash, and Vertex AI with the same API.
- Optional speaker references map diarized voices to real names; see the post-processing sketch below.
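The segment shape above is also easy to post-process yourself. A hedged sketch (the label-to-name hash is app-level and hypothetical, separate from the built-in speaker references):

```ruby
# Group diarized segments per speaker and map labels to display names.
names = { "speaker_0" => "Alice", "speaker_1" => "Bob" } # hypothetical labels
transcription.segments.group_by { |s| s["speaker"] }.each do |speaker, segments|
  line = segments.map { |s| s["text"] }.join(" ")
  puts "#{names.fetch(speaker, speaker)}: #{line}"
end
```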
 
🛠️ Gemini Structured Output Fixes & Nano Banana Inline Images
We went deep on Gemini’s edges so you don’t have to.
- Nullables and `anyOf` now translate cleanly, and Gemini 2.5 finally respects `responseJsonSchema`, so complex structured output works out of the box; see the sketch after the image example below.
- Parallel tool calls now return a single message with the right role. This should improve the model's accuracy in using and responding to tool calls.
- Gemini 2.5 Flash Image ("Nano Banana") surfaces inline images as actual attachments—pair it with your UI immediately.
 
chat = RubyLLM.chat(model: "gemini-2.5-flash-image")
reply = chat.ask("Sketch a Nano Banana wearing aviators.")
image = reply.content.attachments.first
File.binwrite("nano-banana.png", image.read)
```

(If you missed the backstory, my blog post Nano Banana with RubyLLM has the full walkthrough.)
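To try the structured output fixes, here's a hedged sketch pairing `chat.with_schema` with a `RubyLLM::Schema` class that uses a nullable union (the class and field names are made up):

```ruby
# Hypothetical schema exercising the nullable anyOf handling fixed in this release.
class ReleaseNoteSchema < RubyLLM::Schema
  string :title, description: "One-line summary"
  any_of :severity, description: "Optional severity label" do
    string enum: %w[major minor patch]
    null
  end
end

chat = RubyLLM.chat(model: "gemini-2.5-flash")
reply = chat.with_schema(ReleaseNoteSchema).ask("Classify this change: fixed a README typo")
puts reply.content # structured output matching the schema; severity may be null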
🗂️ Configurable Model Registry File Path
Deploying to read-only filesystems? Point RubyLLM at a writable JSON registry and keep refreshing models without hacks.
```ruby
RubyLLM.models.save_to_json("/var/app/models.json")
RubyLLM.configure do |config|
  config.model_registry_file = "/var/app/models.json"
end
```

Just remember that `RubyLLM.models.refresh!` only updates the in-memory registry. To persist changes to disk, call:

```ruby
RubyLLM.models.refresh!
RubyLLM.models.save_to_json
```

- Plays nicely with the ActiveRecord integration (which still stores models in the DB).
 
Installation
gem "ruby_llm", "1.9.0"Upgrading from 1.8.x
bundle update ruby_llm
rails generate ruby_llm:upgrade_to_v1_9`Merged PRs
- Feat: Support Gemini's Different API versions by @thefishua in #444
 
New Contributors
- @thefishua made their first contribution in #444
 
Full Changelog: 1.8.2...1.9.0