RubyLLM 1.9.0: Tool Schemas, Prompt Caching & Transcriptions ✨🎙️

Major release that makes tool definitions feel like Ruby, lets you lean on Anthropic prompt caching everywhere, and turns audio transcription into a one-liner—plus better Gemini structured output and Nano Banana image responses.

🧰 JSON Schema Tooling That Feels Native

The new RubyLLM::Schema params DSL supports full JSON Schema for tool parameter definitions, including nested objects, arrays, enums, and nullable fields.

class Scheduler < RubyLLM::Tool
  description "Books a meeting"

  params do
    object :window, description: "Time window to reserve" do
      string :start, description: "ISO8601 start"
      string :finish, description: "ISO8601 finish"
    end

    array :participants, of: :string, description: "Email invitees"

    any_of :format, description: "Optional meeting format" do
      string enum: %w[virtual in_person]
      null
    end
  end

  def execute(window:, participants:, format: nil)
    Booking.reserve(window:, participants:, format:)
  end
end
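
Once defined, the tool plugs into a chat as usual. A quick usage sketch (the prompt is illustrative):

chat = RubyLLM.chat(model: "claude-sonnet-4-5")
chat.with_tool(Scheduler).ask("Book a 30-minute sync tomorrow with alice@example.com")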
  • Powered by RubyLLM::Schema, the same awesome Ruby DSL we recommend for Structured Output's chat.with_schema.
  • Already handles Anthropic/Gemini quirks like nullable unions and enums, so no more ad-hoc translation layers.
  • Prefer raw hashes? Pass params schema: { ... } to keep your existing JSON Schema verbatim, as sketched below.
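
Migrating from a hand-written schema? A minimal sketch of the raw-hash form (LegacyScheduler and its fields are made up for illustration):

class LegacyScheduler < RubyLLM::Tool
  description "Books a meeting from a hand-written JSON Schema"

  params schema: {
    type: "object",
    properties: {
      window: { type: "string", description: "ISO8601 time window" },
      participants: { type: "array", items: { type: "string" }, description: "Email invitees" }
    },
    required: %w[window participants]
  }

  def execute(window:, participants:)
    Booking.reserve(window:, participants:)
  end
end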

🧱 Raw Content Blocks & Anthropic Prompt Caching Everywhere

When you need to handcraft message envelopes:

chat = RubyLLM.chat(model: "claude-sonnet-4-5")
raw_request = RubyLLM::Content::Raw.new([
  { type: "text", text: File.read("prompt.md"), cache_control: { type: "ephemeral" } },
  { type: "text", text: "Summarize today’s work." }
])

chat.ask(raw_request)

We also provide a helper specifically for Anthropic Prompt Caching:

system_block = RubyLLM::Providers::Anthropic::Content.new(
  "You are our release-notes assistant.",
  cache: true
)

chat.add_message(role: :system, content: system_block)
  • RubyLLM::Content::Raw lets you ship provider-native payloads for content blocks.
  • Anthropic helpers keep cache_control hints readable while still producing the right JSON structure.
  • Every RubyLLM::Message now exposes cached_tokens and cache_creation_tokens, so you can see exactly what the provider pulled from cache versus what it had to recreate.
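
To inspect those counters, reuse the chat from the snippet above (a minimal sketch):

response = chat.ask("Summarize today's work.")
puts "from cache: #{response.cached_tokens}, written to cache: #{response.cache_creation_tokens}"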

Please run rails generate ruby_llm:upgrade_to_v1_9 in your Rails app if you're upgrading from 1.8.x.

⚙️ Tool.with_params Plays Nice with Anthropic Caching

As with raw content blocks, .with_params lets you set arbitrary parameters on a tool definition. Perfect for Anthropic’s cache_control hints.

class ChangelogTool < RubyLLM::Tool
  description "Formats commits into release notes"

  params do
    array :commits, of: :string
  end

  with_params cache_control: { type: "ephemeral" }

  def execute(commits:)
    ReleaseNotes.format(commits)
  end
end

🎙️ RubyLLM.transcribe Turns Audio into Text (With Diarization)

One method call gives you transcripts, diarized segments, and consistent token tallies across providers.

transcription = RubyLLM.transcribe(
  "all-hands.m4a",
  model: "gpt-4o-transcribe-diarize",
  language: "en",
  prompt: "Focus on action items."
)

transcription.segments.each do |segment|
  puts "#{segment['speaker']}: #{segment['text']} (#{segment['start']}s – #{segment['end']}s)"
end
  • Supports OpenAI (whisper-1, gpt-4o-transcribe, diarization variants), Gemini 2.5 Flash, and Vertex AI with the same API.
  • Optional speaker references map diarized voices to real names.

🛠️ Gemini Structured Output Fixes & Nano Banana Inline Images

We went deep on Gemini’s edges so you don’t have to.

  • Nullables and anyOf now translate cleanly, and Gemini 2.5 finally respects responseJsonSchema, so complex structured output works out of the box (see the sketch at the end of this section).
  • Parallel tool calls now come back as a single message with the correct role, which should make models more reliable at calling tools and handling their results.
  • Gemini 2.5 Flash Image (“Nano Banana”) surfaces inline images as actual attachments, ready to wire straight into your UI:

chat = RubyLLM.chat(model: "gemini-2.5-flash-image")
reply = chat.ask("Sketch a Nano Banana wearing aviators.")
image = reply.content.attachments.first
File.binwrite("nano-banana.png", image.read)

(If you missed the backstory, my blog post Nano Banana with RubyLLM has the full walkthrough.)
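
For the structured-output fix, here's a minimal sketch using RubyLLM::Schema (the schema fields are illustrative):

class ReleaseNoteSchema < RubyLLM::Schema
  string :title, description: "Release headline"
  array :highlights, of: :string, description: "Bulleted changes"
end

chat = RubyLLM.chat(model: "gemini-2.5-flash")
response = chat.with_schema(ReleaseNoteSchema).ask("Summarize this release.")
response.content # => parsed Hash matching the schema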

🗂️ Configurable Model Registry File Path

Deploying to read-only filesystems? Point RubyLLM at a writable JSON registry and keep refreshing models without hacks.

RubyLLM.models.save_to_json("/var/app/models.json")

RubyLLM.configure do |config|
  config.model_registry_file = "/var/app/models.json"
end

Just remember that RubyLLM.models.refresh! only updates the in-memory registry. To persist changes to disk, call:

RubyLLM.models.refresh!
RubyLLM.models.save_to_json
  • Plays nicely with the ActiveRecord integration (which still stores models in the DB).

Installation

gem "ruby_llm", "1.9.0"

Upgrading from 1.8.x

bundle update ruby_llm
rails generate ruby_llm:upgrade_to_v1_9

Full Changelog: 1.8.2...1.9.0
