RubyLLM 1.13: Massive Amount of Fixes + A Ton of Merged PRs šŸŽ‰šŸ¤–šŸ› ļø

This is a big stabilization release.

RubyLLM 1.13.0 ships a very large set of reliability fixes and production-grade polish across tool calling, structured output, provider configuration, retries/error classification, Rails generators, and agent lifecycle behavior.

There are also many merged PRs from the community in this cycle.

Highlights

šŸ› ļø Tool Calling: More Control + Better Real-World Failure Handling

RubyLLM now supports built-in tool control parameters and better edge-case handling.

Control tool behavior with two new options:

  • choice controls whether and how tools are called (:auto, :none, :required, or a specific tool)
  • calls controls whether the model may return one or many tool calls in a single assistant response (:one / :many), aka "parallel" tool calling

Tool-calling edge cases are also handled more robustly:

  • invalid kwargs and hallucinated/unavailable tool calls are now returned to the model as tool errors so the model can recover and retry, instead of raising application exceptions
  • streaming tool-call nil-argument handling and assistant tool-call messages with nil content were fixed, so tool-call transcripts stay valid across turns

chat = RubyLLM.chat(model: "gpt-5-nano")
  .with_tools(WeatherTool, CalculatorTool, choice: :required, calls: :one)

response = chat.ask("Use tools to estimate commute time + cost")
puts response.content

Tool Choice (choice)

Use choice to control whether the model can call tools and which one it can call.

# Model decides whether to call tools
chat.with_tools(WeatherTool, CalculatorTool, choice: :auto)

# Model must call one of the provided tools
chat.with_tools(WeatherTool, CalculatorTool, choice: :required)

# Disable tool calls
chat.with_tools(WeatherTool, CalculatorTool, choice: :none)

# Force one specific tool
chat.with_tools(WeatherTool, CalculatorTool, choice: :weather_tool)

Valid values:

  • :auto
  • :required
  • :none
  • tool name symbol/string or ToolClass

"Parallel" Tool Calling Control (calls)

Use calls to control how many tool calls the model may return in a single assistant response.

Providers usually call this parallel tool calling. We call it calls because "parallel" can be misleading: tools are not executed in parallel unless your tool executor itself is parallelized. calls describes response behavior directly.

# provider/model default behavior
chat = RubyLLM.chat.with_tools(WeatherTool, CalendarTool)

# allow multiple tool calls in one assistant response
chat = RubyLLM.chat.with_tools(WeatherTool, CalendarTool, calls: :many)

# allow one tool call in one assistant response
chat = RubyLLM.chat.with_tools(WeatherTool, CalendarTool, calls: :one)

# equivalent:
chat = RubyLLM.chat.with_tools(WeatherTool, CalendarTool, calls: 1)

Valid values:

  • :many
  • :one
  • 1

If calls is not provided, RubyLLM uses provider/model defaults, usually equivalent to calls: :many.
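
To make the semantics concrete, here is a sketch (not RubyLLM's actual implementation; the helper name is hypothetical) of how the calls values map onto a provider's boolean "parallel tool calls" request flag:

```ruby
# Hypothetical helper: normalize a calls: value to a provider-style
# boolean "parallel tool calls" flag.
def parallel_tool_calls?(calls)
  case calls
  when nil, :many then true   # default: multiple tool calls allowed per response
  when :one, 1    then false  # at most one tool call per assistant response
  else raise ArgumentError, "invalid calls value: #{calls.inspect}"
  end
end
```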

Invalid tool kwargs now return explicit tool errors

class SignatureTool < RubyLLM::Tool
  def execute(questions:)
    questions
  end
end

result = SignatureTool.new.call({ "questions" => [], "isOther" => true })
puts result
# => { error: "Invalid tool arguments: unknown keyword: isOther" }
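
The recovery pattern itself is simple Ruby. Here is an illustrative sketch (safe_call and EchoTool are hypothetical names, not RubyLLM internals) of rescuing the keyword mismatch and returning a structured error hash the model can read:

```ruby
# A minimal tool with a strict keyword signature.
class EchoTool
  def execute(questions:)
    questions
  end
end

# Hypothetical helper: convert an argument mismatch into a tool error
# hash instead of letting the exception escape into the application.
def safe_call(tool, raw_args)
  tool.execute(**raw_args.transform_keys(&:to_sym))
rescue ArgumentError => e
  { error: "Invalid tool arguments: #{e.message}" }
end

safe_call(EchoTool.new, { "questions" => [], "isOther" => true })
# returns an error hash instead of raising
```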

Hallucinated tool calls are handled gracefully

tool_results = []

chat = RubyLLM.chat.with_tool(WeatherTool)
  .on_tool_result { |result| tool_results << result }

# If the model tries to call a non-existent tool,
# RubyLLM reports a tool error and continues the conversation safely.
chat.ask("What tools do you support?")

p tool_results
# => [{ error: "Model tried to call unavailable tool `...`. Available tools: [\"weather\"]." }]

🧩 Structured Output: Expanded Coverage + Better Accuracy via Schema Names

Structured output support was expanded (including Bedrock + Anthropic), and a multi-turn structured-output regression was fixed.

class PersonSchema < RubyLLM::Schema
  string :name
  integer :age
end

chat = RubyLLM.chat(model: "claude-haiku-4-5", provider: :bedrock)
response = chat.with_schema(PersonSchema).ask("Generate a user profile")
puts response.content
# => {"name"=>"...", "age"=>...}

Schema naming also improved: RubyLLM now passes more meaningful schema names to providers, which helps the model understand the expected output structure.
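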

# RubyLLM::Schema class names are now used as schema names automatically
class InvoiceSummarySchema < RubyLLM::Schema
  string :customer
  number :total
end

response = RubyLLM.chat.with_schema(InvoiceSummarySchema).ask("Summarize this invoice")

And manual schemas can now provide explicit names:

invoice_schema = {
  name: "InvoiceSummarySchema",
  schema: {
    type: "object",
    properties: {
      customer: { type: "string" },
      total: { type: "number" }
    },
    required: ["customer", "total"],
    additionalProperties: false
  }
}

response = RubyLLM.chat.with_schema(invoice_schema).ask("Summarize this invoice")

ā˜ļø Provider Configuration Flexibility

This release adds multiple endpoint/base URL and credential options so teams can use self-hosted gateways, private routing, enterprise proxies, and compatible hosted services without patching providers.

RubyLLM.configure do |config|
  config.openrouter_api_base = ENV["OPENROUTER_API_BASE"]
  config.anthropic_api_base  = ENV["ANTHROPIC_API_BASE"]
  config.deepseek_api_base   = ENV["DEEPSEEK_API_BASE"]
  config.ollama_api_key      = ENV["OLLAMA_API_KEY"]
end

Ollama API Key support

ollama_api_key support enables authenticated/remote Ollama endpoints (including Ollama Cloud-style setups) where auth headers are required.
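
Paired with the existing ollama_api_base option, the configuration might look like this (the env var names are placeholders):

```ruby
RubyLLM.configure do |config|
  # Point at a remote/hosted Ollama endpoint...
  config.ollama_api_base = ENV["OLLAMA_API_BASE"]
  # ...and supply the API key so auth headers are sent (new in 1.13).
  config.ollama_api_key  = ENV["OLLAMA_API_KEY"]
end
```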

ā˜ļø Vertex AI: Service Account Key Support

Vertex AI authentication now supports service account keys without regressing ADC (Application Default Credentials), and scope handling for GCE credentials was fixed.

RubyLLM.configure do |config|
  config.vertexai_project_id          = ENV["GOOGLE_CLOUD_PROJECT"]
  config.vertexai_location            = ENV["GOOGLE_CLOUD_LOCATION"]
  config.vertexai_service_account_key = ENV["VERTEXAI_SERVICE_ACCOUNT_KEY"] # optional JSON key
end

šŸ” Error Handling and Retries

Error/retry behavior has been tightened for context-length and transient server cases:

  • automatic retries were effectively not working for most LLM calls because POST requests were not being retried; POST retries are now enabled
  • context-length exceeded is now detected on HTTP 400 responses
  • improved classification of context-length 429 responses
  • improved 504 classification
  • retries are enabled by default (max_retries = 3), so check your configuration to confirm this matches your desired behavior

Context-length errors raise a dedicated exception you can rescue:

begin
  RubyLLM.chat.ask("...")
rescue RubyLLM::ContextLengthExceededError
  # trim messages / reduce response size / retry
end
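
The general shape of retry-on-transient-failure is worth having in mind when tuning max_retries. This is an illustrative sketch of the pattern, not RubyLLM's internals (with_retries and TransientServerError are hypothetical names):

```ruby
# Stand-in for a retryable server-side failure (e.g. 5xx, timeout).
class TransientServerError < StandardError; end

# Hypothetical helper: retry the block up to max_retries times on
# transient errors, backing off exponentially between attempts.
def with_retries(max_retries: 3, base_delay: 0.1)
  attempts = 0
  begin
    yield
  rescue TransientServerError
    attempts += 1
    raise if attempts > max_retries
    sleep(base_delay * (2**attempts))
    retry
  end
end
```

With max_retries: 3, a request is attempted at most four times (the original call plus three retries) before the error propagates.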

šŸ¤– Agent + Rails Lifecycle Fixes

Agent and Rails-backed chat behavior received important fixes:

  • runtime agent instructions now persist correctly across to_llm rebuilds
  • missing prompts now raise RubyLLM::PromptNotFoundError
  • Rails install flow now separates schema migration from model data loading (v1.13+)
  • Rails docs now include fiber-safe ActiveRecord isolation guidance for async/fiber-heavy workloads (config.active_support.isolation_level = :fiber)
  • generator and migration naming fixes (including acronym model classes)
  • chat UI streaming preserves whitespace chunks correctly

Rails setup now looks like:

rails generate ruby_llm:install
rails db:migrate
rails ruby_llm:load_models

Missing prompts raise an explicit error you can rescue:

begin
  SupportAgent.new.ask("Help me with this request")
rescue RubyLLM::PromptNotFoundError => e
  puts e.message
end

Performance & DX Polishes

  • lazy block-style debug logging to reduce allocations when debug logging is disabled
  • configurable log_regexp_timeout
  • rubocop/lint/test stability improvements
  • model matrix/docs refreshes (including newer xAI model IDs and image-generation coverage updates)
  • obsolete codecov gem removed
  • docs and model listings refreshed
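
The lazy logging change follows standard Ruby Logger semantics: with the block form, the message string is only built when the logger's level actually permits it. A self-contained demonstration (not RubyLLM code):

```ruby
require "logger"

logger = Logger.new(File::NULL)
logger.level = Logger::INFO  # DEBUG messages are filtered out

built = 0
build_payload = -> { built += 1; "expensive payload" }

# Eager form: the string is interpolated (and the lambda runs)
# even though the message is then discarded.
logger.debug("payload: #{build_payload.call}")

# Block form: the block never runs when DEBUG is disabled,
# so no allocation happens.
logger.debug { "payload: #{build_payload.call}" }

built  # == 1: only the eager call ran the lambda
```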

Installation

gem "ruby_llm", "1.13.0"

Upgrading from 1.12.x

bundle update ruby_llm

Merged PRs

  • Fix POST retries and 504 retry classification by @crmne in #624
  • Fix streaming to preserve whitespace chunks in chat UI template by @kryzhovnik in #636
  • Fix migration class name for model names with acronyms (e.g. model:AIModel) by @Saidbek in #640
  • Remove dependency on obsolete 'codecov' gem by @mvz in #625
  • Detect context length exceeded errors on HTTP 400 responses by @plehoux in #642
  • Use UTC for created_at in order to prevent diff noise when running models:update from a different timezone by @radanskoric in #631
  • Adds opentelemetry-instrumentation-ruby_llm to the ecosystem by @clarissalimab in #599
  • Add configurable Anthropic API base URL by @ericproulx in #589
  • Add ollama_api_key support for remote Ollama endpoints by @geeksilva97 in #612
  • Add Anthropic structured output support by @hiasinho in #608
  • Add Bedrock structured output support by @llenodo in #619
  • Add thought signature support for Google Gemini OpenAI compatibility by @ericproulx in #588
  • Data loss in cleanup_orphaned_tool_results with custom association by @bschmeck in #584
  • Handle tool hallucination gracefully by @redox in #580
  • Fix streaming tool call nil arguments by @afurm in #587
  • Preserves assume_model_exists in to_llm for custom models by @creaumond in #564
  • Fix paint not working with OpenRouter provider (#513) by @khasinski in #558
  • Allow DeepSeek api base override by @flyerhzm in #575
  • Add specs for v1.7/v1.9 upgrade generators by @afurm in #539
  • chore: update GitHub Actions to latest major versions by @seuros in #534
  • Fix structured output multi-turn conversation error by @alexey-hunter-io in #531
  • Fix acts_as_tool_call message: option (#514) by @saurabh-sikchi in #515
  • feat: Allow configuring OpenRouter API base via openrouter_api_base s… by @graysonchen in #381
  • Add built-in support for tool control parameters by @nwumnn in #347
  • Adding option for configuring custom log Regexp timeout by @nina-instrumentl in #364
  • Fix Openrouter Error Parser to Handle Detailed Error Messages by @xiaohui-zhangxh in #431
  • Add custom schema name support by @llenodo in #476
  • Adding RubyLLM::RedCandle to the ecosystem page (documentation only) by @cpetersen in #535
  • [docs] Add RubyLLM::Instrumentation and RubyLLM::Monitoring to ecosystem by @patriciomacadden in #569
  • Fix VertexAI scope argument for GCE credentials by @jesse-spevack in #520
  • VertexAI: service account key support without ADC regressions by @crmne in #646

New Contributors

Full Changelog: 1.12.1...1.13.0
