RubyLLM 1.13: Massive Amount of Fixes + A Ton of Merged PRs šŸŽ‰šŸ¤–šŸ› ļø

This is a big stabilization release.

RubyLLM 1.13.0 ships a very large set of reliability fixes and production-grade polish across tool calling, structured output, provider configuration, retries/error classification, Rails generators, and agent lifecycle behavior.

There are also many merged PRs from the community in this cycle.

Highlights

šŸ› ļø Tool Calling: More Control + Better Real-World Failure Handling

RubyLLM now supports built-in tool control parameters and better edge-case handling.

Control tool behavior with two new options:

  • choice controls whether and how tools are called (:auto, :none, :required, or a specific tool)
  • calls controls whether the model may return one or many tool calls in a single assistant response (:one / :many), aka "parallel" tool calling

Tool-calling edge cases are also handled more robustly:

  • invalid kwargs and hallucinated/unavailable tool calls are now returned to the model as tool errors so the model can recover and retry, instead of raising application exceptions
  • streaming tool-call nil-argument handling and assistant tool-call messages with nil content were fixed, so tool-call transcripts stay valid across turns

chat = RubyLLM.chat(model: "gpt-5-nano")
  .with_tools(WeatherTool, CalculatorTool, choice: :required, calls: :one)

response = chat.ask("Use tools to estimate commute time + cost")
puts response.content

Tool Choice (choice)

Use choice to control whether the model can call tools and which one it can call.

# Model decides whether to call tools
chat.with_tools(WeatherTool, CalculatorTool, choice: :auto)

# Model must call one of the provided tools
chat.with_tools(WeatherTool, CalculatorTool, choice: :required)

# Disable tool calls
chat.with_tools(WeatherTool, CalculatorTool, choice: :none)

# Force one specific tool
chat.with_tools(WeatherTool, CalculatorTool, choice: :weather_tool)

Valid values:

  • :auto
  • :required
  • :none
  • tool name symbol/string or ToolClass

"Parallel" Tool Calling Control (calls)

Use calls to control how many tool calls the model may return in a single assistant response.

Providers usually call this parallel tool calling. We call it calls because "parallel" can be misleading: tools are not executed in parallel unless your tool executor itself is parallelized. calls describes response behavior directly.

# provider/model default behavior
chat = RubyLLM.chat.with_tools(WeatherTool, CalendarTool)

# allow multiple tool calls in one assistant response
chat = RubyLLM.chat.with_tools(WeatherTool, CalendarTool, calls: :many)

# allow one tool call in one assistant response
chat = RubyLLM.chat.with_tools(WeatherTool, CalendarTool, calls: :one)

# equivalent:
chat = RubyLLM.chat.with_tools(WeatherTool, CalendarTool, calls: 1)

Valid values:

  • :many
  • :one
  • 1

If calls is not provided, RubyLLM uses provider/model defaults, usually equivalent to calls: :many.
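
To make the semantics concrete, here is a sketch (not RubyLLM's actual implementation; the helper name is hypothetical) of how the calls values map onto a provider's boolean "parallel tool calls" request flag:

```ruby
# Hypothetical helper: normalize a calls: value to a provider-style
# boolean "parallel tool calls" flag.
def parallel_tool_calls?(calls)
  case calls
  when nil, :many then true   # default: multiple tool calls allowed per response
  when :one, 1    then false  # at most one tool call per assistant response
  else raise ArgumentError, "invalid calls value: #{calls.inspect}"
  end
end
```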

Invalid tool kwargs now return explicit tool errors

class SignatureTool < RubyLLM::Tool
  def execute(questions:)
    questions
  end
end

result = SignatureTool.new.call({ "questions" => [], "isOther" => true })
puts result
# => { error: "Invalid tool arguments: unknown keyword: isOther" }
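
The recovery pattern itself is simple Ruby. Here is an illustrative sketch (safe_call and EchoTool are hypothetical names, not RubyLLM internals) of rescuing the keyword mismatch and returning a structured error hash the model can read:

```ruby
# A minimal tool with a strict keyword signature.
class EchoTool
  def execute(questions:)
    questions
  end
end

# Hypothetical helper: convert an argument mismatch into a tool error
# hash instead of letting the exception escape into the application.
def safe_call(tool, raw_args)
  tool.execute(**raw_args.transform_keys(&:to_sym))
rescue ArgumentError => e
  { error: "Invalid tool arguments: #{e.message}" }
end

safe_call(EchoTool.new, { "questions" => [], "isOther" => true })
# returns an error hash instead of raising
```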

Hallucinated tool calls are handled gracefully

tool_results = []

chat = RubyLLM.chat.with_tool(WeatherTool)
  .on_tool_result { |result| tool_results << result }

# If the model tries to call a non-existent tool,
# RubyLLM reports a tool error and continues the conversation safely.
chat.ask("What tools do you support?")

p tool_results
# => [{ error: "Model tried to call unavailable tool `...`. Available tools: [\"weather\"]." }]

🧩 Structured Output: Expanded Coverage + Better Accuracy via Schema Names

Structured output support was expanded (including Bedrock + Anthropic), and a multi-turn structured-output regression was fixed.

class PersonSchema < RubyLLM::Schema
  string :name
  integer :age
end

chat = RubyLLM.chat(model: "claude-haiku-4-5", provider: :bedrock)
response = chat.with_schema(PersonSchema).ask("Generate a user profile")
puts response.content
# => {"name"=>"...", "age"=>...}

Schema naming also improved: RubyLLM now passes more meaningful schema names to providers, which helps the model understand the expected output structure.
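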

# RubyLLM::Schema class names are now used as schema names automatically
class InvoiceSummarySchema < RubyLLM::Schema
  string :customer
  number :total
end

response = RubyLLM.chat.with_schema(InvoiceSummarySchema).ask("Summarize this invoice")

And manual schemas can now provide explicit names:

invoice_schema = {
  name: "InvoiceSummarySchema",
  schema: {
    type: "object",
    properties: {
      customer: { type: "string" },
      total: { type: "number" }
    },
    required: ["customer", "total"],
    additionalProperties: false
  }
}

response = RubyLLM.chat.with_schema(invoice_schema).ask("Summarize this invoice")

ā˜ļø Provider Configuration Flexibility

This release adds multiple endpoint/base URL and credential options so teams can use self-hosted gateways, private routing, enterprise proxies, and compatible hosted services without patching providers.

RubyLLM.configure do |config|
  config.openrouter_api_base = ENV["OPENROUTER_API_BASE"]
  config.anthropic_api_base  = ENV["ANTHROPIC_API_BASE"]
  config.deepseek_api_base   = ENV["DEEPSEEK_API_BASE"]
  config.ollama_api_key      = ENV["OLLAMA_API_KEY"]
end

Ollama API Key support

ollama_api_key support enables authenticated/remote Ollama endpoints (including Ollama Cloud-style setups) where auth headers are required.
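
Paired with the existing ollama_api_base option, the configuration might look like this (the env var names are placeholders):

```ruby
RubyLLM.configure do |config|
  # Point at a remote/hosted Ollama endpoint...
  config.ollama_api_base = ENV["OLLAMA_API_BASE"]
  # ...and supply the API key so auth headers are sent (new in 1.13).
  config.ollama_api_key  = ENV["OLLAMA_API_KEY"]
end
```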

ā˜ļø Vertex AI: Service Account Key Support

Vertex AI authentication now supports service account keys without regressing ADC (Application Default Credentials), and scope handling for GCE credentials was fixed.

RubyLLM.configure do |config|
  config.vertexai_project_id          = ENV["GOOGLE_CLOUD_PROJECT"]
  config.vertexai_location            = ENV["GOOGLE_CLOUD_LOCATION"]
  config.vertexai_service_account_key = ENV["VERTEXAI_SERVICE_ACCOUNT_KEY"] # optional JSON key
end

šŸ” Error Handling and Retries

Error/retry behavior has been tightened for context-length and transient server cases:

  • automatic retries were effectively not working for most LLM calls because POST requests were not being retried; POST retries are now enabled
  • context-length exceeded is now detected on HTTP 400 responses
  • improved classification of context-length 429 responses
  • improved 504 classification
  • retries are enabled by default (max_retries = 3), so check your configuration to confirm this matches your desired behavior

Context-length errors raise a dedicated exception you can rescue:

begin
  RubyLLM.chat.ask("...")
rescue RubyLLM::ContextLengthExceededError
  # trim messages / reduce response size / retry
end
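
The general shape of retry-on-transient-failure is worth having in mind when tuning max_retries. This is an illustrative sketch of the pattern, not RubyLLM's internals (with_retries and TransientServerError are hypothetical names):

```ruby
# Stand-in for a retryable server-side failure (e.g. 5xx, timeout).
class TransientServerError < StandardError; end

# Hypothetical helper: retry the block up to max_retries times on
# transient errors, backing off exponentially between attempts.
def with_retries(max_retries: 3, base_delay: 0.1)
  attempts = 0
  begin
    yield
  rescue TransientServerError
    attempts += 1
    raise if attempts > max_retries
    sleep(base_delay * (2**attempts))
    retry
  end
end
```

With max_retries: 3, a request is attempted at most four times (the original call plus three retries) before the error propagates.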

šŸ¤– Agent + Rails Lifecycle Fixes

Agent and Rails-backed chat behavior received important fixes:

  • runtime agent instructions now persist correctly across to_llm rebuilds
  • missing prompts now raise RubyLLM::PromptNotFoundError
  • Rails install flow now separates schema migration from model data loading (v1.13+)
  • Rails docs now include fiber-safe ActiveRecord isolation guidance for async/fiber-heavy workloads (config.active_support.isolation_level = :fiber)
  • generator and migration naming fixes (including acronym model classes)
  • chat UI streaming preserves whitespace chunks correctly

Rails setup now looks like:

rails generate ruby_llm:install
rails db:migrate
rails ruby_llm:load_models

Missing prompts raise an explicit error you can rescue:

begin
  SupportAgent.new.ask("Help me with this request")
rescue RubyLLM::PromptNotFoundError => e
  puts e.message
end

Performance & DX Polishes

  • lazy block-style debug logging to reduce allocations when debug logging is disabled
  • configurable log_regexp_timeout
  • rubocop/lint/test stability improvements
  • model matrix/docs refreshes (including newer xAI model IDs and image-generation coverage updates)
  • obsolete codecov gem removed
  • docs and model listings refreshed
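
The lazy logging change follows standard Ruby Logger semantics: with the block form, the message string is only built when the logger's level actually permits it. A self-contained demonstration (not RubyLLM code):

```ruby
require "logger"

logger = Logger.new(File::NULL)
logger.level = Logger::INFO  # DEBUG messages are filtered out

built = 0
build_payload = -> { built += 1; "expensive payload" }

# Eager form: the string is interpolated (and the lambda runs)
# even though the message is then discarded.
logger.debug("payload: #{build_payload.call}")

# Block form: the block never runs when DEBUG is disabled,
# so no allocation happens.
logger.debug { "payload: #{build_payload.call}" }

built  # == 1: only the eager call ran the lambda
```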

Installation

gem "ruby_llm", "1.13.0"

Upgrading from 1.12.x

bundle update ruby_llm

Merged PRs

  • Fix POST retries and 504 retry classification by @crmne in #624
  • Fix streaming to preserve whitespace chunks in chat UI template by @kryzhovnik in #636
  • Fix migration class name for model names with acronyms (e.g. model:AIModel) by @Saidbek in #640
  • Remove dependency on obsolete 'codecov' gem by @mvz in #625
  • Detect context length exceeded errors on HTTP 400 responses by @plehoux in #642
  • Use UTC for created_at in order to prevent diff noise when running models:update from a different timezone by @radanskoric in #631
  • Adds opentelemetry-instrumentation-ruby_llm to the ecosystem by @clarissalimab in #599
  • Add configurable Anthropic API base URL by @ericproulx in #589
  • Add ollama_api_key support for remote Ollama endpoints by @geeksilva97 in #612
  • Add Anthropic structured output support by @hiasinho in #608
  • Add Bedrock structured output support by @llenodo in #619
  • Add thought signature support for Google Gemini OpenAI compatibility by @ericproulx in #588
  • Data loss in cleanup_orphaned_tool_results with custom association by @bschmeck in #584
  • Handle tool hallucination gracefully by @redox in #580
  • Fix streaming tool call nil arguments by @afurm in #587
  • Preserves assume_model_exists in to_llm for custom models by @creaumond in #564
  • Fix paint not working with OpenRouter provider (#513) by @khasinski in #558
  • Allow DeepSeek api base override by @flyerhzm in #575
  • Add specs for v1.7/v1.9 upgrade generators by @afurm in #539
  • chore: update GitHub Actions to latest major versions by @seuros in #534
  • Fix structured output multi-turn conversation error by @alexey-hunter-io in #531
  • Fix acts_as_tool_call message: option (#514) by @saurabh-sikchi in #515
  • feat: Allow configuring OpenRouter API base via openrouter_api_base s… by @graysonchen in #381
  • Add built-in support for tool control parameters by @nwumnn in #347
  • Adding option for configuring custom log Regexp timeout by @nina-instrumentl in #364
  • Fix Openrouter Error Parser to Handle Detailed Error Messages by @xiaohui-zhangxh in #431
  • Add custom schema name support by @llenodo in #476
  • Adding RubyLLM::RedCandle to the ecosystem page (documentation only) by @cpetersen in #535
  • [docs] Add RubyLLM::Instrumentation and RubyLLM::Monitoring to ecosystem by @patriciomacadden in #569
  • Fix VertexAI scope argument for GCE credentials by @jesse-spevack in #520
  • VertexAI: service account key support without ADC regressions by @crmne in #646

New Contributors

Full Changelog: 1.12.1...1.13.0
