RubyLLM 1.13: Massive Amount of Fixes + A Ton of Merged PRs
This is a big stabilization release.
RubyLLM 1.13.0 ships a very large set of reliability fixes and production-grade polish across tool calling, structured output, provider configuration, retries/error classification, Rails generators, and agent lifecycle behavior.
There are also many merged PRs from the community in this cycle.
Highlights
Tool Calling: More Control + Better Real-World Failure Handling
RubyLLM now supports built-in tool control parameters and better edge-case handling.
Control tool behavior with two options:
- `choice` to control whether/how tools are called (`:auto`, `:none`, `:required`, or a specific tool)
- `calls` to control whether the model may return one or multiple tool calls in a single assistant response (`:one`/`:many`), aka "parallel" tool calling

This release also:

- returns invalid kwargs and hallucinated/unavailable tool calls to the model as tool errors so it can recover and try again (instead of raising app exceptions)
- fixes streaming tool-call nil-argument handling and assistant tool-call messages with nil content, so tool-call transcripts stay valid across turns
```ruby
chat = RubyLLM.chat(model: "gpt-5-nano")
              .with_tools(WeatherTool, CalculatorTool, choice: :required, calls: :one)

response = chat.ask("Use tools to estimate commute time + cost")
puts response.content
```

Tool Choice (choice)
Use choice to control whether the model can call tools and which one it can call.
```ruby
# Model decides whether to call tools
chat.with_tools(WeatherTool, CalculatorTool, choice: :auto)

# Model must call one of the provided tools
chat.with_tools(WeatherTool, CalculatorTool, choice: :required)

# Disable tool calls
chat.with_tools(WeatherTool, CalculatorTool, choice: :none)

# Force one specific tool
chat.with_tools(WeatherTool, CalculatorTool, choice: :weather_tool)
```

Valid values:

- `:auto`
- `:required`
- `:none`
- a tool name symbol/string, or a `ToolClass`
"Parallel" Tool Calling control (calls)
Use calls to control how many tool calls the model may return in a single assistant response.
Providers usually call this parallel tool calling. We call it calls because "parallel" can be misleading: tools are not executed in parallel unless your tool executor itself is parallelized. calls describes response behavior directly.
```ruby
# provider/model default behavior
chat = RubyLLM.chat.with_tools(WeatherTool, CalendarTool)

# allow multiple tool calls in one assistant response
chat = RubyLLM.chat.with_tools(WeatherTool, CalendarTool, calls: :many)

# allow one tool call in one assistant response
chat = RubyLLM.chat.with_tools(WeatherTool, CalendarTool, calls: :one)

# equivalent:
chat = RubyLLM.chat.with_tools(WeatherTool, CalendarTool, calls: 1)
```

Valid values:

- `:many`
- `:one` (or `1`)
If calls is not provided, RubyLLM uses provider/model defaults, usually equivalent to calls: :many.
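To make the "parallel" distinction concrete: even with calls: :many, the model only returns several tool calls in one response; whether they execute concurrently is entirely up to your executor. A framework-independent sketch (the payload shape and tool names here are hypothetical, not RubyLLM's internal format):

```ruby
# Hypothetical tool-call payloads, as a model might return several in one response.
tool_calls = [
  { name: "weather",  args: { city: "Berlin" } },
  { name: "calendar", args: { day: "Monday" } }
]

# Sequential execution: what most hosts do, even with calls: :many.
results = tool_calls.map { |tc| "#{tc[:name]} -> #{tc[:args].values.join(',')}" }

# True parallelism only happens if the executor opts in, e.g. with threads.
threads = tool_calls.map do |tc|
  Thread.new { "#{tc[:name]} -> #{tc[:args].values.join(',')}" }
end
parallel_results = threads.map(&:value)
# Same results either way; only the execution strategy differs.
```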
Invalid tool kwargs now return explicit tool errors
```ruby
class SignatureTool < RubyLLM::Tool
  def execute(questions:)
    questions
  end
end

result = SignatureTool.new.call({ "questions" => [], "isOther" => true })
puts result
# => { error: "Invalid tool arguments: unknown keyword: isOther" }
```

Hallucinated tool calls are handled gracefully
```ruby
tool_results = []

chat = RubyLLM.chat.with_tool(WeatherTool)
              .on_tool_result { |result| tool_results << result }

# If the model tries to call a non-existent tool,
# RubyLLM reports a tool error and continues the conversation safely.
chat.ask("What tools do you support?")

p tool_results
# => [{ error: "Model tried to call unavailable tool `...`. Available tools: [\"weather\"]." }]
```

Structured Output: Expanded Coverage + Better Accuracy via Schema Names
Structured output support was expanded (including Bedrock + Anthropic), and a multi-turn structured-output regression was fixed.
```ruby
class PersonSchema < RubyLLM::Schema
  string :name
  integer :age
end

chat = RubyLLM.chat(model: "claude-haiku-4-5", provider: :bedrock)
response = chat.with_schema(PersonSchema).ask("Generate a user profile")
puts response.content
# => {"name"=>"...", "age"=>...}
```

Schema naming also got better: RubyLLM now passes more meaningful schema names to providers, which helps the model understand the expected output structure.
```ruby
# RubyLLM::Schema class names are now used as schema names automatically
class InvoiceSummarySchema < RubyLLM::Schema
  string :customer
  number :total
end

response = RubyLLM.chat.with_schema(InvoiceSummarySchema).ask("Summarize this invoice")
```

And manual schemas can now provide explicit names:
```ruby
invoice_schema = {
  name: "InvoiceSummarySchema",
  schema: {
    type: "object",
    properties: {
      customer: { type: "string" },
      total: { type: "number" }
    },
    required: ["customer", "total"],
    additionalProperties: false
  }
}

response = RubyLLM.chat.with_schema(invoice_schema).ask("Summarize this invoice")
```

Provider Configuration Flexibility
This release adds multiple endpoint/base URL and credential options so teams can use self-hosted gateways, private routing, enterprise proxies, and compatible hosted services without patching providers.
```ruby
RubyLLM.configure do |config|
  config.openrouter_api_base = ENV["OPENROUTER_API_BASE"]
  config.anthropic_api_base = ENV["ANTHROPIC_API_BASE"]
  config.deepseek_api_base = ENV["DEEPSEEK_API_BASE"]
  config.ollama_api_key = ENV["OLLAMA_API_KEY"]
end
```

Ollama API Key support
ollama_api_key support enables authenticated/remote Ollama endpoints (including Ollama Cloud-style setups) where auth headers are required.
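A minimal configuration sketch for an authenticated remote Ollama endpoint. The base URL is a placeholder, and ollama_api_base is shown on the assumption that it pairs with the new ollama_api_key option:

```ruby
RubyLLM.configure do |config|
  # Hypothetical remote endpoint; replace with your gateway or Ollama Cloud URL.
  config.ollama_api_base = "https://ollama.example.com/v1"
  # New in 1.13: sent as an auth header for endpoints that require it.
  config.ollama_api_key = ENV["OLLAMA_API_KEY"]
end
```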
Vertex AI: Service Account Key Support
Vertex AI auth support was improved to allow service account key usage without ADC regressions, plus scope handling fixes for GCE credentials.
```ruby
RubyLLM.configure do |config|
  config.vertexai_project_id = ENV["GOOGLE_CLOUD_PROJECT"]
  config.vertexai_location = ENV["GOOGLE_CLOUD_LOCATION"]
  config.vertexai_service_account_key = ENV["VERTEXAI_SERVICE_ACCOUNT_KEY"] # optional JSON key
end
```

Error Handling and Retries
Error/retry behavior has been tightened for context-length and transient server cases:
- automatic retries were effectively not working for most LLM calls because POST requests were not being retried
- POST retries are now enabled
- context-length detection on HTTP 400
- improved classification for context-length 429 responses
- improved 504 classification
- retries are enabled by default (`max_retries = 3`), so check your configuration to confirm this matches your desired behavior
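As an illustration of the behavior described above (not RubyLLM's actual implementation), a capped retry loop with exponential backoff and max_retries = 3 looks roughly like:

```ruby
# Sketch: retry a transient failure up to max_retries times with
# exponential backoff. TransientError stands in for retryable HTTP errors.
TransientError = Class.new(StandardError)

def with_retries(max_retries: 3, base_delay: 0)
  attempts = 0
  begin
    yield
  rescue TransientError
    attempts += 1
    raise if attempts > max_retries
    sleep(base_delay * (2**(attempts - 1))) if base_delay.positive?
    retry
  end
end

calls = 0
result = with_retries(max_retries: 3) do
  calls += 1
  raise TransientError, "503" if calls < 3 # fail twice, then succeed
  "ok"
end
# result == "ok" after three attempts
```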
```ruby
begin
  RubyLLM.chat.ask("...")
rescue RubyLLM::ContextLengthExceededError
  # trim messages / reduce response size / retry
end
```

Agent + Rails Lifecycle Fixes
Agent and Rails-backed chat behavior received important fixes:
- runtime agent instructions now persist correctly across `to_llm` rebuilds
- missing prompts now raise `RubyLLM::PromptNotFoundError`
- Rails install flow now separates schema migration from model data loading (v1.13+)
- Rails docs now include fiber-safe ActiveRecord isolation guidance for async/fiber-heavy workloads (`config.active_support.isolation_level = :fiber`)
- generator and migration naming fixes (including acronym model classes)
- chat UI streaming preserves whitespace chunks correctly
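For the fiber-isolation guidance, the setting goes in your Rails application config; a minimal sketch assuming a standard Rails app class (MyApp is a placeholder):

```ruby
# config/application.rb
module MyApp
  class Application < Rails::Application
    # Scope ActiveSupport/ActiveRecord state per fiber instead of per thread,
    # which keeps async/fiber-heavy agent workloads from sharing connections.
    config.active_support.isolation_level = :fiber
  end
end
```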
Rails setup now looks like:
```shell
rails generate ruby_llm:install
rails db:migrate
rails ruby_llm:load_models
```

Missing prompts raise a rescuable error:

```ruby
begin
  SupportAgent.new.ask("Help me with this request")
rescue RubyLLM::PromptNotFoundError => e
  puts e.message
end
```

Performance & DX Polishes
- lazy block-style debug logging to reduce allocations when debug logging is disabled
- configurable `log_regexp_timeout`
- rubocop/lint/test stability improvements
- model matrix/docs refreshes (including newer xAI model IDs and image-generation coverage updates)
- obsolete `codecov` gem removed
- docs and model listings refreshed
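The configurable log Regexp timeout would presumably be set like other options; the setter name and value below are assumptions based on the option name, so check the configuration docs before relying on them:

```ruby
RubyLLM.configure do |config|
  # Assumed setter: cap the time (in seconds) allowed for log Regexp matching.
  config.log_regexp_timeout = 0.5
end
```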
Installation
```ruby
gem "ruby_llm", "1.13.0"
```

Upgrading from 1.12.x

```shell
bundle update ruby_llm
```

Merged PRs
- Fix POST retries and 504 retry classification by @crmne in #624
- Fix streaming to preserve whitespace chunks in chat UI template by @kryzhovnik in #636
- Fix migration class name for model names with acronyms (e.g. model:AIModel) by @Saidbek in #640
- Remove dependency on obsolete 'codecov' gem by @mvz in #625
- Detect context length exceeded errors on HTTP 400 responses by @plehoux in #642
- Use UTC for created_at in order to prevent diff noise when running models:update from a different timezone by @radanskoric in #631
- Adds opentelemetry-instrumentation-ruby_llm to the ecosystem by @clarissalimab in #599
- Add configurable Anthropic API base URL by @ericproulx in #589
- Add ollama_api_key support for remote Ollama endpoints by @geeksilva97 in #612
- Add Anthropic structured output support by @hiasinho in #608
- Add Bedrock structured output support by @llenodo in #619
- Add thought signature support for Google Gemini OpenAI compatibility by @ericproulx in #588
- Data loss in cleanup_orphaned_tool_results with custom association by @bschmeck in #584
- Handle tool hallucination gracefully by @redox in #580
- Fix streaming tool call nil arguments by @afurm in #587
- Preserves assume_model_exists in to_llm for custom models by @creaumond in #564
- Fix paint not working with OpenRouter provider (#513) by @khasinski in #558
- Allow DeepSeek api base override by @flyerhzm in #575
- Add specs for v1.7/v1.9 upgrade generators by @afurm in #539
- chore: update GitHub Actions to latest major versions by @seuros in #534
- Fix structured output multi-turn conversation error by @alexey-hunter-io in #531
- Fix acts_as_tool_call message: option (#514) by @saurabh-sikchi in #515
- feat: Allow configuring OpenRouter API base via openrouter_api_base s… by @graysonchen in #381
- Add built-in support for tool control parameters by @nwumnn in #347
- Adding option for configuring custom log Regexp timeout by @nina-instrumentl in #364
- Fix Openrouter Error Parser to Handle Detailed Error Messages by @xiaohui-zhangxh in #431
- Add custom schema name support by @llenodo in #476
- Adding RubyLLM::RedCandle to the ecosystem page (documentation only) by @cpetersen in #535
- [docs] Add RubyLLM::Instrumentation and RubyLLM::Monitoring to ecosystem by @patriciomacadden in #569
- Fix VertexAI scope argument for GCE credentials by @jesse-spevack in #520
- VertexAI: service account key support without ADC regressions by @crmne in #646
New Contributors
- @kryzhovnik made their first contribution in #636
- @Saidbek made their first contribution in #640
- @plehoux made their first contribution in #642
- @radanskoric made their first contribution in #631
- @clarissalimab made their first contribution in #599
- @ericproulx made their first contribution in #589
- @geeksilva97 made their first contribution in #612
- @hiasinho made their first contribution in #608
- @llenodo made their first contribution in #619
- @bschmeck made their first contribution in #584
- @creaumond made their first contribution in #564
- @khasinski made their first contribution in #558
- @flyerhzm made their first contribution in #575
- @alexey-hunter-io made their first contribution in #531
- @saurabh-sikchi made their first contribution in #515
- @nwumnn made their first contribution in #347
- @nina-instrumentl made their first contribution in #364
- @xiaohui-zhangxh made their first contribution in #431
- @cpetersen made their first contribution in #535
- @patriciomacadden made their first contribution in #569
- @jesse-spevack made their first contribution in #520
Full Changelog: 1.12.1...1.13.0