RubyLLM 1.4.0: Structured Output, Custom Parameters, and Rails Generators 🚀

We're shipping 1.4.0! This release brings structured output that actually works, direct access to provider-specific parameters, and Rails generators that produce idiomatic Rails code.

🎯 Structured Output with JSON Schemas

The days of wrestling LLMs into returning valid JSON are over. We've added with_schema, which makes structured output as simple as defining what you want:

# Define your schema with the RubyLLM::Schema DSL
# First: gem install ruby_llm-schema
require 'ruby_llm/schema'

class PersonSchema < RubyLLM::Schema
  string :name
  integer :age
  array :skills, of: :string
end

# Get perfectly structured JSON every time
chat = RubyLLM.chat.with_schema(PersonSchema)
response = chat.ask("Generate a Ruby developer profile")

# => {"name" => "Yukihiro", "age" => 59, "skills" => ["Ruby", "C", "Language Design"]}

No more prompt engineering gymnastics. Just schemas and results. Use the RubyLLM::Schema gem for the cleanest DSL, or provide raw JSON schemas if you prefer.
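
Prefer not to pull in another gem? with_schema also takes a raw JSON Schema. Here's a rough, hand-written sketch of the same profile schema as a plain hash:

person_schema = {
  type: "object",
  properties: {
    name: { type: "string" },
    age: { type: "integer" },
    skills: { type: "array", items: { type: "string" } }
  },
  required: ["name", "age", "skills"]
}

chat = RubyLLM.chat.with_schema(person_schema)
chat.ask("Generate a Ruby developer profile")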

🛠️ Direct Provider Access with with_params

Need to use that one provider-specific parameter? with_params gives you direct access:

# OpenAI's JSON mode
chat.with_params(response_format: { type: "json_object" })
    .ask("List Ruby features as JSON")

No more workarounds. Direct access to any parameter your provider supports.
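
As another sketch, here's OpenAI's seed parameter for more repeatable sampling; with_params merges whatever you pass into the request payload, so any documented provider option should work the same way:

# Sketch: seed is an OpenAI sampling parameter; with_params merges it
# into the underlying request body as-is.
chat.with_params(seed: 1234)
    .ask("Pick three Ruby gems at random")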

🚄 Rails Generator That Produces Idiomatic Rails Code

From rails new to chatting with LLMs in under 5 minutes:

rails generate ruby_llm:install

This creates:

  • Migrations with proper Rails conventions
  • Models with acts_as_chat, acts_as_message, and acts_as_tool_call
  • A readable initializer with sensible defaults (sketched just below)
  • Zero boilerplate, maximum convention
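
The initializer is the familiar RubyLLM.configure block; the exact generated contents may differ slightly, but it looks roughly like this:

# config/initializers/ruby_llm.rb (sketch)
RubyLLM.configure do |config|
  config.openai_api_key = ENV["OPENAI_API_KEY"]
  config.anthropic_api_key = ENV["ANTHROPIC_API_KEY"]
end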

Your Chat model works exactly as you'd expect:

chat = Chat.create!(model: "gpt-4")
response = chat.ask("Explain Ruby blocks")
# Messages are automatically persisted with proper associations
# Tool calls are tracked, tokens are counted
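
The generated models themselves stay thin; the acts_as_ helpers supply the associations and persistence. Roughly (exact generator output may differ):

class Chat < ApplicationRecord
  acts_as_chat
end

class Message < ApplicationRecord
  acts_as_message
end

class ToolCall < ApplicationRecord
  acts_as_tool_call
end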

🔍 Tool Call Transparency

New on_tool_call callback lets you observe and log tool usage:

chat.on_tool_call do |tool_call|
  puts "🔧 AI is calling: #{tool_call.name}"
  puts "   Arguments: #{tool_call.arguments}"
  
  # Perfect for debugging and auditing
  Rails.logger.info "[AI Tool] #{tool_call.name}: #{tool_call.arguments}"
end

chat.ask("What's the weather in Tokyo?").with_tools([weather_tool])
# => 🔧 AI is calling: get_weather
#    Arguments: {"location": "Tokyo"}
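
The weather_tool above is an ordinary RubyLLM tool; here's a minimal sketch of what it might look like (class name, parameter, and return value are illustrative):

class GetWeather < RubyLLM::Tool
  description "Looks up the current weather for a city"
  param :location, desc: "City name, e.g. Tokyo"

  def execute(location:)
    # Call your weather service of choice here
    { location: location, temperature_c: 22, conditions: "Clear" }
  end
end

weather_tool = GetWeather.new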

🔌 Raw Response Access

Access the underlying Faraday response for debugging or advanced use cases:

response = chat.ask("Hello!")

# Access headers, status, timing
puts response.raw.headers["x-request-id"]
puts response.raw.status
puts response.raw.env.duration

🏭 GPUStack Support

Run models on your own hardware with GPUStack:

RubyLLM.configure do |config|
  config.gpustack_api_base = 'http://localhost:8080/v1'
  config.gpustack_api_key = 'your-key'
end

chat = RubyLLM.chat(model: 'qwen3', provider: 'gpustack')

🐛 Important Bug Fixes

  • Anthropic multiple tool calls now properly handled (was only processing the first tool)
  • Anthropic system prompts fixed to use plain text instead of JSON serialization
  • Message ordering in streaming responses is rock solid
  • Embedding arrays return consistent formats for single and multiple strings (illustrated after this list)
  • URL attachments work properly without argument errors
  • Streaming errors handled correctly in both Faraday V1 and V2
  • JRuby officially supported and tested
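
To illustrate the embedding fix, the shape of the result now follows the shape of the input. A rough sketch:

# A single string gives you a single vector...
RubyLLM.embed("Ruby is elegant").vectors
# => [0.018, -0.027, ...]

# ...while an array, even with one element, gives an array of vectors
RubyLLM.embed(["Ruby is elegant"]).vectors
# => [[0.018, -0.027, ...]]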

🎁 Enhanced Rails Integration

  • Message ordering guidance to prevent race conditions
  • Provider-specific configuration examples
  • Custom model name support with acts_as_ helpers
  • Improved generator output

Context isolation works seamlessly without global config pollution:

# Each request gets its own isolated configuration
tenant_context = RubyLLM.context do |config|
  config.openai_api_key = tenant.api_key
end

tenant_context.chat.ask("Process this tenant's request")
# Global configuration remains untouched

📚 Quality of Life Improvements

  • Removed 60MB of test fixture data
  • OpenAI base URL configuration in bin/console
  • Better error messages for invalid models
  • Enhanced Ollama documentation
  • More code examples throughout

Installation

gem 'ruby_llm', '1.4.0'
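
Or, outside of a Gemfile:

gem install ruby_llm -v 1.4.0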

Full backward compatibility is maintained: your existing code keeps working, and the new features are there when you need them.

Merged PRs

  • Add OpenAI base URL config to bin/console by @infinityrobot in #283
  • Reject models from Parsera that does not have :provider or :id by @K4sku in #271
  • Fix embedding return format inconsistency for single-string arrays by @finbarr in #267
  • Fix compatibility issue with URL attachments wrong number of arguments by @DustinFisher in #250
  • Add JRuby to CI test job by @headius in #255
  • Add provider specifying example to rails guide by @tpaulshippy in #233
  • More details for configuring Ollama by @jslag in #252
  • Remove 60 MB of the letter 'a' from spec/fixtures/vcr_cassettes by @compumike in #287
  • docs: add guide for using custom model names with acts_as helpers by @matheuscumpian in #171
  • Add RubyLLM::Chat#with_params to add custom parameters to the underlying API payload by @compumike in #265
  • Support gpustack by @graysonchen in #142
  • Update CONTRIBUTING.md by @graysonchen in #289
  • Fix handling of multiple tool calls in single LLM response by @finbarr in #241
  • Rails Generator for RubyLLM Models by @kieranklaassen in #75
  • Anthropic: Fix system prompt (use plain text instead of serialized JSON) by @MichaelHoste in #302
  • Provide access to raw response object from Faraday by @tpaulshippy in #304
  • Add Chat#on_tool_call callback by @bryan-ash in #299
  • Added proper handling of streaming error responses across both Faraday V1 and V2 by @dansingerman in #273
  • Add message ordering guidance to Rails docs by @crmne in #288

Full Changelog: 1.3.1...1.4.0
