RubyLLM 1.6.4: Multimodal Tools & Better Schemas 🖼️
Maintenance release bringing multimodal tool responses, improved rake tasks, and important fixes for Gemini schema conversion. Plus better documentation and developer experience!
🖼️ Tools Can Now Return Files and Images
Tools can now return rich content with attachments, not just text! Perfect for screenshot tools, document generators, and visual analyzers:
class ScreenshotTool < RubyLLM::Tool
  description "Takes a screenshot and returns it"
  param :url, desc: "URL to screenshot"

  def execute(url:)
    screenshot_path = capture_screenshot(url) # Your screenshot logic

    # Return a Content object with text and attachments
    RubyLLM::Content.new(
      "Screenshot of #{url} captured successfully",
      [screenshot_path] # Can be file path, StringIO, or ActiveStorage blob
    )
  end
end

# The LLM can now see and analyze the screenshot
chat = RubyLLM.chat.with_tool(ScreenshotTool)
response = chat.ask("Take a screenshot of ruby-lang.org and describe what you see")
This opens up powerful workflows:
- Visual debugging: Screenshot tools that capture and analyze UI states
- Document generation: Tools that create PDFs and return them for review
- Data visualization: Generate charts and have the LLM interpret them
- Multi-step workflows: Chain tools that produce and consume visual content
Works with all providers that support multimodal content.
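As a rough illustration of the data visualization workflow, here is a minimal sketch of a tool that writes a chart to disk and returns it as an attachment. `render_bar_chart` and `ChartTool` are hypothetical names, not part of RubyLLM. The tool class is only defined when the gem is loaded, and it reuses the `RubyLLM::Content.new(text, [path])` form from the screenshot example. Whether a given provider accepts SVG attachments is an assumption to verify; swap in a PNG renderer if it does not.

```ruby
require "tempfile"

# Hypothetical helper (not part of RubyLLM): renders numbers as a minimal
# SVG bar chart, writes it to a temp file, and returns the file path.
def render_bar_chart(values, bar_width: 20, height: 100)
  max = values.max.to_f
  bars = values.each_with_index.map do |value, index|
    bar_height = max.zero? ? 0 : (value / max * height).round
    %(<rect x="#{index * bar_width}" y="#{height - bar_height}" ) +
      %(width="#{bar_width - 2}" height="#{bar_height}" />)
  end
  svg = %(<svg xmlns="http://www.w3.org/2000/svg" ) +
        %(width="#{values.size * bar_width}" height="#{height}">#{bars.join}</svg>)

  # Tempfile.create returns a regular File that is not deleted automatically,
  # so the path stays valid after this method returns.
  file = Tempfile.create(["chart", ".svg"])
  file.write(svg)
  file.close
  file.path
end

# Only define the tool when RubyLLM is loaded, so the helper can be tried alone.
if defined?(RubyLLM::Tool)
  class ChartTool < RubyLLM::Tool
    description "Renders numbers as a bar chart and returns the image"
    param :values, desc: "Comma-separated numbers, e.g. 3,7,5"

    def execute(values:)
      numbers = values.split(",").map(&:to_f)
      path = render_bar_chart(numbers)
      RubyLLM::Content.new("Bar chart of #{numbers.size} values", [path])
    end
  end
end
```

Keeping the rendering logic in a plain method makes it easy to test without a chat session; the tool itself stays a thin wrapper.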
🔧 Fixed: Gemini Schema Conversion
Gemini's structured output conversion dropped some schema fields and converted integer schemas to number. The conversion logic now correctly handles both cases:
# Preserve description
schema = {
  type: 'object',
  description: 'An object',
  properties: {
    example: {
      type: "string",
      description: "a brief description about the person's time at the conference"
    }
  },
  required: ['example']
}

# Define schema with both number and integer types
schema = {
  type: 'object',
  properties: {
    number1: {
      type: 'number',
    },
    number2: {
      type: 'integer',
    }
  }
}
Also added tests covering simple and complex schemas, nested objects and arrays, all constraint attributes, nullable fields, descriptions, and property ordering for objects.
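To make the two fixes concrete, here is a minimal sketch of a recursive converter that keeps `description` and maps `integer` and `number` to distinct types. This is an illustration only, not RubyLLM's actual conversion code: `to_gemini_schema` and `GEMINI_TYPES` are hypothetical names, and the uppercase type names assume the `Type` enum used by Gemini's REST API.

```ruby
# Illustrative only: NOT RubyLLM's implementation.
# Assumed mapping from JSON Schema type names to Gemini-style type names.
GEMINI_TYPES = {
  "string" => "STRING",
  "number" => "NUMBER",
  "integer" => "INTEGER", # kept distinct from NUMBER
  "boolean" => "BOOLEAN",
  "array" => "ARRAY",
  "object" => "OBJECT"
}.freeze

# Hypothetical converter: walks the schema recursively and carries over
# description, required, properties, and items.
# Raises KeyError for type names outside the mapping above.
def to_gemini_schema(schema)
  schema = schema.transform_keys(&:to_sym)
  result = { type: GEMINI_TYPES.fetch(schema[:type].to_s) }
  result[:description] = schema[:description] if schema[:description]
  result[:required] = schema[:required].map(&:to_s) if schema[:required]
  if schema[:properties]
    result[:properties] = schema[:properties].transform_values do |property|
      to_gemini_schema(property)
    end
  end
  result[:items] = to_gemini_schema(schema[:items]) if schema[:items]
  result
end

converted = to_gemini_schema(
  {
    type: 'object',
    description: 'An object',
    properties: {
      number1: { type: 'number' },
      number2: { type: 'integer' }
    }
  }
)
converted[:description]                 # => "An object"
converted[:properties][:number2][:type] # => "INTEGER"
```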
Thanks to @BrianBorge for reporting and working on the initial PR.
🛠️ Developer Experience: Improved Rake Tasks
Consolidated Model Management
All model-related tasks are now streamlined and better organized:
# Default task now runs overcommit hooks + model updates
bundle exec rake
# Update models, generate docs, and create aliases in one command
bundle exec rake models
# Individual tasks still available
bundle exec rake models:update # Fetch latest models from providers
bundle exec rake models:docs # Generate model documentation
bundle exec rake models:aliases # Generate model aliases
The tasks have been refactored from 3 separate files into a single, well-organized models.rake file following Rails conventions.
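For readers unfamiliar with the pattern, an aggregate task over a namespace can be sketched like this. The task names mirror the commands listed above, but the bodies are placeholders and the layout is an assumption about how such a file might be organized, not a copy of the project's models.rake.

```ruby
require "rake"

extend Rake::DSL # Not needed inside a real Rakefile, where the DSL is already available

namespace :models do
  desc "Fetch latest models from providers"
  task(:update) { puts "updating models" } # placeholder body

  desc "Generate model documentation"
  task(:docs) { puts "generating docs" } # placeholder body

  desc "Generate model aliases"
  task(:aliases) { puts "generating aliases" } # placeholder body
end

# `rake models` has no body of its own; it runs the three subtasks
# in order as prerequisites.
desc "Update models, generate docs, and create aliases"
task models: %w[models:update models:docs models:aliases]
```

Declaring the subtasks as prerequisites keeps each one runnable on its own while giving a single entry point for the common case.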
Release Preparation
New comprehensive release preparation task:
# Prepare for release: refresh cassettes, run hooks, update models
bundle exec rake release:prepare
This task:
- Automatically refreshes stale VCR cassettes (>1 day old)
- Runs overcommit hooks for code quality
- Updates models, docs, and aliases
- Ensures everything is ready for a clean release
Cassette Management
# Verify cassettes are fresh
bundle exec rake release:verify_cassettes
# Refresh stale cassettes automatically
bundle exec rake release:refresh_stale_cassettes
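The notes do not say how staleness is detected. One plausible approach, sketched below under that assumption, compares each cassette file's modification time against the one-day threshold. `stale_cassettes` is a hypothetical helper, and the `.yml` extension assumes VCR's default YAML serializer.

```ruby
# Hypothetical helper: returns cassette files whose last modification is
# older than max_age seconds (default: one day), sorted by path.
def stale_cassettes(dir, max_age: 24 * 60 * 60, now: Time.now)
  Dir.glob(File.join(dir, "**", "*.yml")).select do |path|
    now - File.mtime(path) > max_age
  end.sort
end

# Example usage; the directory is an assumed location, adjust to your project.
stale_cassettes("spec/fixtures/vcr_cassettes").each do |path|
  puts "Stale cassette: #{path}"
end
```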
📚 Documentation Updates
- Redirect fix: /installation now properly redirects to /getting-started
- Badge refresh: README badges updated to bust GitHub's cache
- Async pattern fix: Corrected supervisor pattern example in agentic workflows guide to avoid "Cannot wait on own fiber!" errors
🧹 Additional Updates
- Appraisal gemfiles updated: All Rails version test matrices refreshed
- Test coverage: New specs for multimodal tool responses
- Provider compatibility: Verified with latest API versions
Installation
gem 'ruby_llm', '1.6.4'
Full backward compatibility maintained. The multimodal tool support is opt-in - existing tools continue working as before.
Full Changelog: 1.6.3...1.6.4