NVIDIA-NeMo/Guardrails v0.22.0 on GitHub

What's changed

The three major features in this release are:

Anonymous usage reporting: Basic usage reporting is documented with clear privacy boundaries and opt-out controls. The telemetry reference explains what fields are collected, what data is excluded, how local audit files work, and how to opt out with NEMO_GUARDRAILS_NO_USAGE_STATS=1, DO_NOT_TRACK=1, or the ~/.config/nemoguardrails/do_not_track file.
LangChain decoupling: Many customers asked us to reduce our dependence on LangChain, so LangChain is now optional. For OpenAI-compatible LLMs, we ship a built-in client over httpx that offers better direct support. For other providers, you can bring LangChain back by setting NEMOGUARDRAILS_LLM_FRAMEWORK=langchain and installing the matching provider package.
IORails milestone 2: This is the second phase of our new, more efficient orchestration engine. It brings us closer to feature parity with LLMRails by adding streaming, OpenTelemetry, and reasoning-model support. We also added speculative generation in non-streaming mode, which runs input rails in parallel with the main application LLM's response generation. This hides the latency of input rails while still guaranteeing response safety.
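
The speculative generation idea described above can be illustrated with a minimal asyncio sketch. This is not the IORails implementation: `check_input_rails`, `generate_response`, `speculative_generate`, and `REFUSAL` are hypothetical stand-ins with simulated latencies, chosen only to show the general pattern of starting generation immediately and discarding its result if the input rails reject the message.

```python
import asyncio

# Hypothetical refusal text; not taken from NeMo Guardrails.
REFUSAL = "I'm sorry, I can't respond to that."


async def check_input_rails(user_message: str) -> bool:
    """Hypothetical input rail: returns True when the message is allowed."""
    await asyncio.sleep(0.05)  # simulated rail latency
    return "forbidden" not in user_message.lower()


async def generate_response(user_message: str) -> str:
    """Hypothetical main LLM call."""
    await asyncio.sleep(0.10)  # simulated generation latency
    return f"Echo: {user_message}"


async def speculative_generate(user_message: str) -> str:
    # Start generation right away instead of waiting for the rails to finish.
    generation = asyncio.create_task(generate_response(user_message))
    try:
        allowed = await check_input_rails(user_message)
    except Exception:
        # If the rail itself fails, do not leave the generation task running.
        generation.cancel()
        raise

    if not allowed:
        # Discard the speculative result so it never reaches the caller.
        generation.cancel()
        await asyncio.gather(generation, return_exceptions=True)
        return REFUSAL

    # The rails passed, so the response that was already in flight is returned.
    return await generation
```

In this sketch, `asyncio.run(speculative_generate("hello"))` returns `"Echo: hello"` after roughly the generation latency alone rather than the sum of both latencies, while a message containing "forbidden" returns the refusal text and the speculative response is dropped.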

For an exhaustive list of new features and bug fixes, please see the release notes.
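
The usage-reporting opt-out controls named in the first feature above could be applied from Python roughly as follows. This is a sketch under stated assumptions, not the documented procedure: `opt_out_of_usage_reporting` is a hypothetical helper, it assumes the environment variables are set before the library is used, and it assumes an empty marker file is sufficient. Any one of the three controls is described as an opt-out, and the telemetry reference remains the authority on exact behavior.

```python
import os
from pathlib import Path
from typing import Optional


def opt_out_of_usage_reporting(home: Optional[Path] = None) -> Path:
    """Hypothetical helper that applies all three opt-out controls at once.

    `home` defaults to the current user's home directory and exists only so
    the marker-file location can be redirected (for example, in tests).
    """
    # Environment-variable opt-outs named in the release notes.
    # Assumption: they should be set before nemoguardrails is used.
    os.environ["NEMO_GUARDRAILS_NO_USAGE_STATS"] = "1"
    os.environ["DO_NOT_TRACK"] = "1"

    # Marker-file opt-out at ~/.config/nemoguardrails/do_not_track.
    # Assumption: the file's presence is what matters, so it is left empty.
    marker = (home or Path.home()) / ".config" / "nemoguardrails" / "do_not_track"
    marker.parent.mkdir(parents=True, exist_ok=True)
    marker.touch(exist_ok=True)
    return marker
```

Setting the same variables in the shell or in a deployment manifest has the same effect as the two `os.environ` assignments and avoids any dependence on import order.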

🚀 Features

(iorails) IORails support for streaming output rails (#1765, #1766)
(iorails) IORails OpenTelemetry tracing support (#1793, #1794, #1798)
(iorails) IORails OpenTelemetry token-level metrics support (#1812, #1846)
(iorails) IORails reasoning model support (#1842, #1843)
(llm) Add LangChain adapter and framework registry (#1759)
(llm) Add streaming tool call accumulation and LLMResponse parity (#1789)
(llm) Add default framework with OpenAI-compatible client (#1797)
(llm/frameworks) Validate framework on registration (#1863)
(types) Add framework-agnostic LLM type system (#1745)
(compat) Transitional compat layer to migrate from 0.21 to 0.22+ (#1841)
(testing) Add public testing surface under nemoguardrails.testing (#1860)
(api) Canonical top-level imports for LLM types and registry functions (#1882)
(config) Forbid extra fields in GLiNER rails configs (#1898)
(framework) Support Azure as a first-class default framework preset (#1896)

🐛 Bug Fixes

[breaking] Reject Colang 2.0 public runtime state (#1885)
(server) Prioritize env var API key over forwarded client header (#1688)
(utils) Remove extra space from UtteranceBotActionScriptUpdated (#1708)
(actions) Remove redundant embedding search in generate_user_intent (#1754)
(llmrails) Backfill embedding model params into search provider config (fixes stale KB cache) (#1753)
(embeddings) Persist in-memory embedding cache instance across calls (#1755)
(ci) Pin baseline x86-64 compiler target to prevent SIGILL on cached venvs (#1785)
(tests) Use asyncio.run instead of get_event_loop in middleware tests (#1804)
(actions) Guard bot message extraction against composite specs (#1810)
(llm) Drop stop param for OpenAI reasoning models (#1811)
(llmrails) Scope no main LLM warning to generation path (#1813)
(chat-ui) Replace Chatbot UI with Chainlit (#1734)
(library) Unblock reasoning models in self-check and content-safety actions (#1816)
(llm) Drop temperature/stop and rename max_tokens for OpenAI reasoning models (#1837)
(build) Remove chat-ui references from wheel build script (#1835)
(ci) Override annoy's -march=native to actually enforce baseline x86-64 (#1839)
(llm/clients) Retry on stale event loop binding (#1840)
(llm/frameworks) Point users to LangChain when DefaultFramework has no base_url (#1865)
(taskmanager) Preserve multimodal list content in vision safety prompts (#1815)
(actions) Extract text from multimodal events in colang history (#1636)
(iorails) Route to LLMRails if Guardrails inits with provided LLM (#1844)
(iorails) Strip trailing /v1 from base_url to avoid doubled path (#1862)
(iorails) Add all LLMRails methods, llm and runtime getters to Guardrails facade (#1886, #1889)
(iorails) Annotate cancelled OTEL spans with error (#1897)

🚜 Refactor

(llm) [breaking] Atomic switch to LLMModel protocol (#1760)
(iorails) [breaking] Move AsyncWorkQueue from Guardrails to IORails (#1817)
(deps) [breaking] Demote LangChain and LangChain-providers from core to dev (#1806)
(iorails) Return LLMResponse(Chunk) from ModelEngine (#1827)
(iorails) Refactor ModelManager (#1778)
(iorails) Refactor Guardrails and IORails for top-level import and clean separation (#1893)
(iorails) Refactor RailsManager and Nemoguard Actions (#1762)
(llm) Rename generate/stream to generate_async/stream_async (#1769)
(llm) Remove LangChain imports from core modules (#1770)
(llm) Move LangChain implementations into integrations/langchain/ (#1772)
(llm) Framework-owned provider registry (#1773)
(llm) Share OpenAI reasoning-model classifier across adapters (#1836)
(llm) Reorganize llm package into clients/models/frameworks (#1801)
(llm/clients) Return HTTPResponse(body, headers, status_code) from _apost (#1830)
(llm/default_framework) Split reset() into aclose() + clear_providers() (#1829)
(tests) Framework-agnostic test infrastructure (#1790)
(deps) Make server-only dependencies optional (#1689)
(jailbreak) Use onnx instead of pickle to load model (#1715)
(logging) Remove LangChain LoggingCallbackHandler dependency (#1616)

📚 Documentation

Document release notes for 0.21 and additional details (#1726)
(middleware) Fix incorrect example query and expected output in agent-middleware guide (#1784)
(iorails) OTEL Logging page (#1807)
Fix jira 407 (#1809)
Update README (#1820)
Mark LangChain integration as opt-in in 0.22 entry-point docs (#1856)
Documentation for langchain decoupling (#1854)
(configure-rails) Align with 0.22 DefaultFramework / LangChain split (#1855)
(custom-initialization) Add customLLM and customFramework guides (#1857)
(examples) Align example configs and deployment docs with 0.22 DefaultFramework / LangChain split (#1858)
(iorails) IORails OTEL Metrics (#1864)
(iorails) Speculative Generation (#1876)
(migration) Explain the "No default base_url" config-load error (#1881)
(iorails) Use Guardrails entry-point not IORails (#1892)
Prometheus client install instructions (#1894)
(guardrails) Document max_tokens fallback and reasoning model guidance (#1833)
(colang-1) Fix Hello World tutorial issues (NGUARD-666) (#1834)
(telemetry) Document anonymous usage reporting (#1891)
Add prompts.yml to code snippets (#1904)
Update Benchmark README with updated configs (#1905)

🧪 Testing

(llm) Probe OpenAI API to validate _is_openai_reasoning_model (#1814)
(llm) Expand reasoning-model param probe + regenerate baseline (#1838)
(telemetry) Add smoke driver and fixtures (#1879)

⚙️ Miscellaneous Tasks

Include scripts in docker image (#1902)