NVIDIA-NeMo/Guardrails v0.21.0


What's Changed

This release introduces IORails, a new optimized Input/Output rail engine that supports parallel execution of the NemoGuard rails (content-safety, topic-safety, and jailbreak detection) with logging and unique request IDs. A new check_async method in LLMRails enables standalone input/output rail validation without requiring a full conversation flow. The guardrails server is now fully OpenAI-compatible (including a new v1/models endpoint), and a new GuardrailsMiddleware enables seamless integration with LangChain agents. New community integrations include PolicyAI for content moderation, CrowdStrike AIDR, and regex-based detection rails. Embedding index initialization is now lazy, improving startup performance. Streaming internals have been cleaned up, and the documentation has received a major revamp.

🚀 Features

  • (library) Update Trend Micro Vision One AI Guard official endpoint (#1546)
  • (llmrails) Add check_async method for input/output rails validation (#1605)
  • (server) Make guardrails server OpenAI compatible (#1340)
  • (integration) Add GuardrailsMiddleware for LangChain agent (#1606)
  • (library) Update Fiddler Guardrails API to match new specification (#1619)
  • (library) Add CrowdStrike AIDR community integration (#1601)
  • (iorails) Introduce IORails, an optimized Input/Output rail engine supporting non-streaming parallel execution of the NemoGuard input/output rails (content-safety, topic-safety, jailbreak detection) (#1638, #1649, #1654, #1656, #1658, #1660, #1661, #1674)
  • (server) Add OpenAI compatible v1/models endpoint (#1637)
  • (benchmark) Add Locust stress-test (#1629)
  • (jailbreak) Validate Jailbreak Detection config at create-time (#1675)
  • (library) Add PolicyAI Integration for Content Moderation (#1576)
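
The parallel-execution idea behind IORails can be sketched with asyncio. The rail functions below are hypothetical keyword-based stand-ins for the model-backed NemoGuard rails named above, not the library's API:

```python
import asyncio

# Hypothetical stand-ins for the NemoGuard rails; the real checks call
# model-backed services rather than matching keywords.
async def content_safety(text: str) -> bool:
    await asyncio.sleep(0.01)  # simulate model latency
    return "attack" not in text.lower()

async def topic_safety(text: str) -> bool:
    await asyncio.sleep(0.01)
    return True

async def jailbreak_detection(text: str) -> bool:
    await asyncio.sleep(0.01)
    return "ignore previous instructions" not in text.lower()

async def run_input_rails(text: str) -> bool:
    # Run all input rails concurrently and block if any rail fails, so
    # total latency is roughly max(rail latencies) instead of their sum.
    results = await asyncio.gather(
        content_safety(text),
        topic_safety(text),
        jailbreak_detection(text),
    )
    return all(results)

print(asyncio.run(run_input_rails("What is the capital of France?")))  # True
```

The same fan-out applies to output rails; the engine additionally tags each run with a unique request ID for logging, which this sketch omits.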

🐛 Bug Fixes

  • (server) Make openai an optional server-only dependency (#1623)
  • (actions) Rename generate_next_step to generate_next_steps for task-specific LLM support (#1603)
  • (library) Add valid alias to action results in GuardrailsAI integration (#1578) (#1611)
  • (llm) Filter stop parameter for OpenAI reasoning models (#1653)
  • (logging) Show cache hits in Stats log and fix duplicate metadata restore (#1666)
  • (cache) Make cache stats log visible in verbose mode (#1667)
  • (library) Use bot refuse to respond in gliner PII detection flows (#1671)
  • (streaming) Handle None stop tokens in streaming handler (#1685)
  • (streaming) Handle dict chunks in RollingBuffer.format_chunks (#1687)
  • (middleware) Handle MODIFIED status in GuardrailsMiddleware instead of silently dropping it (#1714)
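
The two streaming fixes (#1685, #1687) are instances of a common defensive pattern when a stream can yield plain strings, structured chunks, or None. A minimal sketch, assuming a hypothetical chunk shape (this is not the actual RollingBuffer code):

```python
def format_chunks(chunks):
    """Join streaming chunks that may be plain strings or dicts.

    Hypothetical chunk shape: dict chunks carry their text under "content".
    """
    parts = []
    for chunk in chunks:
        if isinstance(chunk, dict):
            parts.append(str(chunk.get("content", "")))
        elif chunk is not None:  # also tolerate None, cf. the stop-token fix
            parts.append(chunk)
    return "".join(parts)

print(format_chunks(["Hel", {"content": "lo"}, None, "!"]))  # Hello!
```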

🚜 Refactor

  • (streaming) Remove LangChain callback dependencies from StreamingHandler (#1547)
  • (streaming) Remove ChatNVIDIA streaming patch (#1607)
  • (streaming) [breaking] Remove stream_usage and fix streaming metadata capture (#1624)

⚡ Performance

  • (actions) Lazy initialization of embedding indexes (#1572)
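
Lazy initialization moves the expensive embedding-index build from construction time to first use, so processes that never query the index pay nothing at startup. A minimal sketch of the pattern (illustrative only, not the NeMo Guardrails implementation):

```python
class EmbeddingIndex:
    def __init__(self, documents):
        self._documents = documents
        self._index = None  # nothing expensive happens at startup

    def _ensure_index(self):
        # Build the index on first access only.
        if self._index is None:
            # Stand-in for the real work (embedding documents and
            # constructing a nearest-neighbor index).
            self._index = {doc: i for i, doc in enumerate(self._documents)}
        return self._index

    def search(self, query):
        return self._ensure_index().get(query)

idx = EmbeddingIndex(["hello", "world"])
assert idx._index is None        # startup stays cheap
assert idx.search("world") == 1  # index is built lazily here
```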

⚙️ Miscellaneous Tasks

  • Update Pangea User-Agent repo URL (#1595) (#1610)
  • (jailbreak) Update dependencies for jailbreak detection Docker container (#1596)
  • Remove multi_kb example (#1673)
  • (iorails) Increase work queue concurrency and depth (#1674)
  • (docs) Remove AI Virtual Assistant Blueprint notebook (#1682)
  • Update dependencies ahead of v0.21 release (#1617)

New Contributors

Full Changelog: v0.20.0...v0.21.0
