BerriAI/litellm v1.40.27 on GitHub

✨ Thrilled to launch support for @NVIDIA NIM LLM API on @LiteLLM 1.40.27 👉 Start here: https://docs.litellm.ai/docs/providers/nvidia_nim

🔥 Proxy 100+ LLMS & set budgets

🛠️ [Fix] - use n in mock completion response on litellm mock responses

⚡️ [Feat] add endpoint to debug memory utilization

🔑 enterprise - allow verifying license in air gapped vpc

What's Changed

[Fix-Improve] Improve Ollama prompt input and fix Ollama function calling key error and fix Ollama function calling can only join an iterable error by @CorrM in #4373
Fix Groq Prices by @kiriloman in #4401
[Feat] add endpoint to debug memory util by @ishaan-jaff in #4364
[Feat-New Provider] Add Nvidia NIM by @ishaan-jaff in #4403
[Fix] - use n in mock completion responses by @ishaan-jaff in #4405
enterprise - allow verifying license in air gapped vpc by @ishaan-jaff in #4409
Create litellm user to fix issue with prisma in k8s by @lolsborn in #4402
[Enterprise] Add secret detection pre call hook by @ishaan-jaff in #4410
Revert "Create litellm user to fix issue with prisma in k8s " by @krrishdholakia in #4412
fix(router.py): set cooldown_time: per model by @krrishdholakia in #4411

Full Changelog: v1.40.26...v1.40.27

docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.40.27

Name	Status	Median Response Time (ms)	Average Response Time (ms)	Requests/s	Failures/s	Request Count	Failure Count	Min Response Time (ms)	Max Response Time (ms)
/chat/completions	Passed ✅	130.0	156.61068343517005	6.372506185089714	0.0	1905	0	109.52021800000011	1799.9076889999515
Aggregated	Passed ✅	130.0	156.61068343517005	6.372506185089714	0.0	1905	0	109.52021800000011	1799.9076889999515