We are excited to announce the release of MLflow 2.18.0! This release includes a number of significant features, enhancements, and bug fixes.
Python Version Update
Python 3.8 is now at an end-of-life point. With official support being dropped for this legacy version, MLflow now requires Python 3.9
as a minimum supported version.
Note: If you are currently using MLflow's
ChatModelinterface for authoring custom GenAI applications, please ensure that you
have read the future breaking changes section below.
Major New Features
-
🦺 Fluent API Thread/Process Safety - MLflow's fluent APIs for tracking and the model registry have been overhauled to add support for both thread and multi-process safety. You are now no longer forced to use the Client APIs for managing experiments, runs, and logging from within multiprocessing and threaded applications. (#13456, #13419, @WeichenXu123)
-
🧩 DSPy flavor - MLflow now supports logging, loading, and tracing of
DSPymodels, broadening the support for advanced GenAI authoring within MLflow. Check out the MLflow DSPy Flavor documentation to get started! (#13131, #13279, #13369, #13345, @chenmoneygithub, #13543, #13800, #13807, @B-Step62, #13289, @michael-berk) -
🖥️ Enhanced Trace UI - MLflow Tracing's UI has undergone a significant overhaul to bring usability and quality of life updates to the experience of auditing and investigating the contents of GenAI traces, from enhanced span content rendering using markdown to a standardized span component structure, (#13685, #13357, #13242, @daniellok-db)
-
🚄 New Tracing Integrations - MLflow Tracing now supports DSPy, LiteLLM, and Google Gemini, enabling a one-line, fully automated tracing experience. These integrations unlock enhanced observability across a broader range of industry tools. Stay tuned for upcoming integrations and updates! (#13801, @TomeHirata, #13585, @B-Step62)
-
📊 Expanded LLM-as-a-Judge Support - MLflow now enhances its evaluation capabilities with support for additional providers, including
Anthropic,Bedrock,Mistral, andTogetherAI, alongside existing providers likeOpenAI. Users can now also configure proxy endpoints or self-hosted LLMs that follow the provider API specs by using the newproxy_urlandextra_headersoptions. Visit the LLM-as-a-Judge documentation for more details! (#13715, #13717, @B-Step62) -
⏰ Environment Variable Detection - As a helpful reminder for when you are deploying models, MLflow now detects and reminds users of environment variables set during model logging, ensuring they are configured for deployment. In addition to this, the
mlflow.models.predictutility has also been updated to include these variables in serving simulations, improving pre-deployment validation. (#13584, @serena-ruan)
Breaking Changes to ChatModel Interface
-
ChatModel Interface Updates - As part of a broader unification effort within MLflow and services that rely on or deeply integrate
with MLflow's GenAI features, we are working on a phased approach to making a consistent and standard interface for custom GenAI
application development and usage. In the first phase (planned for release in the next few releases of MLflow), we are marking
several interfaces as deprecated, as they will be changing. These changes will be:- Renaming of Interfaces:
ChatRequest→ChatCompletionRequestto provide disambiguation for future planned request interfaces.ChatResponse→ChatCompletionResponsefor the same reason as the input interface.metadatafields withinChatRequestandChatResponse→custom_inputsandcustom_outputs, respectively.
- Streaming Updates:
predict_streamwill be updated to enable true streaming for custom GenAI applications. Currently, it returns a generator with synchronous outputs from predict. In a future release, it will return a generator ofChatCompletionChunks, enabling asynchronous streaming. While the API call structure will remain the same, the returned data payload will change significantly, aligning with LangChain’s implementation.
- Legacy Dataclass Deprecation:
- Dataclasses in
mlflow.models.rag_signatureswill be deprecated, merging into unifiedChatCompletionRequest,ChatCompletionResponse, andChatCompletionChunks.
- Dataclasses in
- Renaming of Interfaces:
Other Features:
- [Evaluate] Add Huggingface BLEU metrics to MLflow Evaluate (#12799, @nebrass)
- [Models / Databricks] Add support for
spark_udfwhen running on Databricks Serverless runtime, Databricks connect, and prebuilt python environments (#13276, #13496, @WeichenXu123) - [Scoring] Add a
model_configparameter forpyfunc.spark_udffor customization of batch inference payload submission (#13517, @WeichenXu123) - [Tracing] Standardize retriever span outputs to a list of MLflow
Documents (#13242, @daniellok-db) - [UI] Add support for visualizing and comparing nested parameters within the MLflow UI (#13012, @jescalada)
- [UI] Add support for comparing logged artifacts within the Compare Run page in the MLflow UI (#13145, @jescalada)
- [Databricks] Add support for
resourcesdefinitions forLangchainmodel logging (#13315, @sunishsheth2009) - [Databricks] Add support for defining multiple retrievers within
dependenciesfor Agent definitions (#13246, @sunishsheth2009)
Bug fixes:
- [Database] Cascade deletes to datasets when deleting experiments to fix a bug in MLflow's
gccommand when deleting experiments with logged datasets (#13741, @daniellok-db) - [Models] Fix a bug with
Langchain'spyfuncpredict input conversion (#13652, @serena-ruan) - [Models] Fix signature inference for subclasses and
Optionaldataclasses that define a model's signature (#13440, @bbqiu) - [Tracking] Fix an issue with async logging batch splitting validation rules (#13722, @WeichenXu123)
- [Tracking] Fix an issue with
LangChain's autologging thread-safety behavior (#13672, @B-Step62) - [Tracking] Disable support for running spark autologging in a threadpool due to limitations in Spark (#13599, @WeichenXu123)
- [Tracking] Mark
roleandindexas required for chat schema (#13279, @chenmoneygithub) - [Tracing] Handle raw response in openai autolog (#13802, @harupy)
- [Tracing] Fix a bug with tracing source run behavior when running inference with multithreading on
Langchainmodels (#13610, @WeichenXu123)
Documentation updates:
- [Docs] Add docstring warnings for upcoming changes to ChatModel (#13730, @stevenchen-db)
- [Docs] Add a contributor's guide for implementing tracing integrations (#13333, @B-Step62)
- [Docs] Add guidance in the use of
model_configwhen logging models as code (#13631, @sunishsheth2009) - [Docs] Add documentation for the use of custom library artifacts with the
code_pathsmodel logging feature (#13702, @TomeHirata) - [Docs] Improve
SparkMLlog_modeldocumentation with guidance on how return probabilities from classification models (#13684, @WeichenXu123)
Small bug fixes and documentation updates:
#13775, #13768, #13764, #13744, #13699, #13742, #13703, #13669, #13682, #13569, #13563, #13562, #13539, #13537, #13533, #13408, #13295, @serena-ruan; #13768, #13764, #13761, #13738, #13737, #13735, #13734, #13723, #13726, #13662, #13692, #13689, #13688, #13680, #13674, #13666, #13661, #13625, #13460, #13626, #13546, #13621, #13623, #13603, #13617, #13614, #13606, #13600, #13583, #13601, #13602, #13604, #13598, #13596, #13597, #13531, #13594, #13589, #13581, #13112, #13587, #13582, #13579, #13578, #13545, #13572, #13571, #13564, #13559, #13565, #13558, #13541, #13560, #13556, #13534, #13386, #13532, #13385, #13384, #13383, #13507, #13523, #13518, #13492, #13493, #13487, #13490, #13488, #13449, #13471, #13417, #13445, #13430, #13448, #13443, #13429, #13418, #13412, #13382, #13402, #13381, #13364, #13356, #13309, #13313, #13334, #13331, #13273, #13322, #13319, #13308, #13302, #13268, #13298, #13296, @harupy; #13705, @williamjamir; #13632, @shichengzhou-db; #13755, #13712, #13260, @BenWilson2; #13745, #13743, #13697, #13548, #13549, #13577, #13349, #13351, #13350, #13342, #13341, @WeichenXu123; #13807, #13798, #13787, #13786, #13762, #13749, #13733, #13678, #13721, #13611, #13528, #13444, #13450, #13360, #13416, #13415, #13336, #13305, #13271, @B-Step62; #13808, #13708, @smurching; #13739, @fedorkobak; #13728, #13719, #13695, #13677, @TomeHirata; #13776, #13736, #13649, #13285, #13292, #13282, #13283, #13267, @daniellok-db; #13711, @bhavya2109sharma; #13693, #13658, @aravind-segu; #13553, @dsuhinin; #13663, @gitlijian; #13657, #13629, @parag-shendye; #13630, @JohannesJungbluth; #13613, @itepifanio; #13480, @agjendem; #13627, @ilyaresh; #13592, #13410, #13358, #13233, @nojaf; #13660, #13505, @sunishsheth2009; #13414, @lmoros-DB; #13399, @Abubakar17; #13390, @KekmaTime; #13291, @michael-berk; #12511, @jgiannuzzi; #13265, @Ahar28; #13785, @Rick-McCoy; #13676, @hyolim-e; #13718, @annzhang-db; #13705, @williamjamir