MLflow 1.0 includes many significant features and improvements. From this version, MLflow is no longer beta, and all APIs except those marked as experimental are intended to be stable until the next major version. As such, this release includes a number of breaking changes.
Major features, improvements, and breaking changes
-
Support for recording, querying, and visualizing metrics along a new “step” axis (x coordinate), providing increased flexibility for examining model performance relative to training progress. For example, you can now record performance metrics as a function of the number of training iterations or epochs. MLflow 1.0’s enhanced metrics UI enables you to visualize the change in a metric’s value as a function of its step, augmenting MLflow’s existing UI for plotting a metric’s value as a function of wall-clock time. (#1202, #1237, @dbczumar; #1132, #1142, #1143, @smurching; #1211, #1225, @Zangr; #1372, @stbof)
-
Search improvements. MLflow 1.0 includes additional support in both the API and UI for searching runs within a single experiment or a group of experiments. The search filter API supports a simplified version of the SQL WHERE clause. In addition to searching using a run's metrics and params, the API has been enhanced to support a subset of run attributes as well as user and system tags. For details, see Search syntax and examples for programmatically searching runs. (#1245, #1272, #1323, #1326, @mparkhe; #1052, @Zangr; #1363, @aarondav)
-
Logging metrics in batches. MLflow 1.0 now has a runs/log-batch REST API endpoint for logging multiple metrics, params, and tags in a single API request. The endpoint is useful for performant logging of multiple metrics at the end of a model training epoch (see example), or logging of many input model parameters at the start of training. You can call this batched-logging endpoint from Python (mlflow.log_metrics, mlflow.log_params, mlflow.set_tags), R (mlflow_log_batch), and Java (MlflowClient.logBatch). (#1214, @dbczumar; see 0.9.1 and 0.9.0 for other changes)
-
Windows support for MLflow Tracking. The Tracking portion of the MLflow client is now supported on Windows. (#1171, @eedeleon, @tomasatdatabricks)
-
HDFS support for artifacts. A Hadoop artifact repository with Kerberos authorization support was added, so you can use HDFS to log and retrieve models and other artifacts. (#1011, @jaroslawk)
-
CLI command to build Docker images for serving. Added an mlflow models build-docker CLI command for building a Docker image capable of serving an MLflow model. The model is served at port 8080 within the container by default. Note that this API is experimental and does not guarantee that the arguments or the format of the Docker container will remain the same. (#1329, @smurching, @tomasatdatabricks)
-
New onnx model flavor for saving, loading, and evaluating ONNX models with MLflow. ONNX flavor APIs are available in the mlflow.onnx module. (#1127, @avflor, @dbczumar; #1388, @dbczumar)
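As a rough illustration of the step axis and batched logging described in the list above, the sketch below logs one dictionary of metrics per epoch and uses the epoch index as the step. The helper log_epoch_metrics, the stand-in fake_log_metrics, and the metric values are hypothetical. The only MLflow assumption is that mlflow.log_metrics accepts a step keyword, which is worth checking against the API documentation for your version.

```python
def log_epoch_metrics(log_metrics, history):
    """Log one batch of metrics per epoch, using the epoch index as the step.

    `log_metrics` is any callable shaped like log_metrics(metrics, step=...);
    `history` is a list of {metric_name: value} dicts, one per epoch.
    """
    for epoch, metrics in enumerate(history):
        log_metrics(metrics, step=epoch)
    return len(history)


# Stand-in logger (hypothetical) so the sketch runs without a tracking server.
recorded = []


def fake_log_metrics(metrics, step=None):
    recorded.append((step, dict(metrics)))


# Made-up training history, for illustration only.
history = [
    {"loss": 0.9, "accuracy": 0.61},
    {"loss": 0.5, "accuracy": 0.78},
    {"loss": 0.3, "accuracy": 0.86},
]
epochs_logged = log_epoch_metrics(fake_log_metrics, history)
```

With MLflow installed, passing mlflow.log_metrics in place of the stand-in, inside an active run, should record the same values against the step axis.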
Major breaking changes:
-
Some of the breaking changes involve database schema changes in the SQLAlchemy tracking store. If your database instance's schema is not up-to-date, MLflow will issue an error at the start-up of mlflow server or mlflow ui. To migrate an existing database to the newest schema, you can use the mlflow db upgrade CLI command. (#1155, #1371, @smurching; #1360, @aarondav)
-
[Installation] The MLflow Python package no longer depends on scikit-learn, mleap, or boto3. If you want to use the scikit-learn support, the MLeap support, or the s3 artifact repository / sagemaker support, you will have to install these respective dependencies explicitly. (#1223, @aarondav)
-
[Artifacts] In the Models API, an artifact's location is now represented as a URI. See the documentation for the list of accepted URIs. (#1190, #1254, @dbczumar; #1174, @dbczumar, @sueann; #1206, @tomasatdatabricks; #1253, @stbof)
-
The affected methods are:
- Python: <model-type>.load_model, azureml.build_image, sagemaker.deploy, sagemaker.run_local, pyfunc._load_model_env, pyfunc.load_pyfunc, and pyfunc.spark_udf
- R: mlflow_load_model, mlflow_rfunc_predict, mlflow_rfunc_serve
- CLI: mlflow models serve, mlflow models predict, mlflow sagemaker, mlflow azureml (with the new --model-uri option)
-
To allow referring to artifacts in the context of a run, MLflow introduces a new URI scheme of the form runs:/<run_id>/relative/path/to/artifact. (#1169, #1175, @sueann)
-
[CLI] mlflow pyfunc and mlflow rfunc commands have been unified as mlflow models (#1257, @tomasatdatabricks; #1321, @dbczumar)
-
[CLI] mlflow artifacts download, mlflow artifacts download-from-uri and mlflow download commands have been consolidated into mlflow artifacts download (#1233, @sueann)
-
[Runs] Expose RunData fields (metrics, params, tags) as dictionaries. Note that the mlflow.entities.RunData constructor still accepts lists of metric/param/tag entities. (#1078, @smurching)
-
[Runs] Rename run_uuid to run_id in Python, Java, and REST API. Where necessary, MLflow will continue to accept run_uuid until MLflow 1.1. (#1187, @aarondav)
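The runs:/ URI form introduced above can be illustrated with plain string handling. This is a minimal sketch, not MLflow's own parsing code: make_runs_uri and split_runs_uri are hypothetical helpers, and the run ID is made up.

```python
RUNS_PREFIX = "runs:/"


def make_runs_uri(run_id, artifact_path):
    """Build a runs:/ URI for an artifact path relative to the run's root."""
    return "{}{}/{}".format(RUNS_PREFIX, run_id, artifact_path.lstrip("/"))


def split_runs_uri(uri):
    """Split a runs:/ URI into (run_id, relative_artifact_path)."""
    if not uri.startswith(RUNS_PREFIX):
        raise ValueError("not a runs:/ URI: {!r}".format(uri))
    run_id, _, artifact_path = uri[len(RUNS_PREFIX):].partition("/")
    if not run_id:
        raise ValueError("missing run ID in {!r}".format(uri))
    return run_id, artifact_path


# Hypothetical run ID, for illustration only.
uri = make_runs_uri("0123abcd", "model")
parts = split_runs_uri("runs:/0123abcd/relative/path/to/artifact")
```

A URI built this way is the kind of value the model-loading methods listed above are described as accepting.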
Other breaking changes
CLI:
- The --file-store option is deprecated in mlflow server and mlflow ui commands. (#1196, @smurching)
- The --host and --gunicorn-opts options are removed in the mlflow ui command. (#1267, @aarondav)
- Arguments to mlflow experiments subcommands, notably --experiment-name and --experiment-id, are now options (#1235, @sueann)
- mlflow sagemaker list-flavors has been removed (#1233, @sueann)
Tracking:
- The user property of Runs has been moved to tags (similarly, the run_name, source_type, source_name properties were moved to tags in 0.9.0). (#1230, @acroz; #1275, #1276, @aarondav)
- In R, the return values of experiment CRUD APIs have been updated to more closely match the REST API. In particular, mlflow_create_experiment now returns a string experiment ID instead of an experiment, and the other APIs return NULL. (#1246, @smurching)
- RunInfo.status's type is now string. (#1264, @mparkhe)
- Remove deprecated RunInfo properties from start_run. (#1220, @aarondav)
- As deprecated in 0.9.1 and before, the RunInfo fields run_name, source_name, source_version, source_type, and entry_point_name and the SearchRuns field anded_expressions have been removed from the REST API and Python, Java, and R tracking client APIs. They are still available as tags, documented in the REST API documentation. (#1188, @aarondav)
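Since the removed fields are still available as tags, code that used to read them can look them up in a run's tag dictionary instead. The sketch below assumes the system tag keys shown in the mapping; they are written from the MLflow system tags as the editor understands them and should be verified against the REST API documentation. The helper legacy_fields_from_tags and the example tags are hypothetical.

```python
# Assumed system tag keys; verify against the REST API documentation.
LEGACY_FIELD_TO_TAG = {
    "user": "mlflow.user",
    "run_name": "mlflow.runName",
    "source_name": "mlflow.source.name",
    "source_type": "mlflow.source.type",
    "source_version": "mlflow.source.git.commit",
    "entry_point_name": "mlflow.project.entryPoint",
}


def legacy_fields_from_tags(tags):
    """Return the removed RunInfo-style fields present in a run's tag dict."""
    return {
        field: tags[tag_key]
        for field, tag_key in LEGACY_FIELD_TO_TAG.items()
        if tag_key in tags
    }


# Made-up tags for illustration.
example_tags = {
    "mlflow.user": "alice",
    "mlflow.source.name": "train.py",
    "team": "forecasting",  # ordinary user tag, ignored by the helper
}
fields = legacy_fields_from_tags(example_tags)
```

For a real run, the tag dictionary would come from the run's data, which now exposes tags as a dictionary.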
Models and deployment:
-
In Python, require arguments as keywords in log_model, save_model and add_to_model methods in the tensorflow and mleap modules to avoid breaking changes in the future (#1226, @sueann)
-
Remove the unsupported jars argument from spark.log_model in Python (#1222, @sueann)
-
Introduce pyfunc.load_model to be consistent with other Models modules. pyfunc.load_pyfunc will be deprecated in the near future. (#1222, @sueann)
-
Rename dst_path parameter in pyfunc.save_model to path (#1221, @aarondav)
-
R flavors refactor (#1299, @kevinykuo)
- mlflow_predict() has been added in favor of mlflow_predict_model() and mlflow_predict_flavor() which have been removed.
- mlflow_save_model() is now a generic and mlflow_save_flavor() is no longer needed and has been removed.
- mlflow_predict() takes ... to pass to underlying predict methods.
- mlflow_load_flavor() now has the signature function(flavor, model_path) and flavor authors should implement mlflow_load_flavor.mlflow_flavor_{FLAVORNAME}. The flavor argument is inferred from the inputs of user-facing mlflow_load_model() and does not need to be explicitly provided by the user.
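For the pyfunc.load_model item above, code that must run against both older and newer releases could pick whichever loader is present. This is only a sketch of that idea: resolve_pyfunc_loader is a hypothetical helper, and the two stand-in namespaces imitate a module with and without load_model so the example runs without MLflow installed.

```python
from types import SimpleNamespace


def resolve_pyfunc_loader(pyfunc_module):
    """Prefer load_model, falling back to load_pyfunc on modules that lack it."""
    loader = getattr(pyfunc_module, "load_model", None)
    if loader is None:
        loader = pyfunc_module.load_pyfunc
    return loader


# Stand-in modules (hypothetical); each loader just reports which one ran.
new_style = SimpleNamespace(
    load_model=lambda uri: ("load_model", uri),
    load_pyfunc=lambda uri: ("load_pyfunc", uri),
)
old_style = SimpleNamespace(load_pyfunc=lambda uri: ("load_pyfunc", uri))

new_result = resolve_pyfunc_loader(new_style)("runs:/0123abcd/model")
old_result = resolve_pyfunc_loader(old_style)("runs:/0123abcd/model")
```

In real use the argument would be the mlflow.pyfunc module itself. Once load_pyfunc is deprecated, calling load_model directly is the simpler choice.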
Projects:
- Remove and rename some projects.run parameters for generality and consistency. (#1222, @sueann)
- In R, the mlflow_run API for running MLflow projects has been modified to more closely reflect the Python mlflow.run API. In particular, the order of the uri and entry_point arguments has been reversed and the param_list argument has been renamed to parameters. (#1265, @smurching)
R:
- Remove mlflow_snapshot and mlflow_restore_snapshot APIs. Also, the r_dependencies argument used to specify the path to a packrat r-dependencies.txt file has been removed from all APIs. (#1263, @smurching)
- The mlflow_cli and crate APIs are now private. (#1246, @smurching)
Environment variables:
-
Prefix environment variables with "MLFLOW_" (#1268, @aarondav). Affected variables are:
- [Tracking] _MLFLOW_SERVER_FILE_STORE, _MLFLOW_SERVER_ARTIFACT_ROOT, _MLFLOW_STATIC_PREFIX
- [SageMaker] MLFLOW_SAGEMAKER_DEPLOY_IMG_URL, MLFLOW_DEPLOYMENT_FLAVOR_NAME
- [Scoring] MLFLOW_SCORING_SERVER_MIN_THREADS, MLFLOW_SCORING_SERVER_MAX_THREADS
More features and improvements
- [Tracking] Non-default driver support for SQLAlchemy backends: db+driver is now a valid tracking backend URI scheme (#1297, @drewmcdonald; #1374, @mparkhe)
- [Tracking] Validate backend store URI before starting tracking server (#1218, @luke-zhu, @sueann)
- [Tracking] Add GetMetricHistory client API in Python and Java corresponding to the REST API. (#1178, @smurching)
- [Tracking] Add view_type argument to MlflowClient.list_experiments() in Python. (#1212, @smurching)
- [Tracking] Dictionary values provided to mlflow.log_params and mlflow.set_tags in Python can now be non-string types (e.g., numbers), and they are automatically converted to strings. (#1364, @aarondav)
- [Tracking] R API additions to be at parity with REST API and Python (#1122, @kevinykuo)
- [Tracking] Limit number of results returned from SearchRuns API and UI for faster load (#1125, @mparkhe; #1154, @andrewmchen)
- [Artifacts] To avoid having many copies of large model files in serving, ArtifactRepository.download_artifacts no longer copies local artifacts (#1307, @andrewmchen; #1383, @dbczumar)
- [Artifacts][Projects] Support GCS in download utilities. gs://bucket/path files are now supported by the mlflow artifacts download CLI command and as parameters of type path in MLProject files. (#1168, @drewmcdonald)
- [Models] All Python models exported by MLflow now declare mlflow as a dependency by default. In addition, we introduce a flag --install-mlflow users can pass to mlflow models serve and mlflow models predict methods to force installation of the latest version of MLflow into the model's environment. (#1308, @tomasatdatabricks)
- [Models] Update model flavors to lazily import dependencies in Python. Modules that define Model flavors now import extra dependencies such as tensorflow, scikit-learn, and pytorch inside individual methods, ensuring that these modules can be imported and explored even if the dependencies have not been installed on your system. Also, the DEFAULT_CONDA_ENVIRONMENT module variable has been replaced with a get_default_conda_env() function for each flavor.
- [Models] It is now possible to pass extra arguments to mlflow.keras.load_model that will be passed through to keras.load_model. (#1330, @yorickvP)
- [Serving] For better performance, switch to gunicorn for serving Python models. This does not change the user interface. (#1322, @tomasatdatabricks)
- [Deployment] For SageMaker, use the uniquely-generated model name as the S3 bucket prefix instead of requiring one. (#1183, @dbczumar)
- [REST API] Add support for API paths without the preview component. The preview paths will be deprecated in a future version of MLflow. (#1236, @mparkhe)
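The db+driver scheme from the first item in this list can be inspected with the standard library alone. The sketch below is not how MLflow or SQLAlchemy parses a backend URI; split_backend_scheme is a hypothetical helper, and the URIs and credentials are made up.

```python
from urllib.parse import urlparse


def split_backend_scheme(uri):
    """Split a backend store URI's scheme into (dialect, driver or None)."""
    scheme = urlparse(uri).scheme
    dialect, _, driver = scheme.partition("+")
    return dialect, driver or None


# Made-up URIs for illustration.
with_driver = split_backend_scheme("mysql+pymysql://user:secret@localhost:3306/mlflow")
without_driver = split_backend_scheme("sqlite:///mlflow.db")
```

Here with_driver names both a dialect and a non-default driver, while without_driver leaves the driver unset so the dialect's default applies.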
Bug fixes and documentation updates
- [Tracking] Log metric timestamps in milliseconds by default (#1177, @smurching; #1333, @dbczumar)
- [Tracking] Fix bug when deserializing integer experiment ID for runs in SQLAlchemyStore (#1167, @smurching)
- [Tracking] Ensure unique constraint names in MLflow tracking database (#1292, @smurching)
- [Tracking] Fix base64 encoding for basic auth in R tracking client (#1126, @freefrag)
- [Tracking] Correctly handle file: URIs for the --backend-store-uri option in mlflow server and mlflow ui CLI commands (#1171, @eedeleon, @tomasatdatabricks)
- [Artifacts] Update artifact repository download methods to return absolute paths (#1179, @dbczumar)
- [Artifacts] Make FileStore respect the default artifact location (#1332, @dbczumar)
- [Artifacts] Fix log_artifact failures due to existing directory on FTP server (#1327, @kafendt)
- [Artifacts] Fix GCS artifact logging of subdirectories (#1285, @jason-huling)
- [Projects] Fix bug not sharing SQLite database file with Docker container (#1347, @tomasatdatabricks; #1375, @aarondav)
- [Java] Mark sendPost and sendGet as experimental (#1186, @aarondav)
- [Python][CLI] Mark azureml.build_image as experimental (#1222, #1233, @sueann)
- [Docs] Document public MLflow environment variables (#1343, @aarondav)
- [Docs] Document MLflow system tags for runs (#1342, @aarondav)
- [Docs] Autogenerate CLI documentation to include subcommands and descriptions (#1231, @sueann)
- [Docs] Update run selection description in mlflow_get_run in R documentation (#1258, @dbczumar)
- [Examples] Update examples to reflect API changes (#1361, @tomasatdatabricks; #1367, @mparkhe)
Small bug fixes and doc updates (#1359, #1350, #1331, #1301, #1270, #1271, #1180, #1144, #1135, #1131, #1358, #1369, #1368, #1387, @aarondav; #1373, @akarloff; #1287, #1344, #1309, @stbof; #1312, @hchiuzhuo; #1348, #1349, #1294, #1227, #1384, @tomasatdatabricks; #1345, @withsmilo; #1316, @ancasarb; #1313, #1310, #1305, #1289, #1256, #1124, #1097, #1162, #1163, #1137, #1351, @smurching; #1319, #1244, #1224, #1195, #1194, #1328, @dbczumar; #1213, #1200, @Kublai-Jing; #1304, #1320, @andrewmchen; #1311, @Zangr; #1306, #1293, #1147, @mateiz; #1303, @gliptak; #1261, #1192, @eedeleon; #1273, #1259, @kevinykuo; #1277, #1247, #1243, #1182, #1376, @mparkhe; #1210, @vgod-dbx; #1199, @ashtuchkin; #1176, #1138, #1365, @sueann; #1157, @cclauss; #1156, @clemens-db; #1152, @pogil; #1146, @srowen; #875, #1251, @jimthompson5802)