Release 2.5.0
Major Features and Improvements
- TPU embedding support
  - Added profile_data_directory to EmbeddingConfigSpec in _tpu_estimator_embedding.py. This allows embedding lookup statistics gathered at runtime to be used in embedding layer partitioning decisions.
- tf.keras.metrics.AUC now supports logit predictions.
- Creating tf.random.Generator under tf.distribute.Strategy scopes is now allowed (except for tf.distribute.experimental.CentralStorageStrategy and tf.distribute.experimental.ParameterServerStrategy). Different replicas will get different random-number streams.
- tf.data:
  - tf.data service now supports strict round-robin reads, which is useful for synchronous training workloads where example sizes vary. With strict round-robin reads, users can guarantee that consumers get similar-sized examples in the same step.
  - tf.data service now supports optional compression. Previously data would always be compressed, but now you can disable compression by passing compression=None to tf.data.experimental.service.distribute(...).
  - tf.data.Dataset.batch() now supports num_parallel_calls and deterministic arguments. num_parallel_calls is used to indicate that multiple input batches should be computed in parallel. With num_parallel_calls set, deterministic is used to indicate that outputs can be obtained in a non-deterministic order.
  - Options returned by tf.data.Dataset.options() are no longer mutable.
  - tf.data input pipelines can now be executed in debug mode, which disables any asynchrony, parallelism, or non-determinism and forces Python execution (as opposed to trace-compiled graph execution) of user-defined functions passed into transformations such as map. The debug mode can be enabled through tf.data.experimental.enable_debug_mode().
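As a minimal sketch of the new batch() arguments described above, the following asks for batches to be computed in parallel while keeping their order deterministic. The dataset contents and the batch size are illustrative choices, not taken from these notes.

```python
import tensorflow as tf

# Illustrative input: the integers 0..9.
dataset = tf.data.Dataset.range(10)

# num_parallel_calls lets several batches be assembled concurrently;
# deterministic=True keeps the output order identical to the input order.
batched = dataset.batch(
    4,
    num_parallel_calls=tf.data.AUTOTUNE,
    deterministic=True,
)

batches = [batch.numpy().tolist() for batch in batched]
print(batches)
```

Passing deterministic=False instead allows batches to be produced in whatever order they become ready, which may help throughput when ordering does not matter.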
- tf.lite
  - Enabled the new MLIR-based quantization backend by default.
    - The new backend is used for 8-bit full-integer post-training quantization.
    - The new backend removes the redundant rescales and fixes some bugs (shared weight/bias, extremely small scales, etc.).
    - Set experimental_new_quantizer in tf.lite.TFLiteConverter to False to disable this change.
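A hedged sketch of opting out of the new quantization backend. The double function and its input signature are hypothetical placeholders standing in for a real model; only the experimental_new_quantizer attribute comes from the notes above, and no conversion is actually run here.

```python
import tensorflow as tf


# Hypothetical toy function used only to obtain a converter instance.
@tf.function(input_signature=[tf.TensorSpec([1, 4], tf.float32)])
def double(x):
    return x * 2.0


converter = tf.lite.TFLiteConverter.from_concrete_functions(
    [double.get_concrete_function()]
)

# Switch back to the previous quantization backend for this converter.
converter.experimental_new_quantizer = False
```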
- tf.keras
  - Enabled a new supported input type in Model.fit, tf.keras.utils.experimental.DatasetCreator, which takes a callable, dataset_fn. DatasetCreator is intended to work across all tf.distribute strategies, and is the only input type supported for Parameter Server strategy.
- tf.distribute
  - tf.distribute.experimental.ParameterServerStrategy now supports training with Keras Model.fit when used with DatasetCreator.
- PluggableDevice
  - Third-party devices can now connect to TensorFlow modularly through StreamExecutor C API and PluggableDevice interface.
    - Add custom ops and kernels through kernel and op registration C API.
    - Register custom graph optimization passes with graph optimization C API.
- oneAPI Deep Neural Network Library (oneDNN) CPU performance optimizations from Intel-optimized TensorFlow are now available in the official x86-64 Linux and Windows builds.
  - They are off by default. Enable them by setting the environment variable TF_ENABLE_ONEDNN_OPTS=1.
  - We do not recommend using them in GPU systems, as they have not been sufficiently tested with GPUs yet.
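One way to set the flag from Python rather than from the shell is sketched below. It assumes the variable has to be in the environment before TensorFlow is imported, which is the conservative ordering; the notes above only state the variable name and value.

```python
import os

# Set the flag first so it is already present when TensorFlow loads
# (assumption: the variable is read during library initialization).
os.environ["TF_ENABLE_ONEDNN_OPTS"] = "1"

import tensorflow as tf  # noqa: E402  (deliberately imported after the flag)

onednn_flag = os.environ.get("TF_ENABLE_ONEDNN_OPTS")
print("TF_ENABLE_ONEDNN_OPTS =", onednn_flag)
```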
- TensorFlow pip packages are now built with CUDA 11.2 and cuDNN 8.1.0.
Breaking Changes
- The TF_CPP_MIN_VLOG_LEVEL environment variable has been renamed to TF_CPP_MAX_VLOG_LEVEL, which correctly describes its effect.
Bug Fixes and Other Changes
- tf.keras:
  - Preprocessing layers API consistency changes:
    - StringLookup added output_mode, sparse, and pad_to_max_tokens arguments with the same semantics as TextVectorization.
    - IntegerLookup added output_mode, sparse, and pad_to_max_tokens arguments with the same semantics as TextVectorization. Renamed max_values, oov_value and mask_value to max_tokens, oov_token and mask_token to align with StringLookup and TextVectorization.
    - TextVectorization default for pad_to_max_tokens switched to False.
    - CategoryEncoding no longer supports adapt; IntegerLookup now supports equivalent functionality. max_tokens argument renamed to num_tokens.
    - Discretization added num_bins argument for learning bin boundaries through calling adapt on a dataset. Renamed bins argument to bin_boundaries for specifying bins without adapt.
  - Improvements to model saving/loading:
    - model.load_weights now accepts paths to saved models.
  - Keras inputs can now be created directly from arbitrary tf.TypeSpecs.
  - Two new learning rate schedules added: tf.keras.optimizers.schedules.CosineDecay and tf.keras.optimizers.schedules.CosineDecayRestarts.
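To illustrate the CosineDecay schedule mentioned in the last item, here is a small sketch. The initial learning rate, the number of decay steps, and the choice of SGD are illustrative assumptions rather than recommendations from these notes.

```python
import tensorflow as tf

# Illustrative values: start at 0.1 and decay to 0 over 100 steps.
schedule = tf.keras.optimizers.schedules.CosineDecay(
    initial_learning_rate=0.1,
    decay_steps=100,
)

# A schedule is callable with the current step and returns the learning rate.
lr_start = float(schedule(0))
lr_mid = float(schedule(50))
lr_end = float(schedule(100))

# The schedule can be passed wherever an optimizer accepts a learning rate.
optimizer = tf.keras.optimizers.SGD(learning_rate=schedule)
```

With the default alpha of 0, the rate follows half a cosine period: it starts at the initial value, passes through half of it at the midpoint, and reaches zero at decay_steps.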
- tf.data:
  - Exposing tf.data.experimental.ExternalStatePolicy, which can be used to control how external state should be handled during dataset serialization or iterator checkpointing.
  - Changing tf.data.experimental.save to store the type specification of the dataset elements. This avoids the need for explicitly specifying the element_spec argument of tf.data.experimental.load when loading the previously saved dataset.
  - Add .element_spec property to tf.data.DatasetSpec to access the inner spec. This can be used to extract the structure of nested datasets.
  - Add tf.data.experimental.AutoShardingPolicy.HINT which can be used to provide hints to tf.distribute-based auto-sharding as to where in the input pipeline to insert sharding transformations.
  - Make tf.data.Options persistent across tf.function and GraphDef boundaries.
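The save/load change above can be sketched as follows. The dataset contents and the temporary directory are illustrative; the point is only that load is called without an element_spec argument. The experimental function names are the ones used in this release.

```python
import tempfile

import tensorflow as tf

# Illustrative dataset with a dict structure.
original = tf.data.Dataset.from_tensor_slices(
    {"x": [1, 2, 3], "y": [1.0, 2.0, 3.0]}
)

path = tempfile.mkdtemp()
tf.data.experimental.save(original, path)

# No element_spec is passed: the type specification is read back from disk.
restored = tf.data.experimental.load(path)

restored_x = sorted(int(item["x"]) for item in restored)
```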
- XLA compilation:
  - tf.function(experimental_compile=True) has become a stable API, renamed tf.function(jit_compile=True).
  - XLA can now compile MirroredStrategy: the step function passed to strategy.run can now be annotated with jit_compile=True.
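A minimal sketch of the renamed argument, assuming an XLA-enabled build. The scaled_sum function is a hypothetical example and not part of TensorFlow.

```python
import tensorflow as tf


# jit_compile=True asks TensorFlow to compile the whole function with XLA.
@tf.function(jit_compile=True)
def scaled_sum(x, y):
    return tf.reduce_sum(x * 2.0 + y)


result = scaled_sum(
    tf.constant([1.0, 2.0, 3.0]),
    tf.constant([1.0, 1.0, 1.0]),
)
print(float(result))
```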
- tf.distribute:
  - Rename experimental_prefetch_to_device in tf.distribute.InputOptions to experimental_fetch_to_device to better reflect the purpose.
- tf.lite:
  - class tflite::Subgraph:
    - Removed the tensors() method and the non-const overload of the nodes_and_registration() method, both of which were previously documented as temporary and to be removed.
      - Uses of tensors() can be replaced by calling the existing methods tensors_size() and tensor(int).
      - Uses of the non-const overload of nodes_and_registration can be replaced by calling the existing methods nodes_size() and context(), and then calling the GetNodeAndRegistration method in the TfLiteContext returned by context().
  - NNAPI
    - Removed deprecated Interpreter::UseNNAPI(bool) C++ API.
      - Use NnApiDelegate() and related delegate configuration methods directly.
    - Replaced the algorithm that computes the model cache key with one guaranteed to be stable across runs.
  - 16 bits quantization
    - Added int16x8 support for ABS, REDUCE_MAX and REDUCE_MIN operators.
    - Additional tests and fixes for ADD and SUB operators.
  - Added support for saved model's session initializer through TFLiteConverter.from_saved_model.
  - Added DEPTH_TO_SPACE support in post-training quantization.
  - Added dynamic range quantization support for the BatchMatMul op.
    - Both symmetric and asymmetric quantized input tensors are supported.
  - Add RFFT2D as builtin op. (RFFT2D also supports RFFTD.) Currently only supports float32 input.
  - Add 5D support to SLICE op.
  - TFLite supports SignatureDef:
    - TFLiteConverter exports models with SignatureDef.
    - Interpreter supports getting a list of signatures and getting a callable function for a given SignatureDef.
  - Add int8 support for ReshapeV2.
  - Add experimental support for optimization with sparsity.
  - Add nominal support for unsigned 32-bit integer tensor types. Note that very few TFLite kernels support this type natively, so its use in mobile ML authoring is generally discouraged.
  - Add support for static hash tables through TFLiteConverter.from_saved_model.
  - The Python TF Lite Interpreter bindings now have an option experimental_preserve_all_tensors to aid in debugging conversion.
  - Quantized x86 execution defaults to Ruy GEMM library for platforms with AVX support.
  - Deprecate tf.compat.v1.lite.experimental.get_potentially_supported_ops. Use tf.lite.TFLiteConverter directly to check whether a model is convertible.
  - Add support to select one of three different built-in op resolvers to be used in Python Interpreter API.
  - Enabled post training with calibrations for models that require user-provided TensorFlow Lite custom op libraries via converter.target_spec._experimental_custom_op_registerers.
- TF Core:
  - Corrected higher-order gradients of control flow constructs (tf.cond, tf.while_loop, and compositions like tf.foldl) computed with tf.GradientTape inside a tf.function.
  - Changed the default step size in gradient_checker_v2.compute_gradients to be exactly representable as a binary floating point number. This avoids polluting gradient approximations needlessly, which in some cases leads to false negatives in op gradient tests.
  - Added tf.config.experimental.get_memory_info, returning a dict with the current and peak memory usage. Deprecated tf.config.experimental.get_memory_usage in favor of this new function.
  - Extended tf.config.experimental.enable_tensor_float_32_execution to control Tensor-Float-32 evaluation in RNNs.
  - Added an 'experimental_payloads' field to tf.errors.OpError and its subclasses to support more detailed error reporting. This is inspired by Abseil Status payloads: https://github.com/abseil/abseil-cpp/blob/master/absl/status/status.h
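The higher-order gradient fix in the first item concerns code shaped like the sketch below: nested tf.GradientTape contexts around a tf.cond inside a tf.function. The function name and the two branches are hypothetical and chosen only so that the expected value is easy to check by hand.

```python
import tensorflow as tf


@tf.function
def second_derivative(x):
    # The outer tape differentiates the first derivative computed by the inner tape.
    with tf.GradientTape() as outer:
        outer.watch(x)
        with tf.GradientTape() as inner:
            inner.watch(x)
            # Control flow: cube positive inputs, negate the rest.
            y = tf.cond(x > 0.0, lambda: x * x * x, lambda: -x)
        dy_dx = inner.gradient(y, x)
    return outer.gradient(dy_dx, x)


# For x = 2 the taken branch is x**3, whose second derivative is 6 * x.
d2 = second_derivative(tf.constant(2.0))
print(float(d2))
```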
- tf.summary:
  - New tf.summary.graph allows manual write of TensorFlow graph (tf.Graph or tf.compat.v1.GraphDef) as a summary. This is not a replacement for the trace-based API.
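A short sketch of writing a graph manually with tf.summary.graph. The add_one function and the temporary log directory are illustrative assumptions; the graph is taken from a concrete function and written under a default summary writer in eager mode.

```python
import os
import tempfile

import tensorflow as tf


# Hypothetical function whose traced graph will be written out.
@tf.function
def add_one(x):
    return x + 1.0


graph = add_one.get_concrete_function(tf.TensorSpec([], tf.float32)).graph

logdir = tempfile.mkdtemp()
writer = tf.summary.create_file_writer(logdir)
with writer.as_default():
    # Writes the tf.Graph as a summary to the default writer.
    tf.summary.graph(graph)
writer.flush()

event_files = os.listdir(logdir)
```

Pointing TensorBoard at logdir should then show the graph in its Graphs tab.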
- Set /d2ReducedOptimizeHugeFunctions by default for Windows builds. This provides a big compile-time speedup, and effectively raises the minimum supported MSVC version to 16.4 (current: 16.8).
- TensorRT
  - Removed the deprecated session_config parameter for the TF1-TRT converter TrtGraphConverter. Previously, we issued a warning when the value of the parameter was not None.
  - The TF2-TRT converter TrtGraphConverterV2 takes an object of class TrtConversionParams as a parameter. Removed three deprecated fields from this class: rewriter_config_template, is_dynamic_op, and max_batch_size. Previously, we issued a warning when the value of rewriter_config_template was not None. We issued an error when the value of is_dynamic_op was not True. We didn't use the value for max_batch_size for building TensorRT engines. Added the use_dynamic_shape parameter to enable dynamic shape support. The default is to disable dynamic shape support. Added dynamic_shape_profile_strategy for selecting a dynamic shape profile strategy. The default profile strategy is Range.
  - Issue a warning when function get_tensorrt_rewriter_config is used.
- TF XLA
  - Add new enum value MLIR_BRIDGE_ROLLOUT_SAFE_MODE_ENABLED to tf.config.experimental.mlir_bridge_rollout to enable a "safe" mode. This runs the MLIR bridge only when an analysis of the graph determines that it is safe to run.
  - Add new enum value MLIR_BRIDGE_ROLLOUT_SAFE_MODE_FALLBACK_ENABLED to tf.config.experimental.mlir_bridge_rollout to enable a fallback for the MLIR bridge in a "safe" mode. This runs the MLIR bridge in a FallbackEnabled mode when an analysis of the graph determines that the graph does not have unsupported features.
- Other
  - Added show_debug_info to mlir.convert_graph_def and mlir.convert_function.
  - Added Arm Compute Library (ACL) support to --config=mkl_aarch64 build.
Thanks to our Contributors
This release contains contributions from many people at Google, as well as:
8bitmp3, Aaron S. Mondal, Abhilash Mahendrakar, Abhinav Upadhyay, Abhishek Kulkarni, Abolfazl Shahbazi, Adam Hillier, Aditya Kane, Ag Ramesh, ahmedsabie, Albert Villanova Del Moral, Aleksey Vitebskiy, Alex Hoffman, Alexander Bayandin, Alfie Edwards, Aman Kishore, Amogh Joshi, andreABbauer, Andrew Goodbody, Andrzej Pomirski, Artemiy Ryabinkov, Ashish Jha, ather, Ayan Moitra, Bairen Yi, Bart Ribbers, Bas Aarts, Behzad Abghari, Ben Arnao, Ben Barsdell, Benjamin Klimczak, bhack, Brendan Collins, Can Wang, Cheng Ren, Chris Leary, Chris Olivier, Clemens Giuliani, Cloud Han, Corey Cole, Cui, Yifeng, Cuong V. Nguyen, Daniel Moore, Dawid Wojciechowski, Ddavis-2015, Dean Wyatte, Denisa Roberts, dependabot[bot], Dmitry Volodin, Dominic Jack, Duncan Riach, dushuai, Elena Zhelezina, Eli Osherovich, Erik Smistad, ewsn1593, Felix Fent, fo40225, François Chollet, Frederic Bastien, Freedom" Koan-Sin Tan, fsx950223, ganand1, gbaned, Georgiy Manuilov, gerbauz, Guillaume Klein, Guozhong Zhuang, Harry Slatyer, Harsh188, henri, Henri Woodcock, Hiran Sarkar, Hollow Man, Håkon Sandsmark, I Wayan Dharmana, icysapphire, Ikko Ashimine, Jab Hofmeier, Jack Hessel, Jacob Valdez, Jakub Jatczak, James Bernardi, Jared Smolens, Jason Zaman, jedlimlx, Jenny Plunkett, Jens Elofsson, Jerry Shih, jgehw, Jia Fu Low, Jim Fisher, jpodivin, Julien Stephan, Jungsub Lim, Junha Park, Junhyuk So, justkw, Kaixi Hou, kashyapraval, Kasra Bigdeli, Kazuaki Ishizaki, Keith Mok, Kevin Cheng, kopytjuk, Kristian Hartikainen, ksood12345, Kulin Seth, kushanam, latyas, Lequn Chen, Leslie-Fang, Long M. Lưu, Lukas Geiger, machineko, Mahmoud Abuzaina, Manish, Mao Yunfei, Maozhou, Ge, Marcin Juszkiewicz, Marcin Owsiany, Marconi Jiang, Marcos Pereira, Maria Romanenko Vexlard, Maria Vexlard, Marius Brehler, marload, Martin Kubovčík, Matej, Mateusz Holenko, Maxiwell S. 
Garcia, Mazhar, mazharul, mbhuiyan, mdfaijul, Michael Gielda, Michael Kuchnik, Michal Szutenberg, Mikhail Stepanov, Milan Straka, Mitchel Humpherys, Mohamed Moselhy, Mohamed Nour Abouelseoud, Måns Bermell, Måns Nilsson, Nathan Luehr, Nico Jahn, Niroop Ammbashankar, Oceania2018, Omri Steiner, Orivej Desh, Oskar Flordal, oujiafan, Patrik Laurell, Paul B. Isaac'S, Paul Klinger, Pawel Piskorski, Pedro Marques, Phat Tran, Piotr Zierhoffer, piyushdatta, Pnikam-Cad, Prashant Kumar, Prateek Gupta, PratsBhatt, Pravin Karandikar, qqq.jq, QQ喵, Quintin, Rama Ketineni, ravikyram, Rehan Guha, rhdong, rmothukuru, Roger Cheng, Rohit Santhanam, rposts, Rsanthanam-Amd, rsun, Rsun-Bdti, Ryan Kuester, ryanking13, Saduf2019, Sami Kama, Samuel Marks, Scott Tseng, Sean Moriarity, Sergey Popov, Sergii Khomenko, Sheng, Yang, shwetaoj, Sidong-Wei, Simon Maurer, Simrit Kaur, Srini511, Srinivasan Narayanamoorthy, Stephan, Stephen Matthews, Sungmann Cho, Sunoru, Suraj Sudhir, Suraj Upadhyay, Taebum Kim, Takayoshi Koizumi, Tamas Bela Feher, Teng Lu, Thibaut Goetghebuer-Planchon, Tomwildenhain-Microsoft, Tony, Traun Leyden, Trent Lo, TVLIgnacy, Tzu-Wei Sung, vaibhav, Vignesh Kothapalli, Vikram Dattu, viktprog, Vinayaka Bandishti, Vincent Abriou, Vishakha Agrawal, Vivek Panyam, Vladimir Silyaev, Võ Văn Nghĩa, wamuir, Wang, Yanzhang, wangsiyu, Waqar Hameed, wxinix, Xiao Yang, xiaohong1031, Xiaoming (Jason) Cui, Xinan Jiang, Yair Ehrenwald, Yajush Vyas, Yasir Modak, Yimei Sun, Yong Tang, Yosshi999, youshenmebutuo, yqtianust, Yuan Tang, yuanbopeng, Yuriy Chernyshov, Yuta Fukasawa, Zachary Deane-Mayer, Zeno Gantner, Zhoulong Jiang, zhuyie, zilinzhu, 彭震东