Detailed Changes
Added
-
Added support for backward_passes_per_step > 1 for TF Keras graph mode. (#2346)
-
Added support for backward_passes_per_step > 1 for TF Keras eager execution. (#2371)
-
Added support for backward_passes_per_step > 1 for TF LegacyOptimizer in graph mode. (#2401)
-
Added grouped allreduce to enable more efficient tensor fusion and deterministic training. (#2453)
-
Added support for specifying op and compression in horovod.tensorflow.keras.allreduce(). (#2423)
-
Added support for batched D2D memcopy kernel on GPU. (#2435)
-
Added schema inference in Spark Estimator without sampling. (#2373)
-
Added Store.create("dbfs:/") mapping to DBFSLocalStore("/dbfs/..."). (#2376)
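The backward_passes_per_step entries above concern local gradient accumulation: gradients from several backward passes are combined on each worker before a single allreduce. The following is a minimal plain-Python sketch of that idea, not Horovod code. The helpers simulated_allreduce_mean and accumulate_then_reduce are hypothetical names introduced only for illustration, gradients are reduced to scalars, and any averaging over the accumulated passes is left out.

```python
# Plain-Python sketch of local gradient accumulation; this is not Horovod code.
# Both helper functions below are hypothetical and exist only for illustration.

def simulated_allreduce_mean(per_worker_values):
    """Average one value across workers, standing in for an allreduce."""
    return sum(per_worker_values) / len(per_worker_values)


def accumulate_then_reduce(per_worker_grads, backward_passes_per_step):
    """Sum each worker's gradients locally over several backward passes,
    then communicate once per group of passes instead of once per pass.

    per_worker_grads: one list of per-pass scalar gradients per worker.
    Returns one reduced gradient per optimizer step.
    """
    num_passes = len(per_worker_grads[0])
    reduced = []
    for start in range(0, num_passes, backward_passes_per_step):
        # Local accumulation: no communication happens here.
        local_sums = [
            sum(grads[start:start + backward_passes_per_step])
            for grads in per_worker_grads
        ]
        # A single simulated allreduce for the whole group of passes.
        reduced.append(simulated_allreduce_mean(local_sums))
    return reduced


# Two workers, four backward passes each, reducing every two passes.
steps = accumulate_then_reduce([[1.0, 2.0, 3.0, 4.0], [3.0, 2.0, 1.0, 0.0]], 2)
```

With four passes and backward_passes_per_step set to 2, the sketch performs two simulated allreduce calls rather than four, which is the communication saving the option is meant to provide.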
Changed
-
Changed Keras callbacks to require parameter initial_lr of LearningRateScheduleCallback and LearningRateWarmupCallback. (#2459)
-
Changed default cycle time from 5ms to 1ms and fusion threshold from 64MB to 128MB. (#2468)
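For jobs tuned against the earlier defaults, one possible way to restore them is through the HOROVOD_CYCLE_TIME (milliseconds) and HOROVOD_FUSION_THRESHOLD (bytes) environment variables. The sketch below assumes the variables are set before Horovod is initialized in the process. The constant names are illustrative only, and 64MB is taken here to mean 64 * 1024 * 1024 bytes.

```python
import os

# Previous defaults as stated in the entry above: 5 ms and 64 MB.
# The constant names are illustrative; 64 MB is assumed to mean 64 * 1024 * 1024 bytes.
PREVIOUS_CYCLE_TIME_MS = 5
PREVIOUS_FUSION_THRESHOLD_BYTES = 64 * 1024 * 1024

# Assumption: these are set before Horovod is initialized in this process.
os.environ["HOROVOD_CYCLE_TIME"] = str(PREVIOUS_CYCLE_TIME_MS)
os.environ["HOROVOD_FUSION_THRESHOLD"] = str(PREVIOUS_FUSION_THRESHOLD_BYTES)
```

Exporting the same two variables in the launching shell should be equivalent; setting them from Python is shown only to keep the example self-contained.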
Fixed
-
Fixed support for TensorFlow v2.4.0. (#2381)
-
Fixed averaging using the CUDA half2 implementation on one-element half buffers. (#2375)
-
Fixed HOROVOD_THREAD_AFFINITY when using oneCCL. (#2350)
-
Added timeout to SSH check in horovodrun to prevent hanging. (#2448)
-
Added HOROVOD_GLOO_TIMEOUT_SECONDS value to error messages. (#2436)
-
Fixed race condition in dynamic timeline API. (#2341)
-
Fixed --log-hide-timestamp to apply to driver logs with Gloo. (#2388)