github horovod/horovod v0.21.0
Local Gradient Aggregation, Grouped Allreduce

Detailed Changes

Added

  • Added support for backward_passes_per_step > 1 for TF Keras graph mode. (#2346)

  • Added support for backward_passes_per_step > 1 for TF Keras eager execution. (#2371)

  • Added support for backward_passes_per_step > 1 for TF LegacyOptimizer in graph mode. (#2401)

  • Added grouped allreduce to enable more efficient tensor fusion and deterministic training. (#2453)

  • Added support for specifying op and compression in horovod.tensorflow.keras.allreduce(). (#2423)

  • Added support for batched D2D memcopy kernel on GPU. (#2435)

  • Added schema inference in Spark Estimator without sampling. (#2373)

  • Added Store.create("dbfs:/") mapping to DBFSLocalStore("/dbfs/..."). (#2376)
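The backward_passes_per_step entries above refer to local gradient aggregation: each worker accumulates gradients over several backward passes and communicates only once per group of passes. The following is a minimal pure-Python sketch of that idea, not Horovod's implementation. The function names are hypothetical, the allreduce is simulated as an element-wise mean across workers, and the sketch assumes local gradients are summed without being divided by the number of passes.

```python
from typing import List

Vector = List[float]


def allreduce_average(worker_vectors: List[Vector]) -> Vector:
    """Stand-in for an averaging allreduce: element-wise mean across workers."""
    n = len(worker_vectors)
    return [sum(vals) / n for vals in zip(*worker_vectors)]


def aggregated_updates(per_worker_grads: List[List[Vector]],
                       backward_passes_per_step: int) -> List[Vector]:
    """Simulate local gradient aggregation.

    per_worker_grads[w][p] is the gradient produced by worker w on pass p.
    Each worker sums its gradients over `backward_passes_per_step` passes,
    then a single (simulated) allreduce combines the local sums.
    Returns one reduced gradient per communication step.
    """
    num_passes = len(per_worker_grads[0])
    if num_passes % backward_passes_per_step != 0:
        raise ValueError("number of passes must be a multiple of the step size")

    updates = []
    for start in range(0, num_passes, backward_passes_per_step):
        local_sums = []
        for grads in per_worker_grads:
            window = grads[start:start + backward_passes_per_step]
            # Local aggregation: no communication happens here.
            local_sums.append([sum(vals) for vals in zip(*window)])
        # One communication per window instead of one per backward pass.
        updates.append(allreduce_average(local_sums))
    return updates


if __name__ == "__main__":
    grads = [
        [[1.0], [2.0], [3.0], [4.0]],  # worker 0, four backward passes
        [[3.0], [2.0], [1.0], [0.0]],  # worker 1, four backward passes
    ]
    print(len(aggregated_updates(grads, 1)), "communication steps without aggregation")
    print(len(aggregated_updates(grads, 2)), "communication steps with aggregation")
```

With four passes per worker, a step size of 2 halves the number of communication steps compared with a step size of 1, which is the saving the feature targets.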

Changed

  • Changed the Keras callbacks LearningRateScheduleCallback and LearningRateWarmupCallback to require the initial_lr parameter. (#2459)

  • Changed default cycle time from 5ms to 1ms and fusion threshold from 64MB to 128MB. (#2468)
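Jobs tuned against the previous defaults may want to pin the old values. The sketch below assumes this is done through Horovod's HOROVOD_CYCLE_TIME (milliseconds) and HOROVOD_FUSION_THRESHOLD (bytes) environment variables, set before the training process starts. The helper name legacy_tuning_env is hypothetical and only builds the environment mapping; it does not launch anything.

```python
import os
from typing import Dict, Optional

# Previous defaults as stated in the change entry: 5 ms cycle time, 64 MB fusion threshold.
LEGACY_CYCLE_TIME_MS = 5
LEGACY_FUSION_THRESHOLD_BYTES = 64 * 1024 * 1024


def legacy_tuning_env(base: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of `base` (or the current environment) with the
    pre-v0.21.0 cycle time and fusion threshold set explicitly."""
    env = dict(os.environ if base is None else base)
    env["HOROVOD_CYCLE_TIME"] = str(LEGACY_CYCLE_TIME_MS)
    env["HOROVOD_FUSION_THRESHOLD"] = str(LEGACY_FUSION_THRESHOLD_BYTES)
    return env


if __name__ == "__main__":
    env = legacy_tuning_env({})
    for name in sorted(env):
        print(f"{name}={env[name]}")
```

The resulting mapping can be passed as the env argument of subprocess.run when launching a training script, or the two variables can be exported in the shell instead.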

Fixed

  • Fixed support for TensorFlow v2.4.0. (#2381)

  • Fixed averaging with the CUDA half2 implementation for one-element half buffers. (#2375)

  • Fixed HOROVOD_THREAD_AFFINITY when using oneCCL. (#2350)

  • Added timeout to SSH check in horovodrun to prevent hanging. (#2448)

  • Added HOROVOD_GLOO_TIMEOUT_SECONDS value to error messages. (#2436)

  • Fixed race condition in dynamic timeline API. (#2341)

  • Fixed --log-hide-timestamp to apply to driver logs with Gloo. (#2388)
