github horovod/horovod v0.19.2
Platform LSF support, Spark on Gloo, and Sync Batch Norm

4 years ago

Highlights

  • Added Platform LSF and jsrun support to horovodrun. (#1805)
  • Added support for running Horovod on Spark with Gloo in place of MPI. (#1807)
  • Added synchronous batch normalization to the horovod.torch API. (#1923)
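
As a rough illustration of the idea behind synchronous batch normalization, the sketch below simulates workers that pool their batch statistics before normalizing, so every worker uses the same mean and variance. It is not Horovod code: the helper names (local_stats, allreduce_sum, sync_batch_norm) are hypothetical, and the allreduce is faked by summing Python tuples in a single process.

```python
# Illustrative only: simulates the statistics that synchronous batch norm
# shares across workers. Not Horovod code; all names here are hypothetical.
from math import sqrt


def local_stats(batch):
    """Per-worker partial sums: (count, sum, sum of squares)."""
    return len(batch), sum(batch), sum(x * x for x in batch)


def allreduce_sum(per_worker):
    """Stand-in for an allreduce: element-wise sum of every worker's tuple."""
    return tuple(sum(vals) for vals in zip(*per_worker))


def sync_batch_norm(worker_batches, eps=1e-5):
    """Normalize each worker's batch with statistics of the combined batch."""
    count, total, total_sq = allreduce_sum(
        [local_stats(b) for b in worker_batches]
    )
    mean = total / count
    var = total_sq / count - mean * mean  # biased variance of the global batch
    return [[(x - mean) / sqrt(var + eps) for x in b] for b in worker_batches]


workers = [[1.0, 2.0], [3.0, 4.0, 5.0]]  # two simulated workers
normalized = sync_batch_norm(workers)
```

With plain per-worker batch normalization, each worker would instead normalize with its own local mean and variance, which becomes noisy when per-worker batches are small.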

Additional changes

  • Added support for providing horovodrun with a set of NICs to include. (#1808)
  • Added optional initial_lr parameter to LearningRateScheduleCallback and deprecated implicit initialization. (#1933)
  • Changed Spark Estimators to use Petastorm BatchDataLoader. (#1879)
  • Changed Spark Estimators to use Petastorm's make_reader API. (#1804)
  • Improved latency of background thread loop. (#1880)
  • Enabled setting Horovod background thread affinity with all frameworks. (#1881)
  • Added verbose parameter to SparkBackend. (#1922)
  • Changed MXNet broadcast_parameters to use parameter names when scheduling broadcasts. (#1894)
  • Added a metadata cache for calls to fit_on_parquet. (#1826)
  • Added optional local version to package version. (#1925)

Bugfixes

  • Fixed module resolution for tf.keras optimizers when calling hvd.load_model. (#1935)
  • Modified safe_shell_exec to use multiprocessing spawn instead of fork to prevent deadlocks. (#1915)
  • Fixed multiprocessing to support Python 3.8. (#1904)
  • Added extra preprocessor guard for FMA optimization. (#1835)
  • Fixed exception in KerasEstimator when num_proc is larger than 4. (#1945)
  • Fixed memory leaks. (#1845)
  • Fixed a bug with sample weight in TorchEstimator. (#1790)
  • Removed torchvision from pytorch extra. (#1899)
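
For readers unfamiliar with the spawn-versus-fork distinction behind #1915, here is a minimal sketch using only the Python standard library. It does not reproduce Horovod's safe_shell_exec; the helper name run_isolated is hypothetical. A spawned child starts a fresh interpreter rather than inheriting a copy of the parent's threads and locks, which is the usual reason spawn avoids fork-related deadlocks.

```python
import multiprocessing as mp


def run_isolated(func, *args):
    """Run func(*args) in a child started with 'spawn' and return the result.

    Hypothetical helper for illustration; func and args must be picklable.
    """
    ctx = mp.get_context("spawn")  # explicit context, leaves the global default alone
    with ctx.Pool(processes=1) as pool:
        return pool.apply(func, args)


if __name__ == "__main__":
    # The guard matters with spawn: the child re-imports the main module,
    # and unguarded process creation would recurse.
    print(run_isolated(pow, 2, 10))
```

Using get_context rather than set_start_method keeps the choice local to this helper, so other code in the same process can still pick its own start method.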
