Highlights

- Added Platform LSF and `jsrun` support to `horovodrun`. (#1805)
- Added support for running Horovod on Spark with Gloo in place of MPI. (#1807)
- Added synchronous batch normalization for the `horovod.torch` API. (#1923)
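As a rough illustration of what synchronous batch normalization computes, the sketch below simulates the cross-worker statistics in pure Python. It is not the `horovod.torch` implementation; `sync_batch_norm_stats` is a hypothetical helper used only to show the arithmetic.

```python
# Hedged sketch: each "worker" holds a local mini-batch, and synchronous
# batch normalization normalizes with the mean/variance of the union of
# all workers' batches (as if allreducing per-worker count, sum, and sum
# of squares), not each worker's local batch alone.

def sync_batch_norm_stats(worker_batches):
    """Global mean and population variance across all workers' samples."""
    n = sum(len(b) for b in worker_batches)          # allreduced count
    s = sum(sum(b) for b in worker_batches)          # allreduced sum
    ss = sum(sum(x * x for x in b) for b in worker_batches)  # sum of squares
    mean = s / n
    var = ss / n - mean * mean
    return mean, var

# Two workers with two samples each; stats cover all four samples.
mean, var = sync_batch_norm_stats([[1.0, 3.0], [5.0, 7.0]])
# mean == 4.0, var == 5.0
```

Without synchronization, each worker above would normalize with its own mean (2.0 and 6.0), which skews statistics when per-worker batches are small.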
Additional changes

- Added support for providing a set of inclusive NICs to `horovodrun`. (#1808)
- Added optional `initial_lr` parameter to `LearningRateScheduleCallback`; deprecated implicit initialization. (#1933)
- Changed Spark Estimators to use Petastorm `BatchDataLoader`. (#1879)
- Changed Spark Estimators to use Petastorm's `make_reader` API. (#1804)
- Improved latency of the background thread loop. (#1880)
- Enabled setting Horovod background thread affinity with all frameworks. (#1881)
- Added `verbose` parameter to `SparkBackend`. (#1922)
- Use parameter names when scheduling broadcasts in MXNet `broadcast_parameters`. (#1894)
- Added a metadata cache when calling `fit_on_parquet`. (#1826)
- Added optional local version to package version. (#1925)
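The `initial_lr` change can be illustrated with a minimal sketch: rather than implicitly inferring the starting rate from the optimizer's state, the schedule scales an explicitly supplied value. `scheduled_lr` below is a hypothetical simplification for illustration, not the callback's actual code.

```python
# Hedged sketch of the schedule semantics: an explicit initial_lr is the
# base rate, and the multiplier (a constant or a callable of the epoch,
# mirroring the callback's constant-or-callable argument) scales it.

def scheduled_lr(initial_lr, multiplier, epoch):
    """Return the learning rate for `epoch` given an explicit initial_lr."""
    m = multiplier(epoch) if callable(multiplier) else multiplier
    return initial_lr * m

# Constant multiplier: rate is fixed at initial_lr * 0.5.
scheduled_lr(0.5, 0.5, epoch=0)                    # 0.25
# Callable multiplier: exponential decay from initial_lr.
scheduled_lr(1.0, lambda e: 0.5 ** e, epoch=2)     # 0.25
```

Passing the base rate explicitly makes the schedule deterministic and independent of whatever rate the optimizer happens to hold when training starts, which is why the implicit form is deprecated.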
Bugfixes

- Fixed module resolution for `tf.keras` optimizers when calling `hvd.load_model`. (#1935)
- Modified `safe_shell_exec` to use multiprocessing spawn instead of fork to prevent deadlocks. (#1915)
- Fixed multiprocessing to support Python 3.8. (#1904)
- Added an extra preprocessor guard for FMA optimization. (#1835)
- Fixed exception in `KerasEstimator` when `num_proc` is larger than 4. (#1945)
- Fixed memory leaks. (#1845)
- Fixed a bug with sample weight in `TorchEstimator`. (#1790)
- Removed `torchvision` from the `pytorch` extra. (#1899)
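The spawn-instead-of-fork fix addresses a classic deadlock: a child forked from a multi-threaded parent inherits the parent's memory, including any lock another thread held at fork time; the lock's owner does not exist in the child, so the first acquire blocks forever. A spawned child starts a fresh interpreter and inherits no lock state. The sketch below only shows how to request the spawn start method explicitly, as the fix does conceptually; it is not Horovod's `safe_shell_exec` code.

```python
import multiprocessing as mp

def spawn_context():
    """Return a multiprocessing context that uses the spawn start method,
    rather than the platform default (fork on Linux)."""
    return mp.get_context("spawn")

ctx = spawn_context()
print(ctx.get_start_method())  # prints "spawn"

# Processes, queues, pools, etc. created from `ctx` all use spawn, so
# children begin in a fresh interpreter with no inherited lock state:
#     p = ctx.Process(target=work)
```

Using a context object scopes the choice to this code path instead of calling `multiprocessing.set_start_method`, which mutates a process-wide global.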