[1.6.4] - 2022-06-01
Added
- Added all DDP parameters to be exposed through the HPU parallel strategy (#13067)
Changed
- Keep `torch.backends.cudnn.benchmark=False` by default (unlike in v1.6.{0-4}) after speed and memory problems depending on the data used. Please consider tuning `Trainer(benchmark)` manually; a short usage sketch follows this list. (#13154)
- Prevent modification of `torch.backends.cudnn.benchmark` when `Trainer(benchmark=...)` is not set (#13154)
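For reference, a minimal sketch of how the `benchmark` flag can now be controlled explicitly; the other `Trainer` arguments here are illustrative and not part of the change itself:

```python
import torch
from pytorch_lightning import Trainer

# With `benchmark` left unset, the Trainer no longer touches the global flag,
# so any value the user set beforehand is preserved.
torch.backends.cudnn.benchmark = True  # user-level choice, left as-is
trainer = Trainer(max_epochs=1)        # `benchmark` not passed

# Opting in explicitly asks cuDNN to auto-tune convolution algorithms, which can
# help with fixed input shapes at the cost of extra memory.
trainer = Trainer(max_epochs=1, benchmark=True)
```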
Fixed
- Fixed an issue causing zero-division error for empty dataloaders (#12885)
- Fixed mismatching default values for the types of some arguments in the DeepSpeed and Fully-Sharded strategies which made the CLI unable to use them (#12989)
- Avoid redundant callback restore warning while tuning (#13026)
- Fixed `Trainer(precision=64)` during evaluation, which now uses the wrapped precision module (#12983)
- Fixed an issue so that the wrapped `LightningModule` is used for evaluation during `trainer.fit` with `BaguaStrategy` (#12983)
- Fixed unnecessary usage of the Habana mixed precision package for fp32 types (#13028)
- Fixed the number of references of `LightningModule` so it can be deleted (#12897)
- Fixed `materialize_module` setting a module's child recursively (#12870)
- Fixed issue where the CLI could not pass a `Profiler` to the `Trainer` (#13084)
- Fixed torchelastic detection with non-distributed installations (#13142)
- Fixed logging's step values when multiple dataloaders are used during evaluation (#12184)
- Fixed epoch logging on train epoch end (#13025)
- Fixed `DDPStrategy` and `DDPSpawnStrategy` to initialize optimizers only after moving the module to the device (see the ordering sketch below) (#11952)
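For context, the ordering this fix enforces inside the strategies matches the general PyTorch guidance: construct optimizers only after the model's parameters are on their target device, so that optimizer state is allocated against the on-device tensors. A plain-PyTorch sketch of that ordering (the model and hyperparameters are illustrative only):

```python
import torch
from torch import nn, optim

model = nn.Linear(32, 4)

# Move the parameters to the target device *before* constructing the optimizer.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)

# The optimizer now references the on-device parameters.
optimizer = optim.SGD(model.parameters(), lr=0.1)
```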
Contributors
@akihironitta @ananthsub @ar90n @awaelchli @Borda @carmocca @dependabot @jerome-habana @mads-oestergaard @otaj @rohitgr7