Overview
This release focuses on fixing particular issues and improving the development experience by extending the docs, adding type hints, and supporting Python 3.8. In particular, some of the release highlights are:
- Added benchmarks comparing Lightning with vanilla implementations
- Extended optimizer support with per-optimizer frequencies
- Several logger improvements, such as representing non-primitive types and supporting hierarchical dictionaries for hyperparameter searchers
- Added model configuration checking before training runs
- Simplified the PL examples structure (shallower and more readable)
- Improved Trainer CLI argument handling (generalization)
- Two Trainer arguments became deprecated: `print_nan_grads` and `show_progress_bar`
Detail changes
Added
- Added same step loggers' metrics aggregation (#1278)
- Added parity test between a vanilla MNIST model and lightning model (#1284)
- Added parity test between a vanilla RNN model and lightning model (#1351)
- Added Reinforcement Learning - Deep Q-network (DQN) lightning example (#1232)
- Added support for hierarchical `dict` (#1152)
- Added `TrainsLogger` class (#1122)
- Added type hints to `pytorch_lightning.core` (#946)
- Added support for `IterableDataset` in validation and testing (#1104)
- Added support for non-primitive types in `hparams` for `TensorboardLogger` (#1130)
- Added a check that stops the training when loss or weights contain `NaN` or `inf` values (#1097)
- Added support for `IterableDataset` when `val_check_interval=1.0` (default); this will trigger validation at the end of each epoch (#1283)
- Added `summary` method to Profilers (#1259)
- Added informative errors if user defined dataloader has zero length (#1280)
- Added testing for Python 3.8 (#915)
- Added a `training_epoch_end` method which is the mirror of `validation_epoch_end` (#1357)
- Added model configuration checking (#1199)
- Added support for optimizer frequencies through `LightningModule.configure_optimizers()` (#1269); see the sketch after this list
- Added option to run without an optimizer by returning `None` from `configure_optimizers` (#1279)
- Added a warning when the number of data loader workers is small (#1378)
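A minimal sketch of the new `training_epoch_end` hook (#1357) and optimizer frequencies (#1269); the module, dictionary keys, and values below are assumptions made for illustration, not an excerpt from the Lightning examples:

```python
import torch
import pytorch_lightning as pl


class SketchModule(pl.LightningModule):  # hypothetical module, for illustration only
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(4, 2)

    def forward(self, x):
        return self.layer(x)

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = torch.nn.functional.cross_entropy(self(x), y)
        return {"loss": loss}

    def training_epoch_end(self, outputs):
        # Mirror of validation_epoch_end (#1357): aggregate per-step outputs once per epoch.
        avg_loss = torch.stack([out["loss"] for out in outputs]).mean()
        return {"log": {"train_loss_epoch": avg_loss}}

    def configure_optimizers(self):
        # Assumed dict format for optimizer frequencies (#1269): each optimizer is used
        # for `frequency` consecutive batches before switching to the next one.
        opt_a = torch.optim.SGD(self.parameters(), lr=0.01)
        opt_b = torch.optim.Adam(self.parameters(), lr=0.001)
        return [
            {"optimizer": opt_a, "frequency": 5},
            {"optimizer": opt_b, "frequency": 1},
        ]
        # Returning None instead (#1279) runs training without any optimizer.
```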
Changed
- Changed (renamed and refactored) `TensorRunningMean` -> `TensorRunningAccum`: running accumulations were generalized (#1278)
- Changed `progress_bar_refresh_rate` trainer flag to disable the progress bar when set to 0 (#1108)
- Enhanced `load_from_checkpoint` to also forward params to the model (#1307); see the sketch after this list
- Updated references to `self.forward()` to instead use the `__call__` interface (#1211)
- Changed default behaviour of `configure_optimizers` to use no optimizer rather than Adam (#1279)
- Allow uploading models on W&B (#1339)
- On DP and DDP2, unsqueezing is now automated (#1319)
- No longer always creates a `DataLoader` during reinstantiation, but keeps the same type as before (if a subclass of `DataLoader`) (#1346)
- No longer interferes with a default sampler (#1318)
- Removed default Adam optimizer (#1317)
- Added warnings for unimplemented required lightning methods (#1317)
- Made `evaluate` method private >> `Trainer._evaluate(...)` (#1260)
- Simplified the PL examples structure (shallower and more readable) (#1247)
- Changed min-max GPU memory to be on their own plots (#1358)
- Removed `.item` which causes sync issues (#1254)
- Changed smoothing in TQDM to decrease variability of time remaining between training/eval (#1194)
- Changed the default logger to a dedicated one (#1064)
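A hedged usage sketch of the `load_from_checkpoint` change (#1307): extra arguments passed at load time are, per the entry above, forwarded to the model. The module class, constructor argument, and checkpoint path below are placeholders, not part of the release.

```python
import torch
import pytorch_lightning as pl


class CheckpointSketch(pl.LightningModule):  # hypothetical module, for illustration only
    def __init__(self, hidden_dim=8):
        super().__init__()
        self.layer = torch.nn.Linear(4, hidden_dim)

    def forward(self, x):
        return self.layer(x)


# Assumed behaviour per #1307: extra kwargs given to load_from_checkpoint are
# forwarded to the module's __init__ (the checkpoint path is a placeholder).
model = CheckpointSketch.load_from_checkpoint("path/to/checkpoint.ckpt", hidden_dim=16)
```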
Deprecated
- Deprecated Trainer argument `print_nan_grads` (#1097)
- Deprecated Trainer argument `show_progress_bar` (#1108); see the sketch after this list
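Based on the entries above, a small sketch of how the deprecated flags map onto current behaviour: the `NaN`/`inf` check from #1097 now runs automatically, and the progress bar is controlled through `progress_bar_refresh_rate` (#1108). Treat the exact mapping as an assumption drawn from this changelog rather than a documented migration guide.

```python
import pytorch_lightning as pl

# Before (now deprecated):
# trainer = pl.Trainer(print_nan_grads=True, show_progress_bar=False)

# After (assumed equivalents based on the Added/Changed entries above):
trainer = pl.Trainer(
    progress_bar_refresh_rate=0,  # 0 disables the progress bar (#1108)
    # NaN/inf detection in loss and weights is now built in (#1097),
    # so no flag is needed for it.
)
```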
Removed
- Removed duplicated module `pytorch_lightning.utilities.arg_parse` for loading CLI arguments (#1167)
- Removed wandb logger's `finalize` method (#1193)
- Dropped `torchvision` dependency in tests and added own MNIST dataset class instead (#986)
Fixed
- Fixed `model_checkpoint` when saving all models (#1359)
- Fixed `Trainer.add_argparse_args` classmethod; now it adds a type for the arguments (#1147)
- Fixed bug related to type checking of `ReduceLROnPlateau` lr schedulers (#1114)
- Fixed a bug to ensure lightning checkpoints are backward compatible (#1132)
- Fixed a bug that created an extra dataloader with active `reload_dataloaders_every_epoch` (#1181)
- Fixed all warnings and errors in the docs build process (#1191)
- Fixed an issue where `val_percent_check=0` would not disable validation (#1251)
- Fixed average of incomplete `TensorRunningMean` (#1309)
- Fixed `WandbLogger.watch` with `wandb.init()` (#1311)
- Fixed an issue with early stopping that would prevent it from monitoring training metrics when validation is disabled / not implemented (#1235)
- Fixed a bug that would cause `trainer.test()` to run on the validation set when overloading `validation_epoch_end` and `test_end` (#1353)
- Fixed `WandbLogger.watch` - use of the watch method without importing `wandb` (#1311)
- Fixed `WandbLogger` to be used with 'ddp' - allow reinits in sub-processes (#1149, #1360)
- Made `training_epoch_end` behave like `validation_epoch_end` (#1357)
- Fixed `fast_dev_run` running validation twice (#1365)
- Fixed pickle error from quick patch `__code__` (#1352)
- Fixed memory leak on GPU0 (#1094, #1349)
- Fixed checkpointing interval (#1272)
- Fixed validation and training loops running on the partial dataset (#1192)
- Fixed running `on_validation_end` only on main process in DDP (#1125)
- Fixed `load_spawn_weights` to run only in process rank 0 (#1385)
- Fixed `use_amp` issue (#1145)
- Fixed using deprecated `use_amp` attribute (#1145)
- Fixed TensorBoard logger error: `lightning_logs` directory does not exist in multi-node DDP on nodes with rank != 0 (#1375)
- Fixed `Unimplemented backend XLA` error on TPU (#1387)
Contributors
@alexeykarnachev, @amoudgl, @areshytko, @asafmanor, @awaelchli, @bkkaggle, @bmartinn, @Borda, @borisdayma, @cmpute, @djbyrne, @ethanwharris, @gerardrbentley, @jbschiratti, @jeremyjordan, @justusschock, @monney, @mpariente, @pertschuk, @rmrao, @S-aiueo32, @shubhamagarwal92, @SkafteNicki, @sneiman, @tullie, @vanpelt, @williamFalcon, @xingzhaolee
If we forgot someone due to not matching commit email with GitHub account, let us know :]