Overview
The newest PyTorch Lightning release includes final API clean-up with better data decoupling and shorter logging syntax.
Were happy to release PyTorch Lightning 0.9 today, which contains many great new features, more bugfixes than any release we ever had, but most importantly it introduced our mostly final API changes! Lightning is being adopted by top researchers and AI labs around the world, and we are working hard to make sure we provide a smooth experience and support for all the latest best practices.
Detail changes
Added
- Added SyncBN for DDP (#2801, #2838)
- Added basic
CSVLogger
(#2721) - Added SSIM metrics (#2671)
- Added BLEU metrics (#2535)
- Added support to export a model to ONNX format (#2596)
- Added support for
Trainer(num_sanity_val_steps=-1)
to check all validation data before training (#2246) - Added struct. output:
- Added class
LightningDataModule
(#2668) - Added support for PyTorch 1.6 (#2745)
- Added call DataModule hooks implicitly in trainer (#2755)
- Added support for Mean in DDP Sync (#2568)
- Added remaining
sklearn
metrics:AveragePrecision
,BalancedAccuracy
,CohenKappaScore
,DCG
,Hamming
,Hinge
,Jaccard
,MeanAbsoluteError
,MeanSquaredError
,MeanSquaredLogError
,MedianAbsoluteError
,R2Score
,MeanPoissonDeviance
,MeanGammaDeviance
,MeanTweedieDeviance
,ExplainedVariance
(#2562) - Added support for
limit_{mode}_batches (int)
to work with infinite dataloader (IterableDataset) (#2840) - Added support returning python scalars in DP (#1935)
- Added support to Tensorboard logger for OmegaConf
hparams
(#2846) - Added tracking of basic states in
Trainer
(#2541) - Tracks all outputs including TBPTT and multiple optimizers (#2890)
- Added GPU Usage Logger (#2932)
- Added
strict=False
forload_from_checkpoint
(#2819) - Added saving test predictions on multiple GPUs (#2926)
- Auto log the computational graph for loggers that support this (#3003)
- Added warning when changing monitor and using results obj (#3014)
- Added a hook
transfer_batch_to_device
to theLightningDataModule
(#3038)
Changed
- Truncated long version numbers in progress bar (#2594)
- Enabling val/test loop disabling (#2692)
- Refactored into
accelerator
module: - Using
.comet.config
file forCometLogger
(#1913) - Updated hooks arguments - breaking for
setup
andteardown
(#2850) - Using
gfile
to support remote directories (#2164) - Moved optimizer creation after device placement for DDP backends (#2904](https://github.com/PyTorchLightning/pytorch-lighting/pull/2904))
- Support
**DictConfig
forhparam
serialization (#2519) - Removed callback metrics from test results obj (#2994)
- Re-enabled naming metrics in ckpt name (#3060)
- Changed progress bar epoch counting to start from 0 (#3061)
Deprecated
- Deprecated Trainer attribute
ckpt_path
, which will now be set byweights_save_path
(#2681)
Removed
- Removed deprecated: (#2760)
- core decorator
data_loader
- Module hook
on_sanity_check_start
and loadingload_from_metrics
- package
pytorch_lightning.logging
- Trainer arguments:
show_progress_bar
,num_tpu_cores
,use_amp
,print_nan_grads
- LR Finder argument
num_accumulation_steps
- core decorator
Fixed
- Fixed
accumulate_grad_batches
for last batch (#2853) - Fixed setup call while testing (#2624)
- Fixed local rank zero casting (#2640)
- Fixed single scalar return from training (#2587)
- Fixed Horovod backend to scale LR schedlers with the optimizer (#2626)
- Fixed
dtype
anddevice
properties not getting updated in submodules (#2657) - Fixed
fast_dev_run
to run for all dataloaders (#2581) - Fixed
save_dir
in loggers getting ignored by default value ofweights_save_path
when user did not specifyweights_save_path
(#2681) - Fixed
weights_save_path
getting ignored whenlogger=False
is passed to Trainer (#2681) - Fixed TPU multi-core and Float16 (#2632)
- Fixed test metrics not being logged with
LoggerCollection
(#2723) - Fixed data transfer to device when using
torchtext.data.Field
andinclude_lengths is True
(#2689) - Fixed shuffle argument for the distributed sampler (#2789)
- Fixed logging interval (#2694)
- Fixed loss value in the progress bar is wrong when
accumulate_grad_batches > 1
(#2738) - Fixed correct CWD for DDP sub-processes when using Hydra (#2719)
- Fixed selecting GPUs using
CUDA_VISIBLE_DEVICES
(#2739, #2796) - Fixed false
num_classes
warning in metrics (#2781) - Fixed shell injection vulnerability in subprocess call (#2786)
- Fixed LR finder and
hparams
compatibility (#2821) - Fixed
ModelCheckpoint
not saving the latest information whensave_last=True
(#2881) - Fixed ImageNet example: learning rate scheduler, number of workers and batch size when using DDP (#2889)
- Fixed apex gradient clipping (#2829)
- Fixed save apex scaler states (#2828)
- Fixed a model loading issue with inheritance and variable positional arguments (#2911)
- Fixed passing
non_blocking=True
when transferring a batch object that does not support it (#2910) - Fixed checkpointing to remote file paths (#2925)
- Fixed adding
val_step
argument to metrics (#2986) - Fixed an issue that caused
Trainer.test()
to stall in DDP mode (#2997) - Fixed gathering of results with tensors of varying shape (#3020)
- Fixed batch size auto-scaling feature to set the new value on the correct model attribute (#3043)
- Fixed automatic batch scaling not working with half-precision (#3045)
- Fixed setting device to root GPU (#3042)
Contributors
@ananthsub, @ananyahjha93, @awaelchli, @bkhakshoor, @Borda, @ethanwharris, @f4hy, @groadabike, @ibeltagy, @justusschock, @lezwon, @nateraw, @neighthan, @nsarang, @PhilJd, @pwwang, @rohitgr7, @romesco, @ruotianluo, @shijianjian, @SkafteNicki, @tgaddair, @thschaaf, @williamFalcon, @xmotli02, @ydcjeff, @yukw777, @zerogerc
If we forgot someone due to not matching commit email with GitHub account, let us know :]