Lightning-AI/pytorch-lightning 1.1.0
Model Parallelism Training and More Logging Options


Overview

Lightning 1.1 is out! You can now train models with twice as many parameters and zero code changes, thanks to the new sharded model training. We also ship a new plugin for sequential model parallelism, more logging options, and many other improvements!
Release highlights: https://bit.ly/3gyLZpP

Learn more about sharded training: https://bit.ly/2W3hgI0
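
As a quick illustration of sharded training, here is a minimal sketch. It assumes FairScale is installed, a 4-GPU machine, and that the plugin is selected with the string "ddp_sharded" (see the highlight post above for the authoritative usage); the tiny model and random data are placeholders only.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
import pytorch_lightning as pl


class LitRegressor(pl.LightningModule):
    """Placeholder model: sharded training requires no changes to model code."""

    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 1)

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = torch.nn.functional.mse_loss(self.layer(x), y)
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)


if __name__ == "__main__":
    train_data = DataLoader(
        TensorDataset(torch.randn(256, 32), torch.randn(256, 1)), batch_size=32
    )
    # Assumed plugin key: "ddp_sharded" shards optimizer state and gradients
    # across the GPUs (via FairScale), lowering per-device memory so larger
    # models fit with no model-code changes.
    trainer = pl.Trainer(gpus=4, accelerator="ddp", plugins="ddp_sharded", max_epochs=1)
    trainer.fit(LitRegressor(), train_data)
```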

Detailed changes

Added

  • Added "monitor" key to saved ModelCheckpoints (#4383)
  • Added ConfusionMatrix class interface (#4348)
  • Added multiclass AUROC metric (#4236)
  • Added global step indexing to the checkpoint name for a better sub-epoch checkpointing experience (#3807)
  • Added optimizer hooks in callbacks (#4379)
  • Added option to log momentum (#4384)
  • Added current_score to ModelCheckpoint.on_save_checkpoint (#4721)
  • Added logging with self.log in the training and evaluation epoch-end hooks, as shown in the sketch after this list (#4913)
  • Added ability for DDP plugin to modify optimizer state saving (#4675)
  • Added casting of NumPy scalars to Python types when logging hparams (#4647)
  • Added prefix argument in loggers (#4557)
  • Added printing of the total number of parameters, as well as the trainable and non-trainable counts, in ModelSummary (#4521)
  • Added PrecisionRecallCurve, ROC, and AveragePrecision class metrics (#4549)
  • Added custom Apex and NativeAMP as Precision plugins (#4355)
  • Added DALI MNIST example (#3721)
  • Added a sharded DDP plugin for multi-GPU training memory optimization (#4773)
  • Added experiment_id to the NeptuneLogger (#3462)
  • Added a PyTorch Geometric integration example with Lightning (#4568)
  • Added all_gather method to LightningModule, which allows gradient-based tensor synchronization for use cases such as negative sampling, as shown in the sketch after this list (#5012)
  • Enabled self.log in most functions (#4969)
  • Added changeable extension variable for ModelCheckpoint (#4977)
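
To make a couple of the logging-related additions above concrete (self.log in epoch-end hooks and the new all_gather helper), here is a hedged sketch of a LightningModule. The module body, metric names, and the values being logged are illustrative and not taken from the release.

```python
import torch
import pytorch_lightning as pl


class LitClassifier(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 2)

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = torch.nn.functional.cross_entropy(self.layer(x), y)
        # self.log now works in most hooks, with per-step and per-epoch reduction.
        self.log("train_loss", loss, on_step=True, on_epoch=True)
        return loss

    def training_epoch_end(self, outputs):
        # Logging from an epoch-end hook (#4913): values logged here are
        # recorded once per epoch.
        self.log("finished_epoch", torch.tensor(self.current_epoch, dtype=torch.float32))

    def validation_step(self, batch, batch_idx):
        x, y = batch
        embeddings = self.layer(x)
        # all_gather (#5012) collects a tensor from every process; per the
        # release notes it supports gradient-based synchronization, which is
        # what use cases such as negative sampling rely on.
        gathered = self.all_gather(embeddings)
        self.log("gathered_mean", gathered.mean())

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)
```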

Changed

  • Removed multiclass_roc and multiclass_precision_recall_curve, use roc and precision_recall_curve instead (#4549)
  • Tuner algorithms will be skipped if fast_dev_run=True (#3903)
  • WandbLogger no longer forces the wandb reinit argument to True and now creates a run only when needed (#4648)
  • Changed automatic_optimization to be a model attribute, as sketched after this list (#4602)
  • Changed the SimpleProfiler report to order entries by percentage of time spent and number of calls (#4880)
  • Simplified the optimization logic (#4984)
  • Overhauled the classification metrics (#4837)
  • Updated fast_dev_run to accept an integer specifying the number of batches to run, as sketched after this list (#4629)
  • Refactored optimizer (#4658)
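
Two of the changes above are visible directly at the API surface: fast_dev_run taking a batch count, and automatic_optimization living on the model. The sketch below illustrates both; the property-style override and the manual_backward(loss, opt) signature are assumptions about the 1.1-era API rather than text from these notes.

```python
import torch
import pytorch_lightning as pl


class ManualOptModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(8, 1)

    @property
    def automatic_optimization(self) -> bool:
        # Now a model attribute (#4602): returning False opts into manual
        # optimization instead of passing a Trainer flag.
        return False

    def training_step(self, batch, batch_idx):
        opt = self.optimizers()
        loss = self.layer(batch).sum()
        # Assumed 1.1-era signature: manual_backward also takes the optimizer.
        self.manual_backward(loss, opt)
        opt.step()
        opt.zero_grad()

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.1)


# fast_dev_run now accepts an int (#4629): run exactly 5 batches as a quick
# smoke test instead of a single batch.
trainer = pl.Trainer(fast_dev_run=5)
```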

Deprecated

  • Deprecated prefix argument in ModelCheckpoint (#4765)
  • Deprecated the old way of assigning hyper-parameters through self.hparams = ... (see the sketch after this list) (#4813)
  • Deprecated mode='auto' from ModelCheckpoint and EarlyStopping (#4695)
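
For the deprecated self.hparams assignment, a minimal sketch of the replacement pattern follows, assuming save_hyperparameters() is the intended substitute (it is not named explicitly in the notes above); the module body is illustrative only.

```python
import torch
import pytorch_lightning as pl


class LitModel(pl.LightningModule):
    def __init__(self, hidden_dim: int = 128, lr: float = 1e-3):
        super().__init__()
        # Deprecated pattern (#4813):
        #   self.hparams = {"hidden_dim": hidden_dim, "lr": lr}
        # Replacement: capture the __init__ arguments into self.hparams.
        self.save_hyperparameters()
        self.layer = torch.nn.Linear(32, self.hparams.hidden_dim)

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=self.hparams.lr)
```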

Removed

  • Removed reorder parameter of the auc metric (#5004)

Fixed

  • Added feature to move tensors to CPU before saving (#4309)
  • Fixed LoggerConnector to have logged metrics on root device in DP (#4138)
  • Auto-convert tensors to contiguous format in gather_all (#4907)
  • Fixed PYTHONPATH for DDP test model (#4528)
  • Fixed the logger to support indexing (#4595)
  • Fixed DDP and manual_optimization (#4976)

Contributors

@ananyahjha93, @awaelchli, @blatr, @Borda, @borisdayma, @carmocca, @ddrevicky, @george-gca, @gianscarpe, @irustandi, @janhenriklambrechts, @jeremyjordan, @justusschock, @lezwon, @rohitgr7, @s-rog, @SeanNaren, @SkafteNicki, @tadejsv, @tchaton, @williamFalcon, @zippeurfou

If we forgot someone because their commit email did not match their GitHub account, let us know :]
