Lightning-AI/pytorch-lightning 0.8.0
Metrics, speed improvements, new hooks and flags

Pre-release

Overview

Highlights of this release include the new Metrics package, plus new hooks and flags to customize your workflow.

Major features:

  • brand new Metrics package with built-in DDP support (by @justusschock and @SkafteNicki)
  • hparams can now be anything! (call self.save_hyperparameters() in __init__ to register anything)
  • many speed improvements (optimized how we move data and adjusted some flags; PL now adds only ~300 ms of overhead per epoch!)
  • a much faster ddp implementation; the old one was renamed ddp_spawn
  • better support for Hydra
  • added the overfit_batches flag and corrected some bugs with the limit_[train,val,test]_batches flag
  • added conda support
  • tons of bug fixes 😉
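
The Metrics package is built around an accumulate-then-compute pattern: update running state once per batch, then aggregate at epoch end (in DDP mode the package also syncs this state across processes). A minimal plain-Python sketch of the idea, with a hypothetical `Accuracy` class that is illustrative only, not Lightning's actual implementation:

```python
class Accuracy:
    """Sketch of an accumulating metric: call update() per batch,
    compute() at epoch end. (Illustrative only, not Lightning's class.)"""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def update(self, preds, targets):
        # accumulate running counts instead of storing every batch
        self.correct += sum(p == t for p, t in zip(preds, targets))
        self.total += len(targets)

    def compute(self):
        # aggregate once at the end; in DDP this is where state
        # from all processes would be reduced first
        return self.correct / self.total


metric = Accuracy()
metric.update([1, 0, 1], [1, 1, 1])  # batch 1: 2 of 3 correct
metric.update([0, 0], [0, 1])        # batch 2: 1 of 2 correct
print(metric.compute())              # 3/5 = 0.6
```

Keeping only running counts (rather than all predictions) is what makes the pattern cheap to synchronize across DDP processes.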

Detailed changes

Added

  • Added overfit_batches, limit_{val|test}_batches flags (overfit now uses training set for all three) (#2213)
  • Added metrics
  • Added type hints in Trainer.fit() and Trainer.test() to reflect that also a list of dataloaders can be passed in (#1723)
  • Allow dataloaders without sampler field present (#1907)
  • Added option save_last to save the model at the end of every epoch in ModelCheckpoint (#1908)
  • Early stopping checks on_validation_end (#1458)
  • Attribute best_model_path to ModelCheckpoint for storing and later retrieving the path to the best saved model file (#1799)
  • Speed up single-core TPU training by loading data using ParallelLoader (#2033)
  • Added a model hook transfer_batch_to_device that enables moving custom data structures to the target device (#1756)
  • Added black formatter for the code with code-checker on pull (#1610)
  • Added back the slow spawn ddp implementation as ddp_spawn (#2115)
  • Added loading checkpoints from URLs (#1667)
  • Added a callback method on_keyboard_interrupt for handling KeyboardInterrupt events during training (#2134)
  • Added a decorator auto_move_data that moves data to the correct device when using the LightningModule for inference (#1905)
  • Added ckpt_path option to LightningModule.test(...) to load particular checkpoint (#2190)
  • Added setup and teardown hooks for model (#2229)
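
The transfer_batch_to_device hook exists because Lightning can only auto-move structures it recognizes (tensors, lists, dicts); for a custom batch type you move the fields yourself and return the result. A plain-Python sketch of that idea, where `FakeTensor` and `CustomBatch` are hypothetical stand-ins, not Lightning types:

```python
from dataclasses import dataclass


class FakeTensor:
    """Hypothetical stand-in for a tensor exposing .to(device)."""

    def __init__(self, device="cpu"):
        self.device = device

    def to(self, device):
        return FakeTensor(device)


@dataclass
class CustomBatch:
    inputs: FakeTensor
    targets: FakeTensor


def transfer_batch_to_device(batch, device):
    # hook body: explicitly move each field of the custom structure
    if isinstance(batch, CustomBatch):
        return CustomBatch(batch.inputs.to(device), batch.targets.to(device))
    return batch  # anything else falls back to default handling


moved = transfer_batch_to_device(CustomBatch(FakeTensor(), FakeTensor()), "cuda:0")
print(moved.inputs.device)  # cuda:0
```

In real code the hook is a method on the LightningModule and `device` is a torch device; the shape of the logic is the same.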

Changed

  • Allow user to select individual TPU core to train on (#1729)
  • Removed non-finite values from loss in LRFinder (#1862)
  • Allow passing model hyperparameters as complete kwarg list (#1896)
  • Renamed ModelCheckpoint's attributes best to best_model_score and kth_best_model to kth_best_model_path (#1799)
  • Re-Enable Logger's ImportErrors (#1938)
  • Changed the default value of the Trainer argument weights_summary from full to top (#2029)
  • Raise an error when lightning replaces an existing sampler (#2020)
  • Enabled prepare_data from correct processes - clarify local vs global rank (#2166)
  • Remove explicit flush from tensorboard logger (#2126)
  • Changed epoch indexing to start from 1 instead of 0 (#2206)

Deprecated

  • Deprecated flags: (#2213)
    • overfit_pct in favour of overfit_batches
    • val_percent_check in favour of limit_val_batches
    • test_percent_check in favour of limit_test_batches
  • Deprecated ModelCheckpoint's attributes best and kth_best_model (#1799)
  • Dropped official support/testing for older PyTorch versions <1.3 (#1917)
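
Both the deprecated *_percent_check flags and their limit_*_batches replacements ultimately resolve to an absolute batch count. A hypothetical helper sketching that semantics (a float in [0, 1] is a fraction of the dataloader, an int an absolute count); this is illustrative, not Lightning's internal implementation:

```python
def resolve_limit(limit, num_batches):
    """Resolve a limit flag to an absolute batch count.

    Floats in [0, 1] are fractions of the dataloader length;
    ints are absolute counts. (Illustrative helper only.)
    """
    if isinstance(limit, bool):
        raise ValueError("expected a float fraction or int count")
    if isinstance(limit, int):
        return min(limit, num_batches)
    if isinstance(limit, float) and 0.0 <= limit <= 1.0:
        return int(num_batches * limit)
    raise ValueError(f"unsupported limit: {limit!r}")


print(resolve_limit(0.25, 100))  # 25 batches, like the old val_percent_check=0.25
print(resolve_limit(10, 100))    # 10 batches, e.g. limit_val_batches=10
```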

Removed

  • Removed unintended Trainer argument progress_bar_callback, the callback should be passed in by Trainer(callbacks=[...]) instead (#1855)
  • Removed obsolete self._device in Trainer (#1849)
  • Removed deprecated API (#2073)
    • Packages: pytorch_lightning.pt_overrides, pytorch_lightning.root_module
    • Modules: pytorch_lightning.logging.comet_logger, pytorch_lightning.logging.mlflow_logger, pytorch_lightning.logging.test_tube_logger, pytorch_lightning.overrides.override_data_parallel, pytorch_lightning.core.model_saving, pytorch_lightning.core.root_module
    • Trainer arguments: add_row_log_interval, default_save_path, gradient_clip, nb_gpu_nodes, max_nb_epochs, min_nb_epochs, nb_sanity_val_steps
    • Trainer attributes: nb_gpu_nodes, num_gpu_nodes, gradient_clip, max_nb_epochs, min_nb_epochs, nb_sanity_val_steps, default_save_path, tng_tqdm_dic

Fixed

  • Run graceful training teardown on interpreter exit (#1631)
  • Fixed user warning when apex was used together with learning rate schedulers (#1873)
  • Fixed multiple calls of EarlyStopping callback (#1863)
  • Fixed an issue with Trainer.from_argparse_args when passing in unknown Trainer args (#1932)
  • Fixed bug related to logger not being reset correctly for model after tuner algorithms (#1933)
  • Fixed root node resolution for SLURM cluster with dash in hostname (#1954)
  • Fixed LearningRateLogger in multi-scheduler setting (#1944)
  • Fixed test configuration check and testing (#1804)
  • Fixed an issue with Trainer constructor silently ignoring unknown/misspelt arguments (#1820)
  • Fixed save_weights_only in ModelCheckpoint (#1780)
  • Allow use of same WandbLogger instance for multiple training loops (#2055)
  • Fixed an issue with _auto_collect_arguments collecting local variables that are not constructor arguments and not working for signatures that have the instance not named self (#2048)
  • Fixed mistake in parameters' grad norm tracking (#2012)
  • Fixed CPU and hanging GPU crash (#2118)
  • Fixed an issue with the model summary and example_input_array depending on a specific ordering of the submodules in a LightningModule (#1773)
  • Fixed TPU logging (#2230)
  • Fixed PID port + duplicate rank_zero logging (#2140, #2231)

Contributors

@awaelchli, @baldassarreFe, @Borda, @borisdayma, @cuent, @devashishshankar, @ivannz, @j-dsouza, @justusschock, @kepler, @kumuji, @lezwon, @lgvaz, @LoicGrobol, @mateuszpieniak, @maximsch2, @moi90, @rohitgr7, @SkafteNicki, @tullie, @williamFalcon, @yukw777, @ZhaofengWu

If we forgot someone due to not matching commit email with GitHub account, let us know :]
