
Releases: Lightning-AI/pytorch-lightning

Quick patch release

23 Apr 09:09

Fixed the missing packaging package in the dependencies, which only affected installation into a very bare (blank) environment.

Standard weekly patch release

22 Apr 20:56

[1.2.9] - 2021-04-20

Fixed

  • Fixed the call order for world ranks and the root_device property in TPUSpawnPlugin (#7074)
  • Fixed multi-gpu join for Horovod (#6954)
  • Fixed parsing for pre-release package versions (#6999)

Contributors

@irasit @Borda @kaushikb11

Standard weekly patch release

14 Apr 19:56

[1.2.8] - 2021-04-14

Added

  • Added TPUSpawn + IterableDataset error message (#6875)

Fixed

  • Fixed process rank not being available right away after Trainer instantiation (#6941)
  • Fixed sync_dist for TPUs (#6950)
  • Fixed AttributeError for require_backward_grad_sync when running manual optimization with the sharded plugin (#6915)
  • Fixed the --gpus default for the parser returned by Trainer.add_argparse_args (#6898); see the sketch after this list
  • Fixed TPU Spawn all gather (#6896)
  • Fixed EarlyStopping logic when min_epochs or min_steps requirement is not met (#6705)
  • Fixed csv extension check (#6436)
  • Fixed checkpoint issue when using Horovod distributed backend (#6958)
  • Fixed tensorboard exception raising (#6901)
  • Fixed setting the eval/train flag correctly on accelerator model (#6983)
  • Fixed DDP_SPAWN compatibility with bug_report_model.py (#6892)
  • Fixed bug where BaseFinetuning.flatten_modules() was duplicating leaf node parameters (#6879)
  • Set better defaults for rank_zero_only.rank when training is launched with SLURM and torchelastic:
    • Support SLURM and torchelastic global rank environment variables (#5715)
    • Remove hardcoding of local rank in accelerator connector (#6878)
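
For context on the --gpus default mentioned above, here is a minimal, hypothetical sketch of the argparse integration involved; the script structure is an assumption, only Trainer.add_argparse_args and Trainer.from_argparse_args come from the library.

```python
from argparse import ArgumentParser

from pytorch_lightning import Trainer

if __name__ == "__main__":
    parser = ArgumentParser()
    # Adds every Trainer argument (including --gpus) to the parser.
    parser = Trainer.add_argparse_args(parser)
    args = parser.parse_args()

    # Builds the Trainer from the parsed namespace; after the fix, --gpus keeps
    # its intended default unless it is passed explicitly on the command line.
    trainer = Trainer.from_argparse_args(args)
```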

Contributors

@ananthsub @awaelchli @ethanwharris @justusschock @kandluis @kaushikb11 @liob @SeanNaren @skmatz

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Standard weekly patch release

07 Apr 17:58

[1.2.7] - 2021-04-06

Fixed

  • Fixed a bug with omegaconf and xm.save (#6741)
  • Fixed an issue with IterableDataset when len is not defined (#6828)
  • Sanitize None params during pruning (#6836)
  • Enforce an epoch scheduler interval when using SWA (#6588)
  • Fixed TPU Colab hang issue after training (#6816)
  • Fixed a bug where TensorBoardLogger would give a warning and not log correctly to a symbolic link save_dir (#6730)

Contributors

@awaelchli, @ethanwharris, @karthikprasad, @kaushikb11, @mibaumgartner, @tchaton

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Standard weekly patch release

30 Mar 14:46

[1.2.6] - 2021-03-30

Changed

  • Changed the behavior of on_epoch_start to also run at the beginning of validation and test epochs (#6498)
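
As a hedged illustration of the behavior change above (the callback name is made up), a hook like this now also fires at the start of validation and test epochs, not only training epochs:

```python
from pytorch_lightning import Callback


class EpochStartLogger(Callback):
    def on_epoch_start(self, trainer, pl_module):
        # After #6498 this runs at the beginning of train, validation,
        # and test epochs.
        print(f"Starting an epoch (current_epoch={trainer.current_epoch})")
```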

Removed

  • Removed legacy code to include step dictionary returns in callback_metrics. Use self.log_dict instead. (#6682)
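
A minimal sketch of the recommended replacement, assuming a toy module (layer sizes and metric names are illustrative, not from the release notes):

```python
import torch
from pytorch_lightning import LightningModule


class ToyModule(LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 2)

    def training_step(self, batch, batch_idx):
        x, y = batch
        logits = self.layer(x)
        loss = torch.nn.functional.cross_entropy(logits, y)
        acc = (logits.argmax(dim=1) == y).float().mean()
        # Step dictionaries no longer feed callback_metrics; log explicitly instead.
        self.log_dict({"train_loss": loss, "train_acc": acc}, on_step=True, on_epoch=True)
        return loss

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.1)
```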

Fixed

  • Fixed DummyLogger.log_hyperparams raising a TypeError when running with fast_dev_run=True (#6398)
  • Fixed error on TPUs when there was no ModelCheckpoint (#6654)
  • Fixed trainer.test freeze on TPUs (#6654)
  • Fixed a bug where gradients were disabled after calling Trainer.predict (#6657)
  • Fixed bug where no TPUs were detected in a TPU pod env (#6719)

Contributors

@awaelchli, @carmocca, @ethanwharris, @kaushikb11, @rohitgr7, @tchaton

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Weekly patch release - torchmetrics compatibility

24 Mar 15:17

[1.2.5] - 2021-03-23

Changed

  • Added autocast in validation, test and predict modes for Native AMP (#6565); see the sketch after this list
  • Updated gradient clipping for the TPU Accelerator (#6576)
  • Refactored setup to be typing-friendly (#6590)
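
A hedged sketch of the setup the autocast change affects; the model and datamodule are assumed to exist and are not shown:

```python
from pytorch_lightning import Trainer

# Native AMP: pass precision=16 on a GPU machine.
trainer = Trainer(gpus=1, precision=16)

# trainer.fit(model, datamodule=dm)
# After #6565, autocast is also active during evaluation-style loops:
# trainer.test(model, datamodule=dm)
# trainer.predict(model, datamodule=dm)
```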

Fixed

  • Fixed a bug where all_gather would not work correctly with tpu_cores=8 (#6587); see the sketch after this list
  • Fixed comparing required versions (#6434)
  • Fixed duplicate logs appearing in console when using the python logging module (#6275)
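
An illustrative sketch of the all_gather call touched by the tpu_cores=8 fix; the module and the gathered metric are assumptions, not the actual reproduction:

```python
import torch
from pytorch_lightning import LightningModule


class GatherExample(LightningModule):
    def validation_step(self, batch, batch_idx):
        local_value = torch.tensor(1.0, device=self.device)
        # Gathers the tensor from every process; with tpu_cores=8 the result
        # has a leading dimension of 8 after #6587.
        gathered = self.all_gather(local_value)
        self.log("gathered_mean", gathered.float().mean())
```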

Contributors

@awaelchli, @Borda, @ethanwharris, @justusschock, @kaushikb11

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Standard weekly patch release

16 Mar 20:29

[1.2.4] - 2021-03-16

Changed

  • Changed the default of find_unused_parameters back to True in DDP and DDP Spawn (#6438)
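
With the default restored to True, users who know all parameters participate in the backward pass can still opt out for speed; a minimal sketch assuming a 2-GPU DDP setup:

```python
from pytorch_lightning import Trainer
from pytorch_lightning.plugins import DDPPlugin

trainer = Trainer(
    gpus=2,
    accelerator="ddp",
    # Extra keyword arguments are forwarded to torch's DistributedDataParallel.
    plugins=DDPPlugin(find_unused_parameters=False),
)
```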

Fixed

  • Expose DeepSpeed loss parameters to allow users to fix loss instability (#6115)
  • Fixed DP reduction with collection (#6324)
  • Fixed an issue where the tuner would not tune the learning rate if also tuning the batch size (#4688)
  • Fixed broadcast to use PyTorch broadcast_object_list and add reduce_decision (#6410)
  • Fixed logger creating directory structure too early in DDP (#6380)
  • Fixed DeepSpeed additional memory use on rank 0 when default device not set early enough (#6460)
  • Fixed DummyLogger.log_hyperparams raising a TypeError when running with fast_dev_run=True (#6398)
  • Fixed an issue with Tuner.scale_batch_size not finding the batch size attribute in the datamodule (#5968); see the sketch after this list
  • Fixed an exception in the layer summary when the model contains torch.jit scripted submodules (#6511)
  • Fixed the train loop config validation being run during Trainer.predict (#6541)
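
A sketch of the tuner path addressed by #5968, assuming a DataModule that exposes a batch_size attribute (names and sizes are illustrative):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

from pytorch_lightning import LightningDataModule, Trainer


class RandomDataModule(LightningDataModule):
    def __init__(self, batch_size=16):
        super().__init__()
        self.batch_size = batch_size  # the attribute the batch size finder scales

    def train_dataloader(self):
        dataset = TensorDataset(torch.randn(256, 32), torch.randint(0, 2, (256,)))
        return DataLoader(dataset, batch_size=self.batch_size)


# trainer = Trainer(auto_scale_batch_size=True)
# trainer.tune(model, datamodule=RandomDataModule())  # `model` assumed to be defined
```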

Contributors

@awaelchli, @kaushikb11, @Palzer, @SeanNaren, @tchaton

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Standard weekly patch release

09 Mar 17:28

[1.2.3] - 2021-03-09

Fixed

  • Fixed ModelPruning(make_pruning_permanent=True) pruning buffers getting removed when saved during training (#6073)
  • Fixed _stable_1d_sort to work when n >= N (#6177)
  • Fixed AttributeError when logger=None on TPU (#6221)
  • Fixed PyTorch Profiler with emit_nvtx (#6260)
  • Fixed trainer.test from best_path hangs after calling trainer.fit (#6272)
  • Fixed SingleTPU calling all_gather (#6296)
  • Ensure we check deepspeed/sharded in multinode DDP (#6297)
  • Check LightningOptimizer doesn't delete optimizer hooks (#6305)
  • Resolve memory leak for evaluation (#6326)
  • Ensure that clip gradients is only called if the value is greater than 0 (#6330)
  • Fixed Trainer not resetting lightning_optimizers when calling Trainer.fit() multiple times (#6372)

Contributors

@awaelchli, @carmocca, @chizuchizu, @frankier, @SeanNaren, @tchaton

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Standard weekly patch release

05 Mar 15:12

[1.2.2] - 2021-03-02

Added

  • Added checkpoint parameter to callback's on_save_checkpoint hook (#6072)
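
A hedged sketch of a callback using the new argument (the key written into the checkpoint is hypothetical):

```python
from pytorch_lightning import Callback


class AddRunMetadata(Callback):
    def on_save_checkpoint(self, trainer, pl_module, checkpoint):
        # `checkpoint` is the dictionary about to be written to disk (#6072);
        # mutating it here stores extra metadata alongside the weights.
        checkpoint["run_metadata"] = {"epoch": trainer.current_epoch}
```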

Changed

  • Changed the order of backward, step, zero_grad to zero_grad, backward, step (#6147); see the sketch after this list
  • Changed default for DeepSpeed CPU Offload to False, due to prohibitively slow speeds at smaller scale (#6262)
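
A plain-PyTorch sketch (not Lightning internals) of the call order the automatic optimization loop now follows; the tiny model and data are illustrative:

```python
import torch

model = torch.nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
batches = [(torch.randn(8, 4), torch.randn(8, 1)) for _ in range(3)]

for x, y in batches:
    optimizer.zero_grad()                                  # 1. clear gradients first
    loss = torch.nn.functional.mse_loss(model(x), y)
    loss.backward()                                        # 2. then backpropagate
    optimizer.step()                                       # 3. then update parameters
```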

Fixed

  • Fixed epoch-level schedulers not being called when val_check_interval < 1.0 (#6075); see the sketch after this list
  • Fixed multiple early stopping callbacks (#6197)
  • Fixed incorrect usage of detach(), cpu(), to() (#6216)
  • Fixed LBFGS optimizer support which didn't converge in automatic optimization (#6147)
  • Prevent WandbLogger from dropping values (#5931)
  • Fixed an error thrown when using a valid distributed mode in multi-node training (#6297)
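
A sketch of the configuration that #6075 fixes: an epoch-interval scheduler combined with mid-epoch validation. The module below is a minimal assumption:

```python
import torch
from pytorch_lightning import LightningModule, Trainer


class SchedulerExample(LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(4, 1)

    def training_step(self, batch, batch_idx):
        x, y = batch
        return torch.nn.functional.mse_loss(self.layer(x), y)

    def configure_optimizers(self):
        optimizer = torch.optim.SGD(self.parameters(), lr=0.1)
        scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1)
        # "interval": "epoch" asks Lightning to step the scheduler once per epoch,
        # which is the case the fix restores when validation runs mid-epoch.
        return [optimizer], [{"scheduler": scheduler, "interval": "epoch"}]


# trainer = Trainer(val_check_interval=0.5, max_epochs=2)
# trainer.fit(SchedulerExample(), train_dataloader, val_dataloader)  # loaders assumed defined
```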

Contributors

@akihironitta, @borisdayma, @carmocca, @dvolgyes, @SeanNaren, @SkafteNicki

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Standard weekly patch release

24 Feb 17:13

[1.2.1] - 2021-02-23

Fixed

  • Fixed incorrect yield logic for the amp autocast context manager (#6080)
  • Fixed priority of plugin/accelerator when setting distributed mode (#6089)
  • Fixed error message for AMP + CPU incompatibility (#6107)

Contributors

@awaelchli, @SeanNaren, @carmocca

If we forgot someone due to not matching commit email with GitHub account, let us know :]