Releases: Lightning-AI/pytorch-lightning
Quick patch release
Fixes the missing `packaging` package in dependencies, which affected installation only on a very minimal (blank) system.
Standard weekly patch release
[1.2.8] - 2021-04-14
Added
- Added TPUSpawn + IterableDataset error message (#6875)
Fixed
- Fixed process rank not being available right away after `Trainer` instantiation (#6941)
- Fixed `sync_dist` for TPUs (#6950)
- Fixed `AttributeError` for `require_backward_grad_sync` when running manual optimization with sharded plugin (#6915)
- Fixed `--gpus` default for parser returned by `Trainer.add_argparse_args` (#6898) (see the argparse sketch after this list)
- Fixed TPU Spawn all gather (#6896)
- Fixed `EarlyStopping` logic when `min_epochs` or `min_steps` requirement is not met (#6705)
- Fixed csv extension check (#6436)
- Fixed checkpoint issue when using Horovod distributed backend (#6958)
- Fixed tensorboard exception raising (#6901)
- Fixed setting the eval/train flag correctly on accelerator model (#6983)
- Fixed DDP_SPAWN compatibility with bug_report_model.py (#6892)
- Fixed bug where `BaseFinetuning.flatten_modules()` was duplicating leaf node parameters (#6879)
- Set better defaults for `rank_zero_only.rank` when training is launched with SLURM and torchelastic
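For context on the `--gpus` fix, a minimal sketch of how a CLI is typically wired up through `Trainer.add_argparse_args` and `Trainer.from_argparse_args` (the script setup here is illustrative, not part of this release):

```python
import argparse

from pytorch_lightning import Trainer

parser = argparse.ArgumentParser(description="training script")
# add_argparse_args registers all Trainer constructor arguments
# (including --gpus) as command line flags.
parser = Trainer.add_argparse_args(parser)
args = parser.parse_args()

# Build the Trainer straight from the parsed namespace.
trainer = Trainer.from_argparse_args(args)
```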
Contributors
@ananthsub @awaelchli @ethanwharris @justusschock @kandluis @kaushikb11 @liob @SeanNaren @skmatz
If we forgot someone due to not matching commit email with GitHub account, let us know :]
Standard weekly patch release
[1.2.7] - 2021-04-06
Fixed
- Fixed a bug with omegaconf and `xm.save` (#6741)
- Fixed an issue with `IterableDataset` when `__len__` is not defined (#6828)
- Sanitize `None` params during pruning (#6836)
- Enforce an epoch scheduler interval when using SWA (#6588)
- Fixed TPU Colab hang issue, post training (#6816)
- Fixed a bug where `TensorBoardLogger` would give a warning and not log correctly to a symbolic link `save_dir` (#6730)
Contributors
@awaelchli, @ethanwharris, @karthikprasad, @kaushikb11, @mibaumgartner, @tchaton
If we forgot someone due to not matching commit email with GitHub account, let us know :]
Standard weekly patch release
[1.2.6] - 2021-03-30
Changed
- Changed the behavior of `on_epoch_start` to run at the beginning of validation & test epoch (#6498)
Removed
- Removed legacy code to include `step` dictionary returns in `callback_metrics`. Use `self.log_dict` instead. (#6682) (see the `log_dict` sketch below)
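As a reference for the migration away from `step` dictionary returns, a minimal sketch of logging several metrics with `self.log_dict` inside a `LightningModule` (the model and metric names are illustrative):

```python
import torch
import torch.nn.functional as F
from pytorch_lightning import LightningModule


class LitClassifier(LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(28 * 28, 10)

    def forward(self, x):
        return self.layer(x.view(x.size(0), -1))

    def training_step(self, batch, batch_idx):
        x, y = batch
        logits = self(x)
        loss = F.cross_entropy(logits, y)
        acc = (logits.argmax(dim=-1) == y).float().mean()
        # Log several metrics at once instead of returning a `step` dict.
        self.log_dict({"train_loss": loss, "train_acc": acc}, on_step=True, on_epoch=True)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)
```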
Fixed
- Fixed `DummyLogger.log_hyperparams` raising a `TypeError` when running with `fast_dev_run=True` (#6398)
- Fixed error on TPUs when there was no `ModelCheckpoint` (#6654)
- Fixed `trainer.test` freeze on TPUs (#6654)
- Fixed a bug where gradients were disabled after calling `Trainer.predict` (#6657)
- Fixed bug where no TPUs were detected in a TPU pod env (#6719)
Contributors
@awaelchli, @carmocca, @ethanwharris, @kaushikb11, @rohitgr7, @tchaton
If we forgot someone due to not matching commit email with GitHub account, let us know :]
Weekly patch release - torchmetrics compatibility
[1.2.5] - 2021-03-23
Changed
- Added Autocast in validation, test and predict modes for Native AMP (#6565)
- Update Gradient Clipping for the TPU Accelerator (#6576)
- Refactored setup for typing friendly (#6590)
Fixed
- Fixed a bug where `all_gather` would not work correctly with `tpu_cores=8` (#6587)
- Fixed comparing required versions (#6434)
- Fixed duplicate logs appearing in console when using the python logging module (#6275)
Contributors
@awaelchli, @Borda, @ethanwharris, @justusschock, @kaushikb11
If we forgot someone due to not matching commit email with GitHub account, let us know :]
Standard weekly patch release
[1.2.4] - 2021-03-16
Changed
- Changed the default of `find_unused_parameters` back to `True` in DDP and DDP Spawn (#6438) (see the `DDPPlugin` sketch after this list)
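With the default back to `True`, users who know their model uses every parameter each step can still opt into the faster behavior. A hedged sketch, assuming the `DDPPlugin` interface available in the 1.2.x plugins module:

```python
from pytorch_lightning import Trainer
from pytorch_lightning.plugins import DDPPlugin

# Explicitly opt out of unused-parameter detection for a speedup,
# if the model is known to use all of its parameters every step.
trainer = Trainer(
    gpus=2,
    accelerator="ddp",
    plugins=[DDPPlugin(find_unused_parameters=False)],
)
```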
Fixed
- Expose DeepSpeed loss parameters to allow users to fix loss instability (#6115)
- Fixed DP reduction with collection (#6324)
- Fixed an issue where the tuner would not tune the learning rate if also tuning the batch size (#4688)
- Fixed broadcast to use PyTorch `broadcast_object_list` and add `reduce_decision` (#6410)
- Fixed logger creating directory structure too early in DDP (#6380)
- Fixed DeepSpeed additional memory use on rank 0 when default device not set early enough (#6460)
- Fixed `DummyLogger.log_hyperparams` raising a `TypeError` when running with `fast_dev_run=True` (#6398)
- Fixed an issue with `Tuner.scale_batch_size` not finding the batch size attribute in the datamodule (#5968) (see the tuner sketch after this list)
- Fixed an exception in the layer summary when the model contains torch.jit scripted submodules (#6511)
- Fixed the train loop config being run during `Trainer.predict` (#6541)
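The `Tuner.scale_batch_size` fix concerns the batch-size finder. A minimal usage sketch, assuming a `LightningDataModule` that exposes a `batch_size` attribute; `LitClassifier` and `MNISTDataModule` are placeholders, not part of this release:

```python
from pytorch_lightning import Trainer

# Placeholders: the datamodule is assumed to define a `batch_size`
# attribute that the tuner can adjust.
model = LitClassifier()
datamodule = MNISTDataModule(batch_size=32)

# Run the batch size finder (binary search) before fitting.
trainer = Trainer(auto_scale_batch_size="binsearch")
trainer.tune(model, datamodule=datamodule)
trainer.fit(model, datamodule=datamodule)
```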
Contributors
@awaelchli, @kaushikb11, @Palzer, @SeanNaren, @tchaton
If we forgot someone due to not matching commit email with GitHub account, let us know :]
Standard weekly patch release
[1.2.3] - 2021-03-09
Fixed
- Fixed `ModelPruning(make_pruning_permanent=True)` pruning buffers getting removed when saved during training (#6073) (see the pruning sketch after this list)
- Fixed `_stable_1d_sort` to work when `n >= N` (#6177)
- Fixed `AttributeError` when `logger=None` on TPU (#6221)
- Fixed PyTorch Profiler with `emit_nvtx` (#6260)
- Fixed `trainer.test` from `best_path` hanging after calling `trainer.fit` (#6272)
- Fixed `SingleTPU` calling `all_gather` (#6296)
- Ensure we check DeepSpeed/Sharded in multi-node DDP (#6297)
- Check `LightningOptimizer` doesn't delete optimizer hooks (#6305)
- Resolve memory leak for evaluation (#6326)
- Ensure that gradient clipping is only applied if the clip value is greater than 0 (#6330)
- Fixed `Trainer` not resetting `lightning_optimizers` when calling `Trainer.fit()` multiple times (#6372)
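For context on the first entry, the pruning callback is attached via `Trainer(callbacks=...)`. A minimal sketch, assuming the `ModelPruning` callback signature from the 1.2 series; the pruning function and amount are illustrative choices:

```python
from pytorch_lightning import Trainer
from pytorch_lightning.callbacks import ModelPruning

# Prune 50% of the weights with L1 unstructured pruning and make the result
# permanent, so checkpoints saved during training carry the pruned weights.
pruning = ModelPruning(
    pruning_fn="l1_unstructured",
    amount=0.5,
    make_pruning_permanent=True,
)
trainer = Trainer(callbacks=[pruning])
```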
Contributors
@awaelchli, @carmocca, @chizuchizu, @frankier, @SeanNaren, @tchaton
If we forgot someone due to not matching commit email with GitHub account, let us know :]
Standard weekly patch release
[1.2.2] - 2021-03-02
Added
- Added `checkpoint` parameter to callback's `on_save_checkpoint` hook (#6072)
Changed
- Changed the order of `backward`, `step`, `zero_grad` to `zero_grad`, `backward`, `step` (#6147) (see the loop-order sketch after this list)
- Changed default for DeepSpeed CPU Offload to False, due to prohibitively slow speeds at smaller scale (#6262)
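A plain PyTorch sketch of what the reordering means for a single optimization step (the model, data, and loss below are illustrative, not Lightning internals):

```python
import torch

model = torch.nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
batch = torch.randn(8, 4)

# One step in the new order: zero_grad -> backward -> step.
optimizer.zero_grad()                # clear gradients from the previous step
loss = model(batch).pow(2).mean()    # dummy loss for illustration
loss.backward()                      # accumulate fresh gradients
optimizer.step()                     # apply the parameter update
```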
Fixed
- Fixed epoch-level schedulers not being called when `val_check_interval < 1.0` (#6075)
- Fixed multiple early stopping callbacks (#6197)
- Fixed incorrect usage of `detach()`, `cpu()`, `to()` (#6216)
- Fixed LBFGS optimizer support which didn't converge in automatic optimization (#6147)
- Prevent `WandbLogger` from dropping values (#5931)
- Fixed error thrown when using a valid distributed mode in multi-node (#6297)
Contributors
@akihironitta, @borisdayma, @carmocca, @dvolgyes, @SeanNaren, @SkafteNicki
If we forgot someone due to not matching commit email with GitHub account, let us know :]
Standard weekly patch release
[1.2.1] - 2021-02-23
Fixed
- Fixed incorrect yield logic for the amp autocast context manager (#6080)
- Fixed priority of plugin/accelerator when setting distributed mode (#6089)
- Fixed error message for AMP + CPU incompatibility (#6107)
Contributors
@awaelchli, @SeanNaren, @carmocca
If we forgot someone due to not matching commit email with GitHub account, let us know :]