v0.3.0 Notebook launcher and multi-node training
Notebook launcher
After doing all the data preprocessing in your notebook, you can launch your training loop using the new notebook_launcher functionality. This is especially useful for Colab or Kaggle notebooks with TPUs (don't forget to select a TPU runtime on Colab). The launcher also works if you have multiple GPUs on your machine: just pass num_processes=your_number_of_gpus in the call to notebook_launcher.
- Notebook launcher #44 (@sgugger)
- Add notebook/colab example #52 (@sgugger)
- Support for multi-GPU in notebook_launcher #56 (@sgugger)
Multi-node training
Our multi-node training test setup was flawed, so previous releases of 🤗 Accelerate did not work for multi-node distributed training. This is now fixed, and we have added more robust tests to keep it that way!
- fix cluster.py indent error #35 (@JTT94)
- Set all defaults from config in launcher #38 (@sgugger)
- Fix port in config creation #50 (@sgugger)
Various bug fixes
- Fix typos in examples README #28 (@arjunchandra)
- Fix load from config #31 (@sgugger)
- docs: minor spelling tweaks #33 (@brettkoonce)
- Add set_to_none to AcceleratedOptimizer.zero_grad #43 (@sgugger)
- fix #53 #54 (@Guitaricet)
- update launch.py #58 (@Jesse1eung)
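As a sketch of what the new set_to_none option in zero_grad does, illustrated with a plain PyTorch optimizer for simplicity (AcceleratedOptimizer.zero_grad forwards the flag to the optimizer it wraps):

```python
import torch

# With set_to_none=True, zero_grad replaces each parameter's .grad tensor
# with None instead of filling it with zeros, skipping a memset and
# letting the next backward pass allocate fresh gradient tensors.
model = torch.nn.Linear(2, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1)

loss = model(torch.randn(4, 2)).sum()
loss.backward()                    # .grad is now populated
opt.zero_grad(set_to_none=True)    # .grad is now None, not a zero tensor

assert all(p.grad is None for p in model.parameters())
```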