Right now, we set the `multiprocessing_context` for the Trainer based on the `num_workers` used for the data loader:

https://github.com/drivendataorg/zamba/blob/master/zamba/pytorch_lightning/utils.py#L67-L71
https://github.com/drivendataorg/zamba/blob/master/zamba/models/model_manager.py#L283-L286
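For context, a rough, hypothetical sketch of the kind of coupling described above (this paraphrases the idea, not the actual code behind those links; the function names are made up): a single `num_workers` value drives both the DataLoader arguments and the Trainer's multiprocessing/strategy choice.

```python
# Illustrative only -- not the actual zamba code at the links above.
# One num_workers value controls both data loading and the Trainer's
# multiprocessing/strategy setup, which is the coupling at issue.
# Assumes PTL 2.x-style Trainer arguments.
from torch.utils.data import DataLoader
import pytorch_lightning as pl


def build_dataloader(dataset, num_workers: int) -> DataLoader:
    return DataLoader(
        dataset,
        num_workers=num_workers,
        # persistent workers only make sense when there are workers
        persistent_workers=num_workers > 0,
        # multiprocessing context inferred from num_workers
        multiprocessing_context="forkserver" if num_workers > 0 else None,
    )


def build_trainer(num_workers: int) -> pl.Trainer:
    # the Trainer's strategy is also inferred from num_workers, even on a single GPU
    strategy = "ddp" if num_workers > 0 else "auto"
    return pl.Trainer(strategy=strategy)
```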
It would be good to separate those out for a couple reasons:
- it lets us use multiple cores for data loading without needing to set a multiprocessing strategy for the trainer when only running on a single GPU
- we've only trained models on a single GPU, so it's not clear that multiprocessing for the model is fully and properly configured
- PyTorch Lightning is currently making a lot of changes to its accelerators and strategies for distributed training, so it would be nice to let those settle a bit before supporting multi-GPU training in zamba
Implementation thoughts:
- do not infer the multiprocessing context from `num_workers` (only use `num_workers` for the dataloaders and to determine `persistent_workers`)
- consider adding a multiprocessing strategy to the train config object with the PTL default; another option is to make this a boolean and let zamba determine the best strategy/accelerator combo (a sketch of this separation is below)
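A minimal sketch of what that separation could look like, assuming a hypothetical `strategy` field on the train config that defaults to the PTL default (field and function names are illustrative, not an existing zamba API):

```python
# Hypothetical sketch of the proposed separation -- names are illustrative,
# not an existing zamba API. Assumes PTL 2.x-style Trainer arguments.
from dataclasses import dataclass

from torch.utils.data import DataLoader
import pytorch_lightning as pl


@dataclass
class TrainConfig:
    num_workers: int = 3     # affects data loading only
    strategy: str = "auto"   # Trainer strategy, kept at the PTL default unless overridden


def build_dataloader(dataset, config: TrainConfig) -> DataLoader:
    # num_workers is only used for the dataloaders and persistent_workers
    return DataLoader(
        dataset,
        num_workers=config.num_workers,
        persistent_workers=config.num_workers > 0,
    )


def build_trainer(config: TrainConfig) -> pl.Trainer:
    # the strategy comes from its own config field, independent of num_workers
    return pl.Trainer(strategy=config.strategy)
```

With that shape, setting `num_workers=8` on a single GPU no longer implies any distributed strategy, and multi-GPU support can be layered in later by exposing the strategy (or a boolean flag that zamba maps to a strategy/accelerator combo).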
Hey, @sambujangfofana and I are students from the University of Michigan. We are currently working on a project in which we have to contribute to a GitHub repository (https://eecs481.org/hw6.html). We are interested in this issue and would like to work on it. We hope to submit a pull request this week. Could we be assigned this issue?