I found that the DPO, PPO, and RM models all inherit from MegatronGPTModel but override the get_forward_output_and_loss_func method without the context-parallel adaptation. Could you please add context-parallel support here as well? It is efficient for long-context training.
NeMo supports context parallelism natively, via the adaptation in the [MegatronGPTModel](https://github.com/NVIDIA/NeMo/blob/96187eac848ebf02c56e9fc658a57a500a56a842/nemo/collections/nlp/models/language_modeling/megatron_gpt_model.py#L1039) get_forward_output_and_loss_func method. The DPO, PPO, and RM models override this method without that adaptation, so they do not benefit from it.
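For reference, the core of the context-parallel adaptation is that each rank processes only a slice of the sequence dimension. Below is a minimal, self-contained sketch of the load-balanced slicing scheme commonly used with causal attention (the sequence is split into `2 * cp_size` chunks, and rank `r` keeps chunks `r` and `2*cp_size - 1 - r`, so early and late positions are paired and every rank gets similar attention cost). The function name and signature here are hypothetical illustrations, not NeMo's actual API:

```python
def get_batch_on_this_cp_rank(seq, cp_size, cp_rank):
    """Return the slice of `seq` that one context-parallel rank processes.

    Hypothetical helper for illustration only. Splits the sequence into
    2 * cp_size equal chunks and pairs chunk r with its mirror chunk
    (2 * cp_size - 1 - r), balancing causal-attention work across ranks.
    Assumes len(seq) is divisible by 2 * cp_size.
    """
    num_chunks = 2 * cp_size
    chunk_len = len(seq) // num_chunks
    chunks = [seq[i * chunk_len:(i + 1) * chunk_len] for i in range(num_chunks)]
    return chunks[cp_rank] + chunks[num_chunks - 1 - cp_rank]


# Example: a length-8 sequence split across cp_size=2 ranks.
tokens = list(range(8))
rank0 = get_batch_on_this_cp_rank(tokens, cp_size=2, cp_rank=0)  # [0, 1, 6, 7]
rank1 = get_batch_on_this_cp_rank(tokens, cp_size=2, cp_rank=1)  # [2, 3, 4, 5]
```

A get_forward_output_and_loss_func override that wants context-parallel support would apply this kind of slicing to the input batch (and gather/reduce the loss across CP ranks) before running the forward pass, which is what the MegatronGPTModel version does and the aligner models currently skip.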