I found that the DPO, PPO, and RM models all inherit from MegatronGPTModel but override the get_forward_output_and_loss_func method without the context-parallel adaptation. Could you please add context-parallel support here as well? It is efficient for long-context training.
NeMo supports context parallelism natively, via the adaptation in the [MegatronGPTModel](https://github.com/NVIDIA/NeMo/blob/96187eac848ebf02c56e9fc658a57a500a56a842/nemo/collections/nlp/models/language_modeling/megatron_gpt_model.py#L1039) get_forward_output_and_loss_func method. The DPO, PPO, and RM models override this method without that adaptation, so they do not benefit from it.
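For reference, the core of the context-parallel adaptation is that each rank processes only a slice of the sequence dimension. Below is a minimal, self-contained sketch of the load-balanced slicing scheme commonly used with causal attention (the sequence is split into `2 * cp_size` chunks, and rank `r` keeps chunks `r` and `2*cp_size - 1 - r`, so early and late positions are paired and every rank gets similar attention cost). The function name and signature here are hypothetical illustrations, not NeMo's actual API:

```python
def get_batch_on_this_cp_rank(seq, cp_size, cp_rank):
    """Return the slice of `seq` that one context-parallel rank processes.

    Hypothetical helper for illustration only. Splits the sequence into
    2 * cp_size equal chunks and pairs chunk r with its mirror chunk
    (2 * cp_size - 1 - r), balancing causal-attention work across ranks.
    Assumes len(seq) is divisible by 2 * cp_size.
    """
    num_chunks = 2 * cp_size
    chunk_len = len(seq) // num_chunks
    chunks = [seq[i * chunk_len:(i + 1) * chunk_len] for i in range(num_chunks)]
    return chunks[cp_rank] + chunks[num_chunks - 1 - cp_rank]


# Example: a length-8 sequence split across cp_size=2 ranks.
tokens = list(range(8))
rank0 = get_batch_on_this_cp_rank(tokens, cp_size=2, cp_rank=0)  # [0, 1, 6, 7]
rank1 = get_batch_on_this_cp_rank(tokens, cp_size=2, cp_rank=1)  # [2, 3, 4, 5]
```

A get_forward_output_and_loss_func override that wants context-parallel support would apply this kind of slicing to the input batch (and gather/reduce the loss across CP ranks) before running the forward pass, which is what the MegatronGPTModel version does and the aligner models currently skip.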