How to use Tutel on Megatron-DeepSpeed #207
Do you mean Megatron and DeepSpeed separately, or both of them working together?
@ghostplant Can Tutel work with either Megatron or DeepSpeed individually?
Yes. Tutel is just an MoE layer implementation that is pluggable into any distributed framework. The way for another framework to use the Tutel MoE layer is to pass its distributed process group properly, e.g.:

    my_processing_group = deepspeed.new_group(..)
    moe_layer = tutel_moe.moe_layer(
        ..,
        group=my_processing_group
    )

If no other framework is available, Tutel itself also provides a one-line initialization to generate the groups you need, which works for both distributed GPU (i.e. nccl) and distributed CPU (i.e. gloo):

    from tutel import system
    parallel_env = system.init_data_model_parallel(backend='nccl' if args.device == 'cuda' else 'gloo')
    # pick whichever group matches your parallelism setup:
    my_processing_group = parallel_env.data_group   # or parallel_env.model_group / parallel_env.global_group
    ...
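For reference, here is a minimal end-to-end sketch that combines the two pieces above: Tutel's one-line group initialization plus a `moe_layer` built on that group. The argument names (`gate_type`, the `experts` dictionary keys, `parallel_env.local_device`) follow Tutel's README and helloworld examples, and the sizes are placeholders; treat this as an assumption-laden sketch rather than the exact API of every Tutel version.

    # Sketch: a Tutel MoE layer wired to a process group (run under torchrun).
    # Keyword arguments assumed from Tutel's README example; adjust to the
    # Tutel version you have installed.
    import torch
    from tutel import system
    from tutel import moe as tutel_moe

    # Tutel's own one-line initialization; substitute a DeepSpeed/Megatron
    # group here instead if you are embedding the layer in those frameworks.
    parallel_env = system.init_data_model_parallel(
        backend='nccl' if torch.cuda.is_available() else 'gloo')
    device = parallel_env.local_device              # device chosen by the helper (assumed attribute)
    my_processing_group = parallel_env.data_group   # or model_group / global_group

    model_dim, hidden_size, num_local_experts = 1024, 4096, 2   # placeholder sizes

    moe_layer = tutel_moe.moe_layer(
        gate_type={'type': 'top', 'k': 2},          # top-2 gating
        model_dim=model_dim,
        experts={
            'type': 'ffn',
            'count_per_node': num_local_experts,
            'hidden_size_per_expert': hidden_size,
            'activation_fn': lambda x: torch.nn.functional.relu(x),
        },
        group=my_processing_group,                  # the process group discussed above
    ).to(device)

    x = torch.randn(8, 512, model_dim, device=device)   # (batch, seq, model_dim)
    y = moe_layer(x)                                     # output has the same shape as x

When embedding the layer inside Megatron or DeepSpeed, the main Tutel-specific choice is which `group` you pass; expert parameters also need to be excluded from the framework's data-parallel all-reduce (Tutel's examples mark them via a `scan_expert_func` callback).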
Thanks for your prompt response!
Can Tutel be used with Megatron-DeepSpeed?