apply z-loss and aux-loss only during training #64

Merged
merged 1 commit into from
Nov 12, 2024

Conversation

xffxff
Collaborator

@xffxff xffxff commented Nov 12, 2024

image
After profiling, I found that the routing logic consumes a significant amount of time, primarily due to apply_z_loss and apply_aux_loss. Both only contribute auxiliary training losses and are unnecessary during inference, so we can skip them in the inference phase.
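
To illustrate the idea, here is a minimal pure-Python sketch of a router that computes its auxiliary losses only while a training flag is set. It is not the code changed in this PR: the ToyRouter class, its `training` attribute, the `last_losses` dict, the coefficients, and the loss formulas (a squared log-sum-exp z-loss and a Switch-style load-balancing loss) are all assumptions chosen for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one token's expert logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logsumexp(logits):
    """Numerically stable log(sum(exp(x))) over one token's expert logits."""
    m = max(logits)
    return m + math.log(sum(math.exp(x - m) for x in logits))


class ToyRouter:
    """Hypothetical top-k router; names and formulas are illustrative only."""

    def __init__(self, num_experts, top_k, z_loss_coeff=1e-3, aux_loss_coeff=1e-2):
        self.num_experts = num_experts
        self.top_k = top_k
        self.z_loss_coeff = z_loss_coeff
        self.aux_loss_coeff = aux_loss_coeff
        self.training = True   # assumed flag, mirroring a framework's train/eval mode
        self.last_losses = {}  # auxiliary losses from the most recent call

    def _z_loss(self, token_logits):
        # Mean squared log-sum-exp of the logits (assumed z-loss formulation).
        values = [logsumexp(row) ** 2 for row in token_logits]
        return self.z_loss_coeff * sum(values) / len(values)

    def _aux_loss(self, probs, top_experts):
        # Load-balancing loss: num_experts * sum_e(fraction_routed_e * mean_prob_e).
        num_tokens = len(probs)
        loss = 0.0
        for e in range(self.num_experts):
            routed = sum(1 for chosen in top_experts if e in chosen)
            fraction = routed / (num_tokens * self.top_k)
            mean_prob = sum(row[e] for row in probs) / num_tokens
            loss += fraction * mean_prob
        return self.aux_loss_coeff * self.num_experts * loss

    def route(self, token_logits):
        """token_logits: list of per-token lists of expert logits."""
        probs = [softmax(row) for row in token_logits]
        top_experts = [
            sorted(range(self.num_experts), key=lambda e: row[e], reverse=True)[: self.top_k]
            for row in probs
        ]
        self.last_losses = {}
        if self.training:
            # The losses only feed the training objective, so skip them at inference.
            self.last_losses["z_loss"] = self._z_loss(token_logits)
            self.last_losses["aux_loss"] = self._aux_loss(probs, top_experts)
        return top_experts, probs
```

In this sketch the expert selection is identical in both modes; setting `router.training = False` only removes the extra loss computations from the routing path.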

@xffxff xffxff merged commit faf9edf into main Nov 12, 2024
1 check passed
@xffxff xffxff deleted the eliminate_z_loss_in_inference branch November 12, 2024 02:45