MMLU example #50

Open: wants to merge 18 commits into base: main


carlofisicaro

Adds an inference example on the MMLU dataset. Tested only on google/gemma-2b-flax.
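For readers unfamiliar with MMLU-style evaluation, here is a minimal sketch of how a multiple-choice item might be turned into a prompt and scored. It is not the code from this pull request: the helper names (`format_mmlu_prompt`, `extract_choice`, `accuracy`), the prompt layout, and the sample item are all hypothetical, and the model call itself is left out.

```python
# Hypothetical helpers for MMLU-style multiple-choice evaluation.
# None of these names come from the pull request; they only illustrate the idea.
CHOICE_LETTERS = "ABCD"


def format_mmlu_prompt(question, choices):
    """Render a question and its four options as a single prompt string."""
    lines = [question.strip()]
    for letter, choice in zip(CHOICE_LETTERS, choices):
        lines.append(f"{letter}. {choice}")
    lines.append("Answer:")
    return "\n".join(lines)


def extract_choice(completion):
    """Return the answer letter if the completion starts with one, else None."""
    text = completion.strip()
    if text and text[0].upper() in CHOICE_LETTERS:
        return text[0].upper()
    return None


def accuracy(predictions, answers):
    """Fraction of predictions that match the reference answers."""
    correct = sum(p == a for p, a in zip(predictions, answers))
    return correct / len(answers)


# Made-up item, used only to show the prompt layout.
prompt = format_mmlu_prompt(
    "Which planet is closest to the Sun?",
    ["Venus", "Mercury", "Earth", "Mars"],
)
print(prompt)
```

In a real run, each prompt would be passed to the model's sampler and the generated text handed to `extract_choice` before computing `accuracy` over the dataset.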

carlofisicaro and others added 16 commits September 18, 2024 14:18
PiperOrigin-RevId: 663277444
Change-Id: I8d7030ce586577a433c48f32df7efa7c141b171a
…ormer_lib.make_causal_attn_mask(input_mask)`

PiperOrigin-RevId: 663692225
Change-Id: Ie2cb6229302087ea1ce5b5c7f442a088207ead07
PiperOrigin-RevId: 665414923
Change-Id: I42bc41074518e3065f85c7f1a3014fdd09cffe4c
Currently, all weights in FeedForward layers are initialized to zero. This doesn't cause any issues when loading the module with pretrained weights, but when training from scratch it results in all gradients being zero throughout training, so no learning can occur. Initializing w_gating from a normal distribution fixes this.

PiperOrigin-RevId: 674306730
Change-Id: I90800dbe605cdf88f341d103f102357ff278a393
PiperOrigin-RevId: 674394389
Change-Id: I25ba5ad4769c3101c2bf572e33723d4a241e3895
…se errors for implicit rank promotion.

PiperOrigin-RevId: 675179053
Change-Id: I55459c1aa99c7d33ae3f03712eaed01ccc5fc9f2
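The FeedForward initialization commit above can be illustrated with a small JAX sketch. It assumes a gated feed-forward layer of the form `(gelu(x @ w_gating[0]) * (x @ w_gating[1])) @ w_linear`, written here in plain JAX rather than as the repository's module, and it assumes `w_linear` stays zero-initialized after the fix; both are assumptions, since the commit message does not show the layer's code. The function names and shapes are hypothetical.

```python
import jax
import jax.numpy as jnp


def feed_forward(params, x):
    # Assumed gated feed-forward structure; not copied from the repository.
    w_gating, w_linear = params["w_gating"], params["w_linear"]
    gate = jax.nn.gelu(x @ w_gating[0])
    up = x @ w_gating[1]
    return (gate * up) @ w_linear


def loss_fn(params, x, y):
    return jnp.mean((feed_forward(params, x) - y) ** 2)


def max_abs_grad(params, x, y):
    """Largest absolute gradient entry for each parameter."""
    grads = jax.grad(loss_fn)(params, x, y)
    return {name: float(jnp.max(jnp.abs(g))) for name, g in grads.items()}


features, hidden = 4, 8
k_x, k_y, k_w = jax.random.split(jax.random.PRNGKey(0), 3)
x = jax.random.normal(k_x, (3, features))
y = jax.random.normal(k_y, (3, features))

# All-zero initialization: every gradient is exactly zero.
zero_params = {
    "w_gating": jnp.zeros((2, features, hidden)),
    "w_linear": jnp.zeros((hidden, features)),
}
# Normal w_gating, zero w_linear (assumed): w_linear now receives gradient.
normal_params = {
    "w_gating": jax.random.normal(k_w, (2, features, hidden)),
    "w_linear": jnp.zeros((hidden, features)),
}

zero_grads = max_abs_grad(zero_params, x, y)
normal_grads = max_abs_grad(normal_params, x, y)
print("zero init:  ", zero_grads)
print("normal init:", normal_grads)
```

With all-zero weights, the hidden activations are zero, so the gradient for `w_linear` vanishes, and because `w_linear` is zero, nothing flows back to `w_gating` either. Once `w_gating` is random, the activations are nonzero and `w_linear` gets a nonzero gradient on the first step, after which `w_gating` can start receiving gradient as well.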

google-cla bot commented Sep 21, 2024

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@carlofisicaro
Author

The GitHub CLA check doesn't recognize the noreply user @a-googler <no****ly​@google.com>.

How shall I proceed? Should I use an interactive rebase to edit the author of the related commits?

2 participants