Add Llama 3.1 support (updated) #92
base: main
Conversation
Add Llama 3.1 in test_decode.py. Set generation_config._eos_token_tensor to None.
Add Llama 3.1 support (updated)
```diff
@@ -118,6 +118,7 @@ def create(
         )
         generation_config.max_length = max_seq_length

+        generation_config._eos_token_tensor = None
```
As said here, could you replace this with the following snippet?

```diff
- generation_config._eos_token_tensor = None
+ generation_config._eos_token_tensor = getattr(generation_config, "_eos_token_tensor", None)
```
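For context, a minimal sketch of why the `getattr` form is safer (assuming transformers' `GenerationConfig`; whether the private `_eos_token_tensor` attribute exists depends on the transformers version). Hard-coding `None` would clobber a value that newer versions may already have set, while the `getattr` fallback only fills the attribute in when it is missing:

```python
from transformers import GenerationConfig

generation_config = GenerationConfig()

# On transformers versions that do not define the private attribute,
# getattr falls back to None and the assignment simply creates it.
# On versions that already set _eos_token_tensor, the existing value
# is preserved instead of being overwritten with None.
generation_config._eos_token_tensor = getattr(
    generation_config, "_eos_token_tensor", None
)
```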
@tengomucho The review above is from the old PR; it has already been addressed in 10f857c. I have replaced the snippet here.
sorry, I somehow missed it, it's fine then!
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Add workaround for eos_token_tensor in jetstream_pt_support
What does this PR do?
This PR is an updated version of a previously closed PR. It includes the recent updates made in the main branch.
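As a rough illustration of where the workaround lands (a hypothetical sketch based on the diff above, not the PR's actual code; the model id and `max_seq_length` value are placeholders), the assignment goes right after the generation config is prepared in `create()`:

```python
# Hypothetical sketch; identifiers other than generation_config and
# max_seq_length are illustrative placeholders, not the PR's code.
from transformers import GenerationConfig

model_id = "meta-llama/Llama-3.1-8B"  # placeholder model id
max_seq_length = 256                  # placeholder value

generation_config = GenerationConfig.from_pretrained(model_id)
generation_config.max_length = max_seq_length

# Workaround from this PR: ensure _eos_token_tensor exists without
# discarding a tensor that newer transformers versions may have set.
generation_config._eos_token_tensor = getattr(
    generation_config, "_eos_token_tensor", None
)
```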