
Added a nb demonstrating efficient gpu utilizations #44

Merged
merged 1 commit into redhat-et:master from compressed-models on Mar 22, 2024

Conversation

suppathak
Collaborator

Related #36

Exploring:
What happens when we load models in a lower-precision format like INT8? How are accuracy, CPU usage, and memory performance affected? Explain it theoretically and show results in a notebook. Touch on the challenges of frameworks like bitsandbytes in production.
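For context, a rough sketch of the kind of comparison being explored, assuming the transformers + bitsandbytes integration; the model name and the load_in_8bit path are illustrative assumptions, not code from this PR:

```python
# Illustrative sketch: load the same model in FP16 and in INT8 (via bitsandbytes)
# and compare memory footprints. The model name is a placeholder; the 8-bit path
# requires a CUDA GPU and the bitsandbytes package.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/opt-1.3b"  # hypothetical example model

tokenizer = AutoTokenizer.from_pretrained(model_name)

# Half-precision baseline
model_fp16 = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16, device_map="auto"
)

# 8-bit quantized load through bitsandbytes
model_int8 = AutoModelForCausalLM.from_pretrained(
    model_name, load_in_8bit=True, device_map="auto"
)

fp16_mb = model_fp16.get_memory_footprint() / 1e6
int8_mb = model_int8.get_memory_footprint() / 1e6
print(f"FP16: {fp16_mb:.0f} MB, INT8: {int8_mb:.0f} MB "
      f"({(int8_mb - fp16_mb) / fp16_mb * 100:+.1f}%)")
```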

@review-notebook-app

Check out this pull request on ReviewNB.

@suppathak suppathak changed the title Added a nb demonstrating efficient gpu utilizations [WIP] Added a nb demonstrating efficient gpu utilizations May 5, 2023
@Shreyanand
Member

@suppathak What is pending in this PR?

@suppathak
Collaborator Author

@suppathak What is pending in this PR?

Thanks @Shreyanand for reminding me. I will clean it up a bit and conclude it. I will add other topics related to model compression in separate notebooks.

@suppathak suppathak changed the title [WIP] Added a nb demonstrating efficient gpu utilizations Added a nb demonstrating efficient gpu utilizations Jul 3, 2023
@suppathak suppathak force-pushed the compressed-models branch from 48f0e65 to 672f677 on July 3, 2023 17:47
@suppathak suppathak requested a review from Shreyanand July 3, 2023 17:47
@Shreyanand Shreyanand (Member) commented on the notebook via ReviewNB, Jul 4, 2023

source?

Reply via ReviewNB

@Shreyanand Shreyanand (Member) commented on the notebook via ReviewNB, Jul 4, 2023

Do you mean with quantization?

Can you provide the percentage change in memory and inference time? It is more intuitive.

For the inference time computation, can you run the cell 10 times and average the time? That makes the result more robust.

Reply via ReviewNB
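A minimal sketch of the suggested measurement (an assumption, not code from the notebook): run the inference call several times, average the wall-clock time, and report the percentage change against the full-precision baseline. The model objects and inputs in the usage comment are hypothetical placeholders.

```python
# Sketch of averaging inference latency over repeated runs, as suggested above.
import time

def avg_latency(run_inference, n_runs=10):
    """Average wall-clock seconds of run_inference() over n_runs calls."""
    total = 0.0
    for _ in range(n_runs):
        start = time.perf_counter()
        run_inference()
        total += time.perf_counter() - start
    return total / n_runs

# Hypothetical usage with an FP16 baseline and an INT8 model:
# t_fp16 = avg_latency(lambda: model_fp16.generate(**inputs, max_new_tokens=50))
# t_int8 = avg_latency(lambda: model_int8.generate(**inputs, max_new_tokens=50))
# print(f"Inference time change: {(t_int8 - t_fp16) / t_fp16 * 100:+.1f}%")
```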

@Shreyanand Shreyanand (Member) left a comment

Requested some minor changes, mainly around adding the source of the images and improving the results.
Also, add this notebook to the resource section in the README.

@suppathak suppathak requested a review from Shreyanand August 8, 2023 13:34
@Shreyanand Shreyanand merged commit 48db5dc into redhat-et:master Mar 22, 2024