When running set_finetuning_example.ipynb on a MacBook on the mps backend I seem to be hitting an OOM, is this expected? #67
Comments
Have you tried reducing the batch size to reduce the memory footprint?
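One common way to shrink the per-step memory footprint without changing the training dynamics is to cut the per-device batch size and compensate with gradient accumulation. A minimal sketch (the concrete numbers are illustrative, not the notebook's actual settings; `per_device_train_batch_size`, `gradient_accumulation_steps`, and `max_seq_length` are the usual `transformers`/`trl` config names):

```python
def effective_batch_size(per_device_batch: int, grad_accum_steps: int) -> int:
    """Number of samples contributing to each optimizer step."""
    return per_device_batch * grad_accum_steps

# Hypothetical replacement config: instead of a single batch of 8,
# process 2 samples at a time and accumulate gradients over 4 steps.
config_kwargs = dict(
    per_device_train_batch_size=2,   # was e.g. 8
    gradient_accumulation_steps=4,   # 2 * 4 = same effective batch of 8
    max_seq_length=512,              # shorter sequences also cut activation memory
)

# The optimizer still sees the same effective batch size:
assert effective_batch_size(2, 4) == effective_batch_size(8, 1)
```

Activation memory scales roughly linearly with batch size and sequence length, so both knobs help; only peak memory changes, not the gradient the optimizer ultimately applies.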
Hello, yes, it does reduce the overall memory footprint. I reduced the batch size and the sequence length and got it to run for around 1000 iterations. However, the total memory isn't really what I was wondering about; it's more that the total memory usage keeps increasing, almost as though there's a memory leak in the trainer when it saves the batch statistics. Actually, I guess it's less a memory leak and more that, since Apple uses unified memory, there is effectively less memory available than on a typical Nvidia GPU setup, where the system RAM is separate and can hold those values. Hence why I wanted to know whether this is expected behaviour on Macs with this particular model.
Thanks, I understand the question now. Your intuition makes sense. If you want to discuss the topic further, you could join the Discord; there are also MLX pros over there. smol-course Discord channel: https://discord.com/channels/879548962464493619/1313889336907010110
Oh hey, thank you for the link, I will check it out. Also, thanks for the responses!
I ran into a similar issue during fine-tuning:
Is it expected for the notebook to use up to 30 GB of the shared memory? I checked my Activity Monitor and the memory usage hits 30 GB before it crashes. I just wanted to know whether this is expected for this particular notebook, since it seems to be a small model ("HuggingFaceTB/SmolLM2-135M") and a small dataset ("HuggingFaceTB/smoltalk").
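One way to tell a leak-like climb apart from a one-time plateau is to log the process's peak resident memory every few steps and watch its trend during training. A minimal stdlib sketch (the helper names and logging interval are my own; `ru_maxrss` is only available on Unix-like systems, including macOS):

```python
import resource
import sys

def peak_rss_mb() -> float:
    """Peak resident set size of this process in MiB.

    ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    """
    rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    if sys.platform == "darwin":
        return rss / (1024 * 1024)
    return rss / 1024

def log_memory(step: int, every: int = 100) -> None:
    """Print peak memory every `every` steps, e.g. from a training callback."""
    if step % every == 0:
        print(f"step {step}: peak RSS ~ {peak_rss_mb():.0f} MiB")
```

If the logged peak keeps growing linearly with the step count rather than leveling off after the first few batches, that points at something being retained per step (cached batch statistics, logged tensors that still hold graph references, etc.) rather than at the model's steady-state footprint.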