
When running set_finetuning_example.ipynb on a MacBook on the mps backend I seem to be hitting an OOM, is this expected? #67

Open
Rajjeshwar opened this issue Dec 8, 2024 · 5 comments


@Rajjeshwar

Is it expected for the notebook to use up to 30 GB of shared memory? I checked my Activity Monitor and the memory usage hits 30 GB before it crashes. I just wanted to know whether this is expected for this particular notebook, since it uses a small model ("HuggingFaceTB/SmolLM2-135M") and a small dataset ("HuggingFaceTB/smoltalk").

@burtenshaw
Collaborator

Have you tried reducing the batch size to shrink the memory footprint?
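Reducing the per-device batch size while compensating with gradient accumulation keeps the effective batch size (and thus training dynamics) roughly the same while cutting peak memory per step. A minimal sketch of the arithmetic; the argument names mirror transformers' `TrainingArguments`, and the specific values are assumptions, not the notebook's actual settings:

```python
# Sketch: trade per-step memory for more accumulation steps while
# keeping the number of samples per optimizer step constant.
# per_device_train_batch_size and gradient_accumulation_steps are the
# corresponding TrainingArguments names; values here are illustrative.

def effective_batch_size(per_device_train_batch_size: int,
                         gradient_accumulation_steps: int) -> int:
    """Samples contributing to each optimizer step."""
    return per_device_train_batch_size * gradient_accumulation_steps

# Hypothetical original settings: batch of 8, no accumulation.
assert effective_batch_size(8, 1) == 8
# Memory-friendly settings: batch of 1, accumulate 8 micro-batches --
# same effective batch size, roughly 1/8 the activation memory per step.
assert effective_batch_size(1, 8) == 8
```

Lowering `max_seq_length` helps too, since activation memory for attention grows with sequence length.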

@Rajjeshwar
Author

Hello, yes, it does reduce the overall memory footprint. I reduced the batch size and the sequence length to get it to run for around 1000 iterations. However, the total memory is not really what I was wondering about; it's more that the memory usage keeps increasing, almost as though there is a memory leak in the trainer when it saves the batch statistics.

In fact, I guess it's less a memory leak and more the fact that, since Apple uses unified memory, there is effectively less memory available than in a typical NVIDIA GPU setup, where system RAM is separate and can hold these values. Hence why I wanted to know whether this is expected behaviour on Macs with this particular model.

@burtenshaw
Collaborator

Thanks, I understand the question now. Your intuition makes sense: the Trainer is going to use memory, and due to the architecture of Apple silicon, this is not as easy to see as it is on discrete devices.

If you want to discuss the topic further, you could join the Discord. There are also MLX pros over there:

smol-course discord channel: https://discord.com/channels/879548962464493619/1313889336907010110
HF invite link: https://discord.gg/hugging-face-879548962464493619

@Rajjeshwar
Author

Oh hey, thank you for the link, I will check it out. Also, thanks for the responses!

@edfenergy-yuhang

I ran into a similar issue during fine-tuning:
RuntimeError: MPS backend out of memory (MPS allocated: 6.80 GB, other allocations: 11.25 GB, max allowed: 18.13 GB).
My machine is a 16 GB MacBook.
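The "max allowed" figure in that error comes from a cap PyTorch places on MPS allocations, controlled by the `PYTORCH_MPS_HIGH_WATERMARK_RATIO` environment variable; setting it to `0.0` disables the cap. This is a sketch, not a fix: with the limit removed, macOS will swap instead of raising an error, so shrinking the batch size and sequence length first is still the safer route.

```shell
# Disable PyTorch's MPS allocation cap before launching the notebook
# or training script. 0.0 means "no upper limit"; macOS may swap
# heavily once physical unified memory is exhausted.
export PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0
echo "$PYTORCH_MPS_HIGH_WATERMARK_RATIO"
```

You would then start Jupyter (or the training script) from that same shell so the variable is inherited.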
