Does memray monitor GPU memory usage? #710

Open
1 task done
Neronjust2017 opened this issue Jan 6, 2025 · 3 comments
Labels
bug Something isn't working

Comments

@Neronjust2017

Neronjust2017 commented Jan 6, 2025

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

I'm using memray to monitor my PyTorch model training, and I found that the model inference step allocated the most memory, as you can see below. However, inference runs on the GPU for my model, so why did it consume so much memory? Does this frame include GPU memory usage? I also monitored GPU memory usage in my code, and its maximum value is ~4 GB.
[screenshots: Memray flame graph output showing the inference step's allocations]

Expected Behavior

No response

Steps To Reproduce

No

Memray Version

1.15.0

Python Version

3.10

Operating System

Linux

Anything else?

No response

Neronjust2017 added the bug (Something isn't working) label on Jan 6, 2025
@godlygeek
Contributor

No, Memray does not currently monitor GPU memory usage. It currently monitors allocations and deallocations made using these functions:

  • malloc
  • free
  • calloc
  • realloc
  • valloc
  • posix_memalign
  • aligned_alloc
  • mmap
  • munmap
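
For context, here is a minimal sketch (assumptions: PyTorch is installed with CUDA support, and the script/command names below are illustrative, not from the original report) of what this means in practice: host-side tensor buffers are obtained through malloc/mmap and show up in Memray's output, while CUDA device memory is requested from the CUDA driver and never passes through these functions, so Memray cannot see it.

```python
# sketch.py -- hypothetical script name; assumes PyTorch built with CUDA support.
import torch

# Host allocation: the buffer comes from malloc/mmap, so Memray records it.
cpu_tensor = torch.empty(64 * 1024 * 1024, dtype=torch.float32)   # ~256 MB on the CPU heap

if torch.cuda.is_available():
    # Device allocation: served by the CUDA driver, so Memray does not see it.
    gpu_tensor = torch.empty(64 * 1024 * 1024, dtype=torch.float32, device="cuda")
    # Track GPU memory separately, e.g. with PyTorch's own counters or nvidia-smi.
    print(torch.cuda.memory_allocated())

# Capture and inspect with something like:
#   memray run sketch.py
#   memray flamegraph <capture file produced by memray run>
```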

@Neronjust2017
Author

thanks!
[screenshot: a flame graph frame labeled 18.4 GB]
What does 18.4 GB mean? Is it the total memory allocated by this function, or the peak memory usage?
And how can I find the proportion of memory used by each function (1. malloc, free, calloc, etc.; 2. the functions in my own code) at each timestamp in the figure below?
[screenshot: memory usage over time graph]

@godlygeek
Contributor

What does 18.4 GB mean? Is it the total memory allocated by this function, or the peak memory usage?

The total amount of memory allocated by that function and not yet freed at the moment in time when the process reached its peak memory usage. See https://bloomberg.github.io/memray/flamegraph.html#interpreting-flame-graphs
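
In other words (with made-up numbers, purely for illustration), each frame is reported with the bytes it still held live at the instant the process hit its heap peak, not the cumulative bytes it ever allocated:

```python
# Minimal illustration with hypothetical sizes; not taken from the original report.
a = bytearray(2 * 1024**3)   # 2 GiB live
b = bytearray(1 * 1024**3)   # heap peak (~3 GiB) is reached here:
                             # a's call site is reported with 2 GiB, b's with 1 GiB
del a                        # memory freed after the peak does not change the reported numbers
```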

And how can I find the proportion of memory used by each function (1. malloc, free, calloc, etc.; 2. the functions in my own code) at each timestamp in the figure below?

The default flame graphs only show you what stack allocated each chunk of memory at the moment in time when the process reached its peak memory usage. You can run with memray flamegraph --temporal to generate a flame graph that allows you to investigate other moments in time. See https://bloomberg.github.io/memray/flamegraph.html#temporal-flame-graphs

Our flame graphs don't distinguish between allocations from different allocators, but memray stats shows you the number of allocations made by each allocator.
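
For completeness, a small sketch of one way to put these together (the Tracker context manager is Memray's documented Python API; train() and the file name are placeholders for your own code):

```python
# Sketch: record a capture file programmatically, then inspect it with the CLI reporters.
from memray import Tracker

def train():
    ...  # placeholder for the PyTorch training/inference loop

with Tracker("training.bin"):   # all malloc/mmap/etc. activity in this block is recorded
    train()

# Afterwards, from the shell:
#   memray flamegraph --temporal training.bin   # explore memory usage at arbitrary points in time
#   memray stats training.bin                   # summary including counts per allocator (malloc, mmap, ...)
```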
