[QST] Using the pooled memory resource deallocate() method without specifying the number of bytes to free #1819

gartavanis · 2025-02-10T19:48:09Z

Is there a way to use the pooled memory resource deallocate() method without specifying the number of bytes to free?

I’m currently updating our software, which was using the old RMM C API.

For reference, the old code, which used the RMM C API (from branch 0.10), did the following:

    rmmOptions_t rmmOptions;
    rmmOptions.allocation_mode = (rmmAllocationMode_t)(PoolAllocation | CudaManagedMemory);
    rmmOptions.initial_pool_size = 1;   // size = 0 initializes half the device memory
    rmmOptions.enable_logging = false;
    RMM_ERR(rmmInitialize(&rmmOptions));

It then called the provided rmmAlloc() and rmmFree(); rmmFree() did not need to be told how many bytes to free. The pooled allocator gave our code a ~2x speedup, so I would like to keep using it with the new C++ API if possible.

This is what I have so far using the new C++ API:

    // The upstream resource must outlive the pool that holds a pointer to it
    static rmm::mr::managed_memory_resource cuda_mr;
    static rmm::mr::pool_memory_resource<rmm::mr::managed_memory_resource>* pool_mr = nullptr;

    void amps_rmmInit() {
        // Construct a resource that uses a coalescing best-fit pool allocator,
        // with the pool initially sized to all of the free device memory
        auto initial_size = rmm::percent_of_free_device_memory(100);
        pool_mr = new rmm::mr::pool_memory_resource<rmm::mr::managed_memory_resource>(&cuda_mr, initial_size);
        rmm::mr::set_current_device_resource(pool_mr); // Updates the current device resource pointer to `pool_mr`
    }

    void* amps_rmmAlloc(size_t bytes) {
        return pool_mr->allocate(bytes);
    }

    void amps_rmmFree(void *p) {
        pool_mr->deallocate(p); // does not compile: deallocate() also requires the size in bytes
    }

The problem is with the last function: the pooled deallocate() requires an additional argument, the number of bytes. In our code, however, the memory management module is separate from the rest of the code, so at the point of the free it cannot know where the memory was allocated or how many bytes were requested. Thanks for any suggestions!

Answered by harrism

harrism (Collaborator) · 2025-02-10T20:12:54Z

Unfortunately this is not possible. The design of memory resources is such that the allocate and deallocate calls must match. One reason for this is so that we can have things like binning allocators that delegate allocation to different suballocators depending on the size of the allocation. For efficiency, you might want to use a very different algorithm and/or pool to allocate very large regions of memory or very small allocations.
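
The size dependence described above can be illustrated with a small host-side sketch. This is not RMM code: it uses the C++17 `std::pmr::memory_resource` interface as a stand-in because it has a similar allocate(bytes) / deallocate(pointer, bytes) shape and runs without a GPU. The class names `counting_resource` and `binning_resource` are hypothetical. The point is that `do_deallocate` has to pick the same bin that `do_allocate` picked, and the size is the only information it has for doing so.

```cpp
#include <cassert>
#include <cstddef>
#include <memory_resource>

// Hypothetical helper: forwards to an upstream resource and records how many
// bytes are currently outstanding, so the routing below can be observed.
class counting_resource : public std::pmr::memory_resource {
 public:
  explicit counting_resource(std::pmr::memory_resource* upstream) : upstream_(upstream) {}
  std::size_t outstanding() const { return outstanding_; }

 private:
  void* do_allocate(std::size_t bytes, std::size_t align) override {
    outstanding_ += bytes;
    return upstream_->allocate(bytes, align);
  }
  void do_deallocate(void* p, std::size_t bytes, std::size_t align) override {
    outstanding_ -= bytes;
    upstream_->deallocate(p, bytes, align);
  }
  bool do_is_equal(const std::pmr::memory_resource& other) const noexcept override {
    return this == &other;
  }

  std::pmr::memory_resource* upstream_;
  std::size_t outstanding_ = 0;
};

// Hypothetical two-bin resource: requests of at most `threshold` bytes go to
// `small`, everything else goes to `large`.
class binning_resource : public std::pmr::memory_resource {
 public:
  binning_resource(std::pmr::memory_resource* small,
                   std::pmr::memory_resource* large,
                   std::size_t threshold)
      : small_(small), large_(large), threshold_(threshold) {}

 private:
  // The bin is chosen from the size alone; the pointer carries no such information.
  std::pmr::memory_resource* pick(std::size_t bytes) const {
    return bytes <= threshold_ ? small_ : large_;
  }
  void* do_allocate(std::size_t bytes, std::size_t align) override {
    return pick(bytes)->allocate(bytes, align);
  }
  void do_deallocate(void* p, std::size_t bytes, std::size_t align) override {
    // Without the original size, there would be no way to find the owning bin.
    pick(bytes)->deallocate(p, bytes, align);
  }
  bool do_is_equal(const std::pmr::memory_resource& other) const noexcept override {
    return this == &other;
  }

  std::pmr::memory_resource* small_;
  std::pmr::memory_resource* large_;
  std::size_t threshold_;
};
```

With a threshold of 256, `bins.allocate(64)` is served by the small resource and has to be returned with `bins.deallocate(p, 64)`; passing a different size could route the pointer to the wrong sub-allocator.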

My advice is to avoid raw allocation (e.g. calls to allocate/deallocate, any form of malloc or new) altogether. All raw allocation should be done inside containers. Containers should always know their size and therefore always know how to pass the size. You can use RMM memory resources with RMM's own containers (device_buffer, device_uvector), and with Thrust containers like device_vector, or roll your own.
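
A minimal sketch of the "roll your own" option, again on the host with `std::pmr::memory_resource` standing in for an RMM resource (an assumption made so the example runs without a GPU). The hypothetical `sized_buffer` stores its size next to the pointer, so its destructor can always pass the matching size to `deallocate`; `rmm::device_buffer` plays this role for device memory. `counting_resource` is also hypothetical and only exists to make the behavior observable.

```cpp
#include <cassert>
#include <cstddef>
#include <memory_resource>
#include <utility>

// Hypothetical helper: tracks outstanding bytes in the wrapped resource.
class counting_resource : public std::pmr::memory_resource {
 public:
  explicit counting_resource(std::pmr::memory_resource* upstream) : upstream_(upstream) {}
  std::size_t outstanding() const { return outstanding_; }

 private:
  void* do_allocate(std::size_t bytes, std::size_t align) override {
    outstanding_ += bytes;
    return upstream_->allocate(bytes, align);
  }
  void do_deallocate(void* p, std::size_t bytes, std::size_t align) override {
    outstanding_ -= bytes;
    upstream_->deallocate(p, bytes, align);
  }
  bool do_is_equal(const std::pmr::memory_resource& other) const noexcept override {
    return this == &other;
  }

  std::pmr::memory_resource* upstream_;
  std::size_t outstanding_ = 0;
};

// Hypothetical move-only owning buffer: remembers both the resource and the
// size, so callers never have to supply the size when the memory is released.
class sized_buffer {
 public:
  sized_buffer(std::size_t bytes, std::pmr::memory_resource* mr)
      : mr_(mr), size_(bytes), data_(bytes != 0 ? mr->allocate(bytes) : nullptr) {}

  ~sized_buffer() { release(); }

  sized_buffer(const sized_buffer&) = delete;
  sized_buffer& operator=(const sized_buffer&) = delete;

  sized_buffer(sized_buffer&& other) noexcept
      : mr_(other.mr_),
        size_(std::exchange(other.size_, 0)),
        data_(std::exchange(other.data_, nullptr)) {}

  sized_buffer& operator=(sized_buffer&& other) noexcept {
    if (this != &other) {
      release();
      mr_ = other.mr_;
      size_ = std::exchange(other.size_, 0);
      data_ = std::exchange(other.data_, nullptr);
    }
    return *this;
  }

  void* data() const noexcept { return data_; }
  std::size_t size() const noexcept { return size_; }

 private:
  // Returns the memory with exactly the size it was allocated with.
  void release() noexcept {
    if (data_ != nullptr) { mr_->deallocate(data_, size_); }
    data_ = nullptr;
    size_ = 0;
  }

  std::pmr::memory_resource* mr_;
  std::size_t size_;
  void* data_;
};
```

Code that used to hold a raw pointer and call a size-less free would instead hold a `sized_buffer` (or move it around) and let the destructor return the memory.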

1 reply

gartavanis (Author) · Feb 10, 2025

Thanks for the quick reply! I will look into device_buffers more to see if they can be used with our code.

harrism (Collaborator) · 2025-02-10T20:13:59Z

I'm going to convert this issue to a discussion.
