[QST] Using the pooled memory resource deallocate() method without specifying the number of bytes to free #1819
Is there a way to use the pooled memory resource deallocate() method without specifying the number of bytes to free? I’m currently updating our software, which was using the old RMM C API. Just for reference, the old code that was using the RMM C API (from branch 0.10) was doing the following:

rmmOptions_t rmmOptions;
rmmOptions.allocation_mode = (rmmAllocationMode_t)(PoolAllocation | CudaManagedMemory);
rmmOptions.initial_pool_size = 1; // size = 0 initializes half the device memory
rmmOptions.enable_logging = false;
RMM_ERR(rmmInitialize(&rmmOptions));

and then it was using the provided rmmAlloc() and rmmFree(). rmmFree() need not specify how many bytes to free. This pooled allocator led to ~2x speedup in our code, so I would like to keep using the pooled allocator with the new C++ API if possible. This is what I have so far using the new C++ API:

static rmm::mr::pool_memory_resource<rmm::mr::managed_memory_resource>* pool_mr = nullptr;
void amps_rmmInit() {
  // static: the pool stores a pointer to its upstream resource, so the upstream must outlive this function
  static rmm::mr::managed_memory_resource cuda_mr;
  // Construct a resource that uses a coalescing best-fit pool allocator
  // With the pool initially all of available device memory
  auto initial_size = rmm::percent_of_free_device_memory(100);
  pool_mr = new rmm::mr::pool_memory_resource<rmm::mr::managed_memory_resource>(&cuda_mr, initial_size);
  rmm::mr::set_current_device_resource(pool_mr); // Updates the current device resource pointer to `pool_mr`
}
void* amps_rmmAlloc(size_t bytes) {
  return pool_mr->allocate(bytes);
}
void amps_rmmFree(void *p) {
  pool_mr->deallocate(p);
}

The problem is with the last function, as the pooled deallocate() requires an additional argument (the number of bytes). However, in our code it’s not possible to know where the memory was allocated, and how many bytes, as the memory management module is separate from the rest of the code. Thanks for any suggestions!
Replies: 2 comments 1 reply
I'm going to convert this issue to a discussion.
Unfortunately this is not possible. The design of memory resources is such that the allocate and deallocate calls must match. One reason for this is so that we can have things like binning allocators that delegate allocation to different suballocators depending on the size of the allocation. For efficiency, you might want to use a very different algorithm and/or pool to allocate very large regions of memory or very small allocations.
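Since the size passed to deallocate must match the one passed to allocate, a size-less free can only work if some layer remembers the size of each live allocation. A minimal sketch of that idea follows. It is not part of RMM: `size_tracking_adaptor` is a hypothetical name, and the standard `std::pmr::memory_resource` interface stands in for an RMM resource here so the example compiles without CUDA. The lookup table adds a lock and a hash-map operation to every call.

```cpp
#include <cassert>
#include <cstddef>
#include <memory_resource>
#include <mutex>
#include <unordered_map>

// Hypothetical wrapper (not an RMM class): records the size of every live
// allocation so a C-style free(ptr) can hand the matching size upstream.
class size_tracking_adaptor {
 public:
  explicit size_tracking_adaptor(std::pmr::memory_resource* upstream)
      : upstream_(upstream) {
    assert(upstream_ != nullptr);
  }

  void* allocate(std::size_t bytes) {
    void* p = upstream_->allocate(bytes);
    try {
      std::lock_guard<std::mutex> lock(mutex_);
      sizes_.emplace(p, bytes);
    } catch (...) {
      // Could not record the size: give the block back instead of leaking it.
      upstream_->deallocate(p, bytes);
      throw;
    }
    return p;
  }

  // Returns false if `p` is not a live allocation made through this adaptor.
  bool deallocate(void* p) {
    std::size_t bytes = 0;
    {
      std::lock_guard<std::mutex> lock(mutex_);
      auto it = sizes_.find(p);
      if (it == sizes_.end()) return false;
      bytes = it->second;
      sizes_.erase(it);
    }
    upstream_->deallocate(p, bytes);  // size now matches the original allocate
    return true;
  }

  std::size_t live_allocations() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return sizes_.size();
  }

 private:
  std::pmr::memory_resource* upstream_;
  mutable std::mutex mutex_;
  std::unordered_map<void*, std::size_t> sizes_;
};
```

With RMM, the same bookkeeping would presumably sit inside wrappers such as amps_rmmAlloc() and amps_rmmFree(), forwarding the recorded size to the pool's deallocate(); that adaptation is an assumption and is not shown here.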
My advice is to avoid raw allocation (e.g. calls to `allocate`/`deallocate`, any form of `malloc` or `new`) altogether. All raw allocation should be done inside containers. Containers should always know their size and therefore always know how to pass the size. You c…