
Support GPUArrays allocations cache #2593

Merged
merged 9 commits into from
Jan 9, 2025

Conversation

pxl-th
Member

@pxl-th pxl-th commented Dec 15, 2024

@maleadt maleadt added cuda array Stuff about CuArray. performance How fast can we go? labels Dec 16, 2024
@maleadt maleadt force-pushed the master branch 15 times, most recently from 5d585c4 to c850163 Compare December 20, 2024 08:18
src/array.jl (review thread, outdated)
@maleadt maleadt force-pushed the pxl-th/cache-alloc branch from 90ac2fa to 56d9fcc Compare January 9, 2025 12:10
src/array.jl (review thread, outdated)
@pxl-th pxl-th changed the title Use GPUArrays caching allocator Support GPUArrays allocations cache Jan 9, 2025
@pxl-th pxl-th marked this pull request as ready for review January 9, 2025 12:35
src/array.jl (review thread)

codecov bot commented Jan 9, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 61.04%. Comparing base (234ec4c) to head (c4c3639).
Report is 3 commits behind head on master.

Additional details and impacted files
```
@@             Coverage Diff             @@
##           master    #2593       +/-   ##
===========================================
- Coverage   73.64%   61.04%   -12.60%
===========================================
  Files         157      157
  Lines       15220    14955      -265
===========================================
- Hits        11209     9130     -2079
- Misses       4011     5825     +1814
```


@maleadt
Member

maleadt commented Jan 9, 2025

It's also kinda weird that we're always attaching a finalizer to the array, even when it's put inside of a cache (meaning it should get freed when the cache goes out of scope). Should we move the finalizer into cached_alloc? That starts to make it a slightly messy API, though.

@pxl-th
Member Author

pxl-th commented Jan 9, 2025

> It's also kinda weird that we're always attaching a finalizer to the array, even when it's put inside of a cache (meaning it should get freed when the cache goes out of scope). Should we move the finalizer into cached_alloc? That starts to make it a slightly messy API, though.

But we need the finalizers anyway, so it might be fine to attach them in the constructor and have a single place for this, rather than catching arrays when they go out of the cache.
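The trade-off being discussed can be sketched with a toy caching allocator (the names `cached_alloc`, `CACHE`, and `free!` here are illustrative, not the actual GPUArrays API): the cache hands out previously freed allocations by key, while finalizer registration stays in one place, the constructor.

```julia
# Hypothetical sketch of a caching allocator keyed by (eltype, dims).
const CACHE = Dict{Any, Vector{Any}}()

function cached_alloc(f, key)
    # Reuse a previously released allocation for this key, or create a new one.
    bucket = get!(() -> Any[], CACHE, key)
    isempty(bucket) ? f() : pop!(bucket)
end

# Return an allocation to the cache instead of freeing it immediately.
free!(key, x) = push!(CACHE[key], x)

key = (Float32, (16,))
x = cached_alloc(() -> zeros(Float32, 16), key)
free!(key, x)
y = cached_alloc(() -> zeros(Float32, 16), key)
x === y  # the second request reuses the cached allocation
```

Keying by eltype and dims (rather than bytesize) means a cached allocation is only reused for requests of the exact same shape, which matters for the type-assertion issue discussed below.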

@maleadt maleadt merged commit 1389800 into JuliaGPU:master Jan 9, 2025
3 of 4 checks passed
@pxl-th pxl-th deleted the pxl-th/cache-alloc branch January 9, 2025 15:47
@pxl-th
Member Author

pxl-th commented Jan 9, 2025

Passing the bytesize as the cache key results in errors when arrays have the same bytesize but different dims:

```julia
julia> GPUArrays.@cached cache ROCArray(zeros(Float32, 16));

julia> GPUArrays.@cached cache ROCArray(zeros(Float32, 4, 4))
ERROR: TypeError: in typeassert, expected ROCArray{Float32, 2, AMDGPU.Runtime.Mem.HIPBuffer}, got a value of type ROCArray{Float32, 1, AMDGPU.Runtime.Mem.HIPBuffer}
Stacktrace:
 [1] ROCArray
   @ ~/.julia/dev/AMDGPU/src/array.jl:9 [inlined]
 [2] ROCArray
   @ ~/.julia/dev/AMDGPU/src/array.jl:111 [inlined]
 [3] ROCArray
   @ ~/.julia/dev/AMDGPU/src/array.jl:116 [inlined]
 [4] ROCArray(A::Matrix{Float32})
   @ AMDGPU ~/.julia/dev/AMDGPU/src/array.jl:119
 [5] macro expansion
   @ ~/.julia/packages/GPUArrays/2FKRj/src/host/alloc_cache.jl:147 [inlined]
 [6] top-level scope
   @ REPL[4]:1
```

To fix this we can either revert to using dims as the key, or reshape the array after GPUArrays.cached_alloc and only then do the type assertion:

```julia
return if size(x) != dims
    reshape(x, dims)
else
    x
end::ROCArray{T, N, B}
```
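The reshape fallback can be sketched with plain Arrays standing in for ROCArray (the `alloc_with_reshape` helper and the Dict-based cache are illustrative, not the actual GPUArrays implementation): with a bytesize key, a cached 16-element vector can serve a (4, 4) request, but only after a reshape does the type assertion succeed.

```julia
# Toy model: cache keyed by bytesize, with a reshape before the typeassert.
function alloc_with_reshape(cache::Dict, ::Type{T}, dims::Dims{N}) where {T, N}
    bytes = sizeof(T) * prod(dims)
    x = get!(() -> zeros(T, dims), cache, bytes)
    return if size(x) != dims
        reshape(x, dims)  # reshape shares the underlying memory
    else
        x
    end::Array{T, N}
end

cache = Dict{Int, Any}()
v = alloc_with_reshape(cache, Float32, (16,))   # caches a 64-byte vector
m = alloc_with_reshape(cache, Float32, (4, 4))  # same 64 bytes, reused via reshape
size(m)  # (4, 4), and m aliases the memory of v
```

Without the reshape, the second call would return the cached 1-D array and fail the `::Array{Float32, 2}` assertion, which is exactly the TypeError shown above.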

@maleadt
Member

maleadt commented Jan 10, 2025

```julia
julia> GPUArrays.@cached cache ROCArray(zeros(Float32, 16));

julia> GPUArrays.@cached cache ROCArray(zeros(Float32, 4, 4))
ERROR: TypeError: in typeassert, expected ROCArray{Float32, 2, AMDGPU.Runtime.Mem.HIPBuffer}, got a value of type ROCArray{Float32, 1, AMDGPU.Runtime.Mem.HIPBuffer}
```

Yes, that's what I meant in #2593 (comment); I had already reverted that change before merging this PR.

@maleadt
Member

maleadt commented Jan 10, 2025

That said, it would be nice if the cache could fulfill such requests. I considered storing the DataRef instead, which tracks a refcount (so we could check if it has been unsafe_free!'d). However, that does make finalization more tricky: since the cache doesn't keep the array objects alive anymore, array finalizers can now affect the refcount during execution of the @cached expressions. I guess we could move registration of the finalizer to cached_alloc, skipping it when the cache is active (because the cache is then responsible for freeing contained data)...
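The refcounting idea can be illustrated with a minimal sketch (the `RefHandle` type and `retain!`/`release!`/`isfreed` functions are hypothetical stand-ins, not the actual GPUArrays DataRef API): the cache holds one reference, so an array finalizer dropping its own reference no longer frees the underlying data.

```julia
# Hypothetical refcounted handle; freeing only happens at refcount zero.
mutable struct RefHandle
    data::Vector{UInt8}
    refcount::Int
end

retain!(h::RefHandle) = (h.refcount += 1; h)

function release!(h::RefHandle)
    h.refcount -= 1
    h.refcount == 0 && empty!(h.data)  # stand-in for actually freeing the buffer
    return h
end

isfreed(h::RefHandle) = h.refcount == 0

h = RefHandle(zeros(UInt8, 64), 1)
retain!(h)           # the cache takes its own reference
release!(h)          # the array finalizer drops its reference...
isfreed(h)           # ...but the cache still keeps the data alive
release!(h)          # the cache releases on teardown; now the data is freed
```

This matches the concern in the comment: once finalizers manipulate the refcount while @cached is active, the cache must be the one holding the last reference, which is why finalizer registration would move into cached_alloc.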

avik-pal pushed a commit to avik-pal/CUDA.jl that referenced this pull request Jan 11, 2025