Update cache.jl #2604

Merged: 1 commit, Dec 25, 2024

Conversation

@jarbus (Contributor) commented Dec 25, 2024

This seemed like a bug

codecov bot commented Dec 25, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 73.54%. Comparing base (972f3f0) to head (7328d3f).
Report is 1 commit behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2604      +/-   ##
==========================================
- Coverage   73.63%   73.54%   -0.10%     
==========================================
  Files         157      157              
  Lines       15207    15207              
==========================================
- Hits        11198    11184      -14     
- Misses       4009     4023      +14     


@github-actions bot (Contributor) left a comment

CUDA.jl Benchmarks

| Benchmark suite | Current: 7328d3f | Previous: 62564aa | Ratio |
|---|---|---|---|
| latency/precompile | 45354733130 ns | 45352888677 ns | 1.00 |
| latency/ttfp | 6382441349.5 ns | 6365236644.5 ns | 1.00 |
| latency/import | 3033778244 ns | 3028318014.5 ns | 1.00 |
| integration/volumerhs | 9568316 ns | 9566694 ns | 1.00 |
| integration/byval/slices=1 | 146684 ns | 146977 ns | 1.00 |
| integration/byval/slices=3 | 425436 ns | 425622 ns | 1.00 |
| integration/byval/reference | 144736 ns | 144997 ns | 1.00 |
| integration/byval/slices=2 | 286563 ns | 286278 ns | 1.00 |
| integration/cudadevrt | 103507 ns | 103635 ns | 1.00 |
| kernel/indexing | 14284 ns | 14502 ns | 0.98 |
| kernel/indexing_checked | 14901 ns | 15491 ns | 0.96 |
| kernel/occupancy | 717.431654676259 ns | 685.9934210526316 ns | 1.05 |
| kernel/launch | 2153.777777777778 ns | 2211.5555555555557 ns | 0.97 |
| kernel/rand | 18127 ns | 15495 ns | 1.17 |
| array/reverse/1d | 19518 ns | 19288.5 ns | 1.01 |
| array/reverse/2d | 23540 ns | 25407 ns | 0.93 |
| array/reverse/1d_inplace | 10128 ns | 11091 ns | 0.91 |
| array/reverse/2d_inplace | 11486 ns | 12732 ns | 0.90 |
| array/copy | 20470 ns | 20971 ns | 0.98 |
| array/iteration/findall/int | 158142 ns | 159562 ns | 0.99 |
| array/iteration/findall/bool | 138436 ns | 139396 ns | 0.99 |
| array/iteration/findfirst/int | 153157 ns | 153501 ns | 1.00 |
| array/iteration/findfirst/bool | 154954 ns | 154850 ns | 1.00 |
| array/iteration/scalar | 78049 ns | 75791 ns | 1.03 |
| array/iteration/logical | 215615.5 ns | 216533 ns | 1.00 |
| array/iteration/findmin/1d | 40922 ns | 41336 ns | 0.99 |
| array/iteration/findmin/2d | 94290 ns | 94735 ns | 1.00 |
| array/reductions/reduce/1d | 35579 ns | 42301 ns | 0.84 |
| array/reductions/reduce/2d | 50761.5 ns | 45473.5 ns | 1.12 |
| array/reductions/mapreduce/1d | 33633 ns | 40507 ns | 0.83 |
| array/reductions/mapreduce/2d | 41890.5 ns | 47968 ns | 0.87 |
| array/broadcast | 21627 ns | 21824 ns | 0.99 |
| array/copyto!/gpu_to_gpu | 11595 ns | 13701 ns | 0.85 |
| array/copyto!/cpu_to_gpu | 213648 ns | 213728 ns | 1.00 |
| array/copyto!/gpu_to_cpu | 245025 ns | 245756 ns | 1.00 |
| array/accumulate/1d | 108333 ns | 108365 ns | 1.00 |
| array/accumulate/2d | 79477 ns | 80195 ns | 0.99 |
| array/construct | 1184.75 ns | 1204.7 ns | 0.98 |
| array/random/randn/Float32 | 43906 ns | 43838 ns | 1.00 |
| array/random/randn!/Float32 | 26619 ns | 26350 ns | 1.01 |
| array/random/rand!/Int64 | 27193 ns | 27179 ns | 1.00 |
| array/random/rand!/Float32 | 8810.333333333334 ns | 8916.333333333334 ns | 0.99 |
| array/random/rand/Int64 | 29829 ns | 30059 ns | 0.99 |
| array/random/rand/Float32 | 12959 ns | 12950 ns | 1.00 |
| array/permutedims/4d | 67098 ns | 67259 ns | 1.00 |
| array/permutedims/2d | 56667 ns | 56558.5 ns | 1.00 |
| array/permutedims/3d | 59183 ns | 59379 ns | 1.00 |
| array/sorting/1d | 2932018 ns | 2932110 ns | 1.00 |
| array/sorting/by | 3500113 ns | 3498470 ns | 1.00 |
| array/sorting/2d | 1084817.5 ns | 1085408 ns | 1.00 |
| cuda/synchronization/stream/auto | 1042.6 ns | 1025.5 ns | 1.02 |
| cuda/synchronization/stream/nonblocking | 6466.8 ns | 6545.6 ns | 0.99 |
| cuda/synchronization/stream/blocking | 855.7872340425532 ns | 796.0645161290323 ns | 1.08 |
| cuda/synchronization/context/auto | 1198.6 ns | 1190.9 ns | 1.01 |
| cuda/synchronization/context/nonblocking | 6659.6 ns | 6739.6 ns | 0.99 |
| cuda/synchronization/context/blocking | 948.0487804878048 ns | 903.25 ns | 1.05 |

This comment was automatically generated by workflow using github-action-benchmark.

@maleadt (Member) left a comment

Good catch, thanks!

@maleadt merged commit a0c2f4b into JuliaGPU:master on Dec 25, 2024
2 checks passed
@jarbus deleted the patch-1 branch on December 25, 2024, 14:04
THargreaves pushed a commit to THargreaves/CUDA.jl that referenced this pull request on Jan 7, 2025
Development

Successfully merging this pull request may close these issues.

'ArgumentError: array must be non-empty' when attempting to pop idle handles from HandleCache