Update cache.jl #2604
This seemed like a bug
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2604      +/-   ##
==========================================
- Coverage   73.63%   73.54%   -0.10%
==========================================
  Files         157      157
  Lines       15207    15207
==========================================
- Hits        11198    11184      -14
- Misses       4009     4023      +14
CUDA.jl Benchmarks
Benchmark suite | Current: 7328d3f | Previous: 62564aa | Ratio |
---|---|---|---|
latency/precompile | 45354733130 ns | 45352888677 ns | 1.00 |
latency/ttfp | 6382441349.5 ns | 6365236644.5 ns | 1.00 |
latency/import | 3033778244 ns | 3028318014.5 ns | 1.00 |
integration/volumerhs | 9568316 ns | 9566694 ns | 1.00 |
integration/byval/slices=1 | 146684 ns | 146977 ns | 1.00 |
integration/byval/slices=3 | 425436 ns | 425622 ns | 1.00 |
integration/byval/reference | 144736 ns | 144997 ns | 1.00 |
integration/byval/slices=2 | 286563 ns | 286278 ns | 1.00 |
integration/cudadevrt | 103507 ns | 103635 ns | 1.00 |
kernel/indexing | 14284 ns | 14502 ns | 0.98 |
kernel/indexing_checked | 14901 ns | 15491 ns | 0.96 |
kernel/occupancy | 717.431654676259 ns | 685.9934210526316 ns | 1.05 |
kernel/launch | 2153.777777777778 ns | 2211.5555555555557 ns | 0.97 |
kernel/rand | 18127 ns | 15495 ns | 1.17 |
array/reverse/1d | 19518 ns | 19288.5 ns | 1.01 |
array/reverse/2d | 23540 ns | 25407 ns | 0.93 |
array/reverse/1d_inplace | 10128 ns | 11091 ns | 0.91 |
array/reverse/2d_inplace | 11486 ns | 12732 ns | 0.90 |
array/copy | 20470 ns | 20971 ns | 0.98 |
array/iteration/findall/int | 158142 ns | 159562 ns | 0.99 |
array/iteration/findall/bool | 138436 ns | 139396 ns | 0.99 |
array/iteration/findfirst/int | 153157 ns | 153501 ns | 1.00 |
array/iteration/findfirst/bool | 154954 ns | 154850 ns | 1.00 |
array/iteration/scalar | 78049 ns | 75791 ns | 1.03 |
array/iteration/logical | 215615.5 ns | 216533 ns | 1.00 |
array/iteration/findmin/1d | 40922 ns | 41336 ns | 0.99 |
array/iteration/findmin/2d | 94290 ns | 94735 ns | 1.00 |
array/reductions/reduce/1d | 35579 ns | 42301 ns | 0.84 |
array/reductions/reduce/2d | 50761.5 ns | 45473.5 ns | 1.12 |
array/reductions/mapreduce/1d | 33633 ns | 40507 ns | 0.83 |
array/reductions/mapreduce/2d | 41890.5 ns | 47968 ns | 0.87 |
array/broadcast | 21627 ns | 21824 ns | 0.99 |
array/copyto!/gpu_to_gpu | 11595 ns | 13701 ns | 0.85 |
array/copyto!/cpu_to_gpu | 213648 ns | 213728 ns | 1.00 |
array/copyto!/gpu_to_cpu | 245025 ns | 245756 ns | 1.00 |
array/accumulate/1d | 108333 ns | 108365 ns | 1.00 |
array/accumulate/2d | 79477 ns | 80195 ns | 0.99 |
array/construct | 1184.75 ns | 1204.7 ns | 0.98 |
array/random/randn/Float32 | 43906 ns | 43838 ns | 1.00 |
array/random/randn!/Float32 | 26619 ns | 26350 ns | 1.01 |
array/random/rand!/Int64 | 27193 ns | 27179 ns | 1.00 |
array/random/rand!/Float32 | 8810.333333333334 ns | 8916.333333333334 ns | 0.99 |
array/random/rand/Int64 | 29829 ns | 30059 ns | 0.99 |
array/random/rand/Float32 | 12959 ns | 12950 ns | 1.00 |
array/permutedims/4d | 67098 ns | 67259 ns | 1.00 |
array/permutedims/2d | 56667 ns | 56558.5 ns | 1.00 |
array/permutedims/3d | 59183 ns | 59379 ns | 1.00 |
array/sorting/1d | 2932018 ns | 2932110 ns | 1.00 |
array/sorting/by | 3500113 ns | 3498470 ns | 1.00 |
array/sorting/2d | 1084817.5 ns | 1085408 ns | 1.00 |
cuda/synchronization/stream/auto | 1042.6 ns | 1025.5 ns | 1.02 |
cuda/synchronization/stream/nonblocking | 6466.8 ns | 6545.6 ns | 0.99 |
cuda/synchronization/stream/blocking | 855.7872340425532 ns | 796.0645161290323 ns | 1.08 |
cuda/synchronization/context/auto | 1198.6 ns | 1190.9 ns | 1.01 |
cuda/synchronization/context/nonblocking | 6659.6 ns | 6739.6 ns | 0.99 |
cuda/synchronization/context/blocking | 948.0487804878048 ns | 903.25 ns | 1.05 |
This comment was automatically generated by workflow using github-action-benchmark.
Good catch, thanks!