Generic Reduced handling in parallel reduce #172

Merged on Jan 20, 2020 (3 commits)

Conversation

@tkf (Member) commented Jan 19, 2020

Commit Message

Generic Reduced handling in parallel reduce (#172)

Previously, Reduced did not have a meaningful behavior in parallel
reduce when the reducing function is not `right`; the result
depended on basesize:

julia> tcollect(TakeWhile(x -> x < 5), 1:10; basesize=1)
Empty{Array{T,1} where T}()

julia> tcollect(TakeWhile(x -> x < 5), 1:10; basesize=2)
1-element Array{Int64,1}:
 4

julia> tcollect(TakeWhile(x -> x < 5), 1:10; basesize=4)
2-element Array{Int64,1}:
 3
 4

julia> tcollect(TakeWhile(x -> x < 5), 1:10; basesize=5)
4-element Array{Int64,1}:
 1
 2
 3
 4
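
The dependence on `basesize` can be reproduced outside Transducers.jl with a small simulation. The sketch below is not the pre-fix implementation: `Reduced`, `unreduced`, `takewhile_leaf`, `treefold`, `drop_left`, and `simulate` are hypothetical stand-ins, and splitting the input at `length ÷ 2` is an assumption. Under those assumptions, a combine step that passes a `Reduced` right operand through unchanged (discarding the left operand) gives the same four results as the session above.

```julia
# Hypothetical stand-in for Transducers.jl's Reduced wrapper.
struct Reduced{T}
    value::T
end
unreduced(x::Reduced) = x.value
unreduced(x) = x

# Sequential fold of one chunk: collect items while `pred` holds and
# signal early termination by wrapping the partial result in Reduced.
function takewhile_leaf(pred, xs)
    acc = Int[]
    for x in xs
        pred(x) || return Reduced(acc)
        push!(acc, x)
    end
    return acc
end

# Divide-and-conquer fold: halve the input until a chunk has at most
# `basesize` items (the halving rule is an assumption of this sketch).
function treefold(combine, pred, xs, basesize)
    n = length(xs)
    n <= basesize && return takewhile_leaf(pred, xs)
    mid = n ÷ 2
    left = treefold(combine, pred, xs[1:mid], basesize)
    right = treefold(combine, pred, xs[mid+1:end], basesize)
    return combine(left, right)
end

# Combine rule that ignores the left operand when the right one is Reduced.
function drop_left(a, b)
    a isa Reduced && return a
    b isa Reduced && return b
    return vcat(a, b)
end

simulate(basesize) = unreduced(treefold(drop_left, x -> x < 5, 1:10, basesize))

for basesize in (1, 2, 4, 5)
    println("basesize=", basesize, " => ", simulate(basesize))
end
```

With this rule the collected prefix is lost whenever the chunk that hits the `TakeWhile` boundary is combined with an unfinished chunk to its left, which is why only `basesize=5` returns all four items.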

This PR fixes it by properly formulating how to execute the reducing
function when combined with Reduced. This is done by "augmenting"
the reducing function *:

Given a semigroup *(::T, ::T) :: T where !(Reduced <: T), fold
functions in Transducers.jl act on an "augmented" semigroup
*′(::T′, ::T′) :: T′ where T′ = Union{T, Reduced{T}} defined by

*′(a::Reduced, _) = a
*′(a::T, b::Reduced) = reduced(a * unreduced(b))
*′(a::T, b::T) = a * b

If * is a monoid with the identity element e, the "augmented"
semigroup *′ is also a monoid with the identity element e′.
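
The three rules can be written down directly and checked on small inputs. The following is a minimal sketch, not the code in this PR: `Reduced`, `unreduced`, and `augment` are hypothetical local definitions, and string concatenation stands in for an arbitrary associative `*`. It checks that the augmented operation is associative over a few mixed operands and that the identity `""` of the original monoid still behaves as an identity.

```julia
# Hypothetical local stand-in for Transducers.jl's Reduced wrapper.
struct Reduced{T}
    value::T
end
Base.:(==)(a::Reduced, b::Reduced) = a.value == b.value
unreduced(x::Reduced) = x.value
unreduced(x) = x

# Build the augmented operation from an associative binary function `op`.
function augment(op)
    return function (a, b)
        # Left side already finished: ignore everything to its right.
        a isa Reduced && return a
        # Right side finished: absorb the left side and stay finished.
        b isa Reduced && return Reduced(op(a, unreduced(b)))
        # Ordinary case.
        return op(a, b)
    end
end

aug = augment(*)   # `*` on strings is concatenation

operands = ["a", "b", Reduced("c"), Reduced("d"), ""]

# Associativity: aug(aug(x, y), z) == aug(x, aug(y, z)) for every triple.
is_associative = all(
    aug(aug(x, y), z) == aug(x, aug(y, z))
    for x in operands, y in operands, z in operands
)

# The identity of the original monoid still acts as an identity.
has_identity = all(aug("", x) == x && aug(x, "") == x for x in operands)

println((is_associative, has_identity))
```

Because the augmented operation is associative, the result of a tree-shaped reduction does not depend on where the input is split.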

@github-actions (Contributor) commented

Multi-thread benchmark result

Judge result

Benchmark Report for /home/runner/work/Transducers.jl/Transducers.jl

Job Properties

  • Time of benchmarks:
    • Target: 19 Jan 2020 - 22:56
    • Baseline: 19 Jan 2020 - 22:59
  • Package commits:
    • Target: 54e0c3
    • Baseline: 149fe1
  • Julia commits:
    • Target: 2d5741
    • Baseline: 2d5741
  • Julia command flags:
    • Target: None
    • Baseline: None
  • Environment variables:
    • Target: JULIA_NUM_THREADS => 2
    • Baseline: JULIA_NUM_THREADS => 2

Results

A ratio greater than 1.0 denotes a possible regression (marked with ❌), while a ratio less
than 1.0 denotes a possible improvement (marked with ✅). Only significant results - results
that indicate possible regressions or improvements - are shown below (thus, an empty table means that all
benchmark results remained invariant between builds).
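
As a worked example of how a row in the judge table comes about, the ratio is the target measurement divided by the baseline measurement, compared against the tolerance shown in parentheses. The sketch assumes the symmetric rule "regression above 1 + tolerance, improvement below 1 - tolerance"; `classify` is a hypothetical helper, not a function of the benchmarking package. The inputs are the `["findfirst", "n=1000", "reduce", "basesize=128"]` measurements from the target and baseline result tables further down in this comment.

```julia
# Hypothetical helper: classify a target/baseline ratio against a tolerance.
function classify(ratio, tolerance)
    ratio > 1 + tolerance && return :regression
    ratio < 1 - tolerance && return :improvement
    return :invariant
end

# Measurements for ["findfirst", "n=1000", "reduce", "basesize=128"].
target_ms, baseline_ms = 516.398, 429.645
target_kib, baseline_kib = 1000.53, 849.58

time_ratio = target_ms / baseline_ms
memory_ratio = target_kib / baseline_kib

println(round(time_ratio; digits=2), " ", classify(time_ratio, 0.05))
println(round(memory_ratio; digits=2), " ", classify(memory_ratio, 0.01))
```

Both ratios exceed their tolerance bands, which matches the two ❌ marks on that row.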

| ID | time ratio | memory ratio |
|----|------------|--------------|
| ["findfirst", "n=1000", "reduce", "basesize=128"] | 1.20 (5%) ❌ | 1.18 (1%) ❌ |
| ["findfirst", "n=1000", "reduce", "basesize=256"] | 0.91 (5%) ✅ | 0.91 (1%) ✅ |
| ["findfirst", "n=1000", "reduce", "basesize=512"] | 1.19 (5%) ❌ | 1.20 (1%) ❌ |
| ["findfirst", "n=400", "reduce", "basesize=256"] | 1.00 (5%) | 1.01 (1%) ❌ |
| ["findfirst", "n=400", "reduce", "basesize=512"] | 1.04 (5%) | 1.11 (1%) ❌ |
| ["findfirst", "n=500", "reduce", "basesize=128"] | 0.51 (5%) ✅ | 0.54 (1%) ✅ |
| ["findfirst", "n=500", "reduce", "basesize=256"] | 0.72 (5%) ✅ | 0.89 (1%) ✅ |
| ["findfirst", "n=500", "reduce", "basesize=512"] | 0.98 (5%) | 0.96 (1%) ✅ |
| ["parallel_histogram", "assoc", "basesize=4096"] | 1.04 (5%) | 1.15 (1%) ❌ |
| ["parallel_histogram", "comm", "basesize=4096"] | 1.00 (5%) | 0.99 (1%) ✅ |
| ["parallel_histogram", "comm", "basesize=8192"] | 0.84 (5%) ✅ | 1.00 (1%) |
| ["unordered", "unordered", "basesize=1"] | 1.09 (5%) ❌ | 1.00 (1%) |
| ["unordered", "unordered", "basesize=1024"] | 0.81 (5%) ✅ | 0.83 (1%) ✅ |
| ["words", "nthreads=1"] | 1.05 (5%) ❌ | 1.00 (1%) |
| ["words", "nthreads=4"] | 0.91 (5%) ✅ | 1.00 (1%) |

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["findfirst", "n=1000"]
  • ["findfirst", "n=1000", "reduce"]
  • ["findfirst", "n=400"]
  • ["findfirst", "n=400", "reduce"]
  • ["findfirst", "n=500"]
  • ["findfirst", "n=500", "reduce"]
  • ["parallel_histogram", "assoc"]
  • ["parallel_histogram", "comm"]
  • ["parallel_histogram"]
  • ["unordered"]
  • ["unordered", "unordered"]
  • ["words"]

Julia versioninfo

Target

Julia Version 1.3.1
Commit 2d5741174c (2019-12-30 21:36 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.3 LTS
  uname: Linux 5.0.0-1027-azure #29~18.04.1-Ubuntu SMP Mon Nov 25 21:18:57 UTC 2019 x86_64 x86_64
  CPU: Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz: 
              speed         user         nice          sys         idle          irq
       #1  2294 MHz      29661 s          0 s       1644 s      14978 s          0 s
       #2  2294 MHz      34265 s          0 s       1762 s      11489 s          0 s
       
  Memory: 6.782737731933594 GB (3622.51171875 MB free)
  Uptime: 487.0 sec
  Load Avg:  1.6806640625  1.31298828125  0.68505859375
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, broadwell)

Baseline

Julia Version 1.3.1
Commit 2d5741174c (2019-12-30 21:36 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.3 LTS
  uname: Linux 5.0.0-1027-azure #29~18.04.1-Ubuntu SMP Mon Nov 25 21:18:57 UTC 2019 x86_64 x86_64
  CPU: Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz: 
              speed         user         nice          sys         idle          irq
       #1  2294 MHz      44754 s          0 s       1984 s      19211 s          0 s
       #2  2294 MHz      48431 s          0 s       2088 s      16617 s          0 s
       
  Memory: 6.782737731933594 GB (3652.2109375 MB free)
  Uptime: 685.0 sec
  Load Avg:  1.591796875  1.427734375  0.85546875
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, broadwell)

Target result

Benchmark Report for /home/runner/work/Transducers.jl/Transducers.jl

Job Properties

  • Time of benchmark: 19 Jan 2020 - 22:56
  • Package commit: 54e0c3
  • Julia commit: 2d5741
  • Julia command flags: None
  • Environment variables: JULIA_NUM_THREADS => 2

Results

Below is a table of this job's results, obtained by running the benchmarks.
The values listed in the ID column have the structure [parent_group, child_group, ..., key], and can be used to
index into the BaseBenchmarks suite to retrieve the corresponding benchmarks.
The percentages accompanying time and memory values in the below table are noise tolerances. The "true"
time/memory value for a given benchmark is expected to fall within this percentage of the reported value.
An empty cell means that the value was zero.
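
To illustrate the ID structure and the noise tolerance described above, the sketch below uses nested `Dict`s as a stand-in for the benchmark suite (an assumption; the real suite object is not shown in this report) and looks up one entry by walking the ID from left to right. `suite`, `lookup`, and `tolerance_interval` are hypothetical names; the value 709.627 ms is the `["findfirst", "n=1000", "foldl"]` entry in the table that follows.

```julia
# Stand-in for a benchmark suite: nested dictionaries keyed by group name.
suite = Dict(
    "findfirst" => Dict(
        "n=1000" => Dict("foldl" => 709.627),  # time in ms
    ),
)

# Walk the ID [parent_group, child_group, ..., key] one level at a time.
lookup(suite, id) = foldl(getindex, id; init=suite)

# Range in which the "true" value is expected to lie for a given tolerance.
tolerance_interval(value, tolerance) =
    (value * (1 - tolerance), value * (1 + tolerance))

time_ms = lookup(suite, ["findfirst", "n=1000", "foldl"])
lo, hi = tolerance_interval(time_ms, 0.05)
println(time_ms, " ms, expected between ", round(lo; digits=1),
        " and ", round(hi; digits=1), " ms")
```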

| ID | time | GC time | memory | allocations |
|----|------|---------|--------|-------------|
| ["findfirst", "n=1000", "foldl"] | 709.627 ms (5%) | | | |
| ["findfirst", "n=1000", "reduce", "basesize=128"] | 516.398 ms (5%) | | 1000.53 KiB (1%) | 19745 |
| ["findfirst", "n=1000", "reduce", "basesize=256"] | 505.530 ms (5%) | | 537.22 KiB (1%) | 10574 |
| ["findfirst", "n=1000", "reduce", "basesize=512"] | 651.684 ms (5%) | | 354.00 KiB (1%) | 6961 |
| ["findfirst", "n=400", "foldl"] | 532.568 ms (5%) | | | |
| ["findfirst", "n=400", "reduce", "basesize=128"] | 293.539 ms (5%) | | 1.26 MiB (1%) | 25628 |
| ["findfirst", "n=400", "reduce", "basesize=256"] | 296.775 ms (5%) | | 666.81 KiB (1%) | 13294 |
| ["findfirst", "n=400", "reduce", "basesize=512"] | 327.179 ms (5%) | | 429.02 KiB (1%) | 8496 |
| ["findfirst", "n=500", "foldl"] | 90.142 ms (5%) | | | |
| ["findfirst", "n=500", "reduce", "basesize=128"] | 151.035 ms (5%) | | 654.03 KiB (1%) | 12720 |
| ["findfirst", "n=500", "reduce", "basesize=256"] | 118.810 ms (5%) | | 357.59 KiB (1%) | 6924 |
| ["findfirst", "n=500", "reduce", "basesize=512"] | 156.522 ms (5%) | | 228.17 KiB (1%) | 4451 |
| ["parallel_histogram", "assoc", "basesize=16384"] | 5.055 ms (5%) | | 732.22 KiB (1%) | 109 |
| ["parallel_histogram", "assoc", "basesize=4096"] | 6.400 ms (5%) | | 2.07 MiB (1%) | 544 |
| ["parallel_histogram", "assoc", "basesize=8192"] | 5.555 ms (5%) | | 1.43 MiB (1%) | 260 |
| ["parallel_histogram", "comm", "basesize=16384"] | 14.230 ms (5%) | | 1.22 MiB (1%) | 392 |
| ["parallel_histogram", "comm", "basesize=4096"] | 20.822 ms (5%) | | 1.02 MiB (1%) | 2819 |
| ["parallel_histogram", "comm", "basesize=8192"] | 16.029 ms (5%) | | 1.23 MiB (1%) | 834 |
| ["parallel_histogram", "seq"] | 9.430 ms (5%) | | 364.63 KiB (1%) | 25 |
| ["unordered", "collect"] | 467.358 ms (5%) | | 513.00 KiB (1%) | 23 |
| ["unordered", "unordered", "basesize=1"] | 597.021 ms (5%) | | 30.27 MiB (1%) | 476084 |
| ["unordered", "unordered", "basesize=1024"] | 294.418 ms (5%) | | 850.73 KiB (1%) | 7562 |
| ["unordered", "unordered", "basesize=32"] | 276.143 ms (5%) | | 1.57 MiB (1%) | 23281 |
| ["words", "nthreads=1"] | 42.093 ms (5%) | 7.516 ms | 64.71 MiB (1%) | 2093526 |
| ["words", "nthreads=2"] | 24.474 ms (5%) | | 65.43 MiB (1%) | 2093688 |
| ["words", "nthreads=4"] | 24.410 ms (5%) | | 65.88 MiB (1%) | 2093843 |

Baseline result

Benchmark Report for /home/runner/work/Transducers.jl/Transducers.jl

Job Properties

  • Time of benchmark: 19 Jan 2020 - 22:59
  • Package commit: 149fe1
  • Julia commit: 2d5741
  • Julia command flags: None
  • Environment variables: JULIA_NUM_THREADS => 2

Results

Below is a table of this job's results, obtained by running the benchmarks.
The values listed in the ID column have the structure [parent_group, child_group, ..., key], and can be used to
index into the BaseBenchmarks suite to retrieve the corresponding benchmarks.
The percentages accompanying time and memory values in the below table are noise tolerances. The "true"
time/memory value for a given benchmark is expected to fall within this percentage of the reported value.
An empty cell means that the value was zero.

| ID | time | GC time | memory | allocations |
|----|------|---------|--------|-------------|
| ["findfirst", "n=1000", "foldl"] | 698.633 ms (5%) | | | |
| ["findfirst", "n=1000", "reduce", "basesize=128"] | 429.645 ms (5%) | | 849.58 KiB (1%) | 16572 |
| ["findfirst", "n=1000", "reduce", "basesize=256"] | 556.577 ms (5%) | | 588.50 KiB (1%) | 11470 |
| ["findfirst", "n=1000", "reduce", "basesize=512"] | 548.261 ms (5%) | | 294.63 KiB (1%) | 5752 |
| ["findfirst", "n=400", "foldl"] | 523.978 ms (5%) | | | |
| ["findfirst", "n=400", "reduce", "basesize=128"] | 288.792 ms (5%) | | 1.25 MiB (1%) | 25041 |
| ["findfirst", "n=400", "reduce", "basesize=256"] | 296.954 ms (5%) | | 660.17 KiB (1%) | 12982 |
| ["findfirst", "n=400", "reduce", "basesize=512"] | 314.824 ms (5%) | | 387.03 KiB (1%) | 7601 |
| ["findfirst", "n=500", "foldl"] | 90.234 ms (5%) | | | |
| ["findfirst", "n=500", "reduce", "basesize=128"] | 298.223 ms (5%) | | 1.19 MiB (1%) | 23557 |
| ["findfirst", "n=500", "reduce", "basesize=256"] | 165.749 ms (5%) | | 403.36 KiB (1%) | 7798 |
| ["findfirst", "n=500", "reduce", "basesize=512"] | 160.322 ms (5%) | | 237.13 KiB (1%) | 4567 |
| ["parallel_histogram", "assoc", "basesize=16384"] | 5.167 ms (5%) | | 732.25 KiB (1%) | 110 |
| ["parallel_histogram", "assoc", "basesize=4096"] | 6.174 ms (5%) | | 1.80 MiB (1%) | 540 |
| ["parallel_histogram", "assoc", "basesize=8192"] | 5.691 ms (5%) | | 1.43 MiB (1%) | 262 |
| ["parallel_histogram", "comm", "basesize=16384"] | 14.420 ms (5%) | | 1.22 MiB (1%) | 257 |
| ["parallel_histogram", "comm", "basesize=4096"] | 20.833 ms (5%) | | 1.03 MiB (1%) | 4091 |
| ["parallel_histogram", "comm", "basesize=8192"] | 19.152 ms (5%) | | 1.24 MiB (1%) | 1261 |
| ["parallel_histogram", "seq"] | 9.590 ms (5%) | | 364.63 KiB (1%) | 25 |
| ["unordered", "collect"] | 467.035 ms (5%) | | 513.00 KiB (1%) | 23 |
| ["unordered", "unordered", "basesize=1"] | 550.168 ms (5%) | | 30.26 MiB (1%) | 475091 |
| ["unordered", "unordered", "basesize=1024"] | 361.595 ms (5%) | | 1019.53 KiB (1%) | 18320 |
| ["unordered", "unordered", "basesize=32"] | 277.901 ms (5%) | | 1.56 MiB (1%) | 22649 |
| ["words", "nthreads=1"] | 40.021 ms (5%) | 6.915 ms | 64.91 MiB (1%) | 2100564 |
| ["words", "nthreads=2"] | 24.556 ms (5%) | | 65.63 MiB (1%) | 2100725 |
| ["words", "nthreads=4"] | 26.756 ms (5%) | | 66.08 MiB (1%) | 2100881 |

Runtime information

| Runtime Info | |
|---|---|
| BLAS #threads | 2 |
| BLAS.vendor() | openblas64 |
| Sys.CPU_THREADS | 2 |

lscpu output:

Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              2
On-line CPU(s) list: 0,1
Thread(s) per core:  1
Core(s) per socket:  2
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               79
Model name:          Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
Stepping:            1
CPU MHz:             2294.686
BogoMIPS:            4589.37
Hypervisor vendor:   Microsoft
Virtualization type: full
L1d cache:           32K
L1i cache:           32K
L2 cache:            256K
L3 cache:            51200K
NUMA node0 CPU(s):   0,1
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx smap xsaveopt md_clear

| Cpu Property | Value |
|---|---|
| Brand | Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz |
| Vendor | :Intel |
| Architecture | :Broadwell |
| Model | Family: 0x06, Model: 0x4f, Stepping: 0x01, Type: 0x00 |
| Cores | 2 physical cores, 2 logical cores (on executing CPU) |
| | No Hyperthreading detected |
| Clock Frequencies | Not supported by CPU |
| Data Cache | Level 1:3 : (32, 256, 51200) kbytes |
| | 64 byte cache line size |
| Address Size | 48 bits virtual, 44 bits physical |
| SIMD | 256 bit = 32 byte max. SIMD vector size |
| Time Stamp Counter | TSC is accessible via rdtsc |
| | TSC increased at every clock cycle (non-invariant TSC) |
| Perf. Monitoring | Performance Monitoring Counters (PMC) are not supported |
| Hypervisor | Yes, Microsoft |

@github-actions (Contributor) commented

Benchmark result

Judge result

Benchmark Report for /home/runner/work/Transducers.jl/Transducers.jl

Job Properties

  • Time of benchmarks:
    • Target: 19 Jan 2020 - 22:58
    • Baseline: 19 Jan 2020 - 23:01
  • Package commits:
    • Target: 54e0c3
    • Baseline: 149fe1
  • Julia commits:
    • Target: 2d5741
    • Baseline: 2d5741
  • Julia command flags:
    • Target: None
    • Baseline: None
  • Environment variables:
    • Target: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 1
    • Baseline: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 1

Results

A ratio greater than 1.0 denotes a possible regression (marked with ❌), while a ratio less
than 1.0 denotes a possible improvement (marked with ✅). Only significant results - results
that indicate possible regressions or improvements - are shown below (thus, an empty table means that all
benchmark results remained invariant between builds).

| ID | time ratio | memory ratio |
|----|------------|--------------|
| ["cat", "base"] | 0.94 (5%) ✅ | 1.00 (1%) |
| ["cat", "xf"] | 1.06 (5%) ❌ | 1.00 (1%) |
| ["collect", "identity-union"] | 1.07 (5%) ❌ | 1.00 (1%) |
| ["gemm", "mul", "linalg", "8"] | 0.74 (5%) ✅ | 1.00 (1%) |
| ["gemm", "mul", "man", "false", "32"] | 0.89 (5%) ✅ | 1.00 (1%) |
| ["gemm", "mul", "man", "false", "8"] | 0.75 (5%) ✅ | 1.00 (1%) |
| ["gemm", "mul", "man", "ivdep", "8"] | 0.92 (5%) ✅ | 1.00 (1%) |
| ["gemm", "mul", "man", "true", "8"] | 1.36 (5%) ❌ | 1.00 (1%) |
| ["gemm", "mul", "xf", "false", "8"] | 0.84 (5%) ✅ | 1.00 (1%) |
| ["gemm", "mul", "xf", "ivdep", "8"] | 0.92 (5%) ✅ | 1.00 (1%) |
| ["gemm", "mul", "xf", "true", "8"] | 1.23 (5%) ❌ | 1.00 (1%) |
| ["missing_argmax", "rf"] | 0.95 (5%) ✅ | 1.00 (1%) |
| ["missing_argmax", "xf"] | 0.92 (5%) ✅ | 1.00 (1%) |
| ["missing_dot", "equiv"] | 1.11 (5%) ❌ | 1.00 (1%) |
| ["missing_dot", "man"] | 0.91 (5%) ✅ | 1.00 (1%) |
| ["missing_dot", "rf_nota"] | 0.91 (5%) ✅ | 1.00 (1%) |

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["cat"]
  • ["collect"]
  • ["dot"]
  • ["filter_map_map!"]
  • ["filter_map_reduce"]
  • ["gemm", "fusedmul", "blas"]
  • ["gemm", "fusedmul", "xf"]
  • ["gemm", "mul", "linalg"]
  • ["gemm", "mul", "man", "false"]
  • ["gemm", "mul", "man", "ivdep"]
  • ["gemm", "mul", "man", "true"]
  • ["gemm", "mul", "xf", "false"]
  • ["gemm", "mul", "xf", "ivdep"]
  • ["gemm", "mul", "xf", "true"]
  • ["missing_argmax"]
  • ["missing_dot"]
  • ["partition_by"]

Julia versioninfo

Target

Julia Version 1.3.1
Commit 2d5741174c (2019-12-30 21:36 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.3 LTS
  uname: Linux 5.0.0-1027-azure #29~18.04.1-Ubuntu SMP Mon Nov 25 21:18:57 UTC 2019 x86_64 x86_64
  CPU: Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz: 
              speed         user         nice          sys         idle          irq
       #1  2294 MHz      14495 s          0 s       1250 s      39548 s          0 s
       #2  2294 MHz      38758 s          0 s        950 s      14891 s          0 s
       
  Memory: 6.782924652099609 GB (3516.8203125 MB free)
  Uptime: 564.0 sec
  Load Avg:  1.03564453125  0.93359375  0.54443359375
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, broadwell)

Baseline

Julia Version 1.3.1
Commit 2d5741174c (2019-12-30 21:36 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.3 LTS
  uname: Linux 5.0.0-1027-azure #29~18.04.1-Ubuntu SMP Mon Nov 25 21:18:57 UTC 2019 x86_64 x86_64
  CPU: Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz: 
              speed         user         nice          sys         idle          irq
       #1  2294 MHz      20167 s          0 s       1353 s      57195 s          0 s
       #2  2294 MHz      56659 s          0 s       1078 s      20331 s          0 s
       
  Memory: 6.782924652099609 GB (3536.46875 MB free)
  Uptime: 799.0 sec
  Load Avg:  1.0205078125  0.99169921875  0.65966796875
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, broadwell)

Target result

Benchmark Report for /home/runner/work/Transducers.jl/Transducers.jl

Job Properties

  • Time of benchmark: 19 Jan 2020 - 22:58
  • Package commit: 54e0c3
  • Julia commit: 2d5741
  • Julia command flags: None
  • Environment variables: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 1

Results

Below is a table of this job's results, obtained by running the benchmarks.
The values listed in the ID column have the structure [parent_group, child_group, ..., key], and can be used to
index into the BaseBenchmarks suite to retrieve the corresponding benchmarks.
The percentages accompanying time and memory values in the below table are noise tolerances. The "true"
time/memory value for a given benchmark is expected to fall within this percentage of the reported value.
An empty cell means that the value was zero.

| ID | time | GC time | memory | allocations |
|----|------|---------|--------|-------------|
| ["cat", "base"] | 210.301 μs (5%) | | | |
| ["cat", "xf"] | 1.550 μs (5%) | | | |
| ["collect", "filter-missing"] | 82.500 μs (5%) | | 33.05 KiB (1%) | 20 |
| ["collect", "identity-float"] | 65.000 μs (5%) | | 256.91 KiB (1%) | 20 |
| ["collect", "identity-union"] | 333.501 μs (5%) | | 285.88 KiB (1%) | 6683 |
| ["dot", "blas"] | 2.278 μs (5%) | | | |
| ["dot", "man"] | 2.278 μs (5%) | | | |
| ["dot", "rf"] | 2.667 μs (5%) | | | |
| ["dot", "xf"] | 2.678 μs (5%) | | | |
| ["filter_map_map!", "man"] | 68.301 μs (5%) | | | |
| ["filter_map_map!", "xf"] | 70.401 μs (5%) | | 144 bytes (1%) | 8 |
| ["filter_map_reduce", "man"] | 195.000 μs (5%) | | | |
| ["filter_map_reduce", "xf"] | 195.000 μs (5%) | | | |
| ["gemm", "fusedmul", "blas", "16"] | 5.800 ms (5%) | | | |
| ["gemm", "fusedmul", "blas", "2"] | 4.049 ms (5%) | | | |
| ["gemm", "fusedmul", "blas", "32"] | 8.357 ms (5%) | | | |
| ["gemm", "fusedmul", "blas", "8"] | 4.359 ms (5%) | | | |
| ["gemm", "fusedmul", "xf", "16"] | 5.482 ms (5%) | | 160 bytes (1%) | 6 |
| ["gemm", "fusedmul", "xf", "2"] | 652.406 μs (5%) | | 160 bytes (1%) | 6 |
| ["gemm", "fusedmul", "xf", "32"] | 11.140 ms (5%) | | 160 bytes (1%) | 6 |
| ["gemm", "fusedmul", "xf", "8"] | 2.742 ms (5%) | | 160 bytes (1%) | 6 |
| ["gemm", "mul", "linalg", "256"] | 1.379 ms (5%) | | | |
| ["gemm", "mul", "linalg", "32"] | 3.729 μs (5%) | | | |
| ["gemm", "mul", "linalg", "8"] | 295.434 ns (5%) | | | |
| ["gemm", "mul", "man", "false", "256"] | 4.901 ms (5%) | | | |
| ["gemm", "mul", "man", "false", "32"] | 7.133 μs (5%) | | | |
| ["gemm", "mul", "man", "false", "8"] | 375.500 ns (5%) | | | |
| ["gemm", "mul", "man", "ivdep", "256"] | 4.870 ms (5%) | | | |
| ["gemm", "mul", "man", "ivdep", "32"] | 6.280 μs (5%) | | | |
| ["gemm", "mul", "man", "ivdep", "8"] | 368.599 ns (5%) | | | |
| ["gemm", "mul", "man", "true", "256"] | 4.926 ms (5%) | | | |
| ["gemm", "mul", "man", "true", "32"] | 7.467 μs (5%) | | | |
| ["gemm", "mul", "man", "true", "8"] | 408.081 ns (5%) | | | |
| ["gemm", "mul", "xf", "false", "256"] | 4.881 ms (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "false", "32"] | 6.920 μs (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "false", "8"] | 420.812 ns (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "ivdep", "256"] | 4.853 ms (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "ivdep", "32"] | 5.720 μs (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "ivdep", "8"] | 367.157 ns (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "true", "256"] | 4.929 ms (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "true", "32"] | 6.900 μs (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "true", "8"] | 368.473 ns (5%) | | 48 bytes (1%) | 2 |
| ["missing_argmax", "man"] | 950.050 ns (5%) | | 32 bytes (1%) | 1 |
| ["missing_argmax", "rf"] | 2.222 μs (5%) | | 32 bytes (1%) | 1 |
| ["missing_argmax", "xf"] | 2.189 μs (5%) | | 32 bytes (1%) | 1 |
| ["missing_dot", "equiv"] | 1.470 μs (5%) | | 16 bytes (1%) | 1 |
| ["missing_dot", "man"] | 993.939 ns (5%) | | 16 bytes (1%) | 1 |
| ["missing_dot", "naive"] | 4.314 μs (5%) | | 16 bytes (1%) | 1 |
| ["missing_dot", "rf"] | 1.000 μs (5%) | | 16 bytes (1%) | 1 |
| ["missing_dot", "rf_nota"] | 1.380 μs (5%) | | 16 bytes (1%) | 1 |
| ["missing_dot", "xf"] | 189.101 μs (5%) | | 74.14 KiB (1%) | 3867 |
| ["missing_dot", "xf_nota"] | 189.501 μs (5%) | | 74.14 KiB (1%) | 3868 |
| ["partition_by", "man"] | 2.033 ms (5%) | | 352 bytes (1%) | 4 |
| ["partition_by", "xf"] | 1.990 ms (5%) | | 576 bytes (1%) | 7 |

Baseline result

Benchmark Report for /home/runner/work/Transducers.jl/Transducers.jl

Job Properties

  • Time of benchmark: 19 Jan 2020 - 23:01
  • Package commit: 149fe1
  • Julia commit: 2d5741
  • Julia command flags: None
  • Environment variables: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 1

Results

Below is a table of this job's results, obtained by running the benchmarks.
The values listed in the ID column have the structure [parent_group, child_group, ..., key], and can be used to
index into the BaseBenchmarks suite to retrieve the corresponding benchmarks.
The percentages accompanying time and memory values in the below table are noise tolerances. The "true"
time/memory value for a given benchmark is expected to fall within this percentage of the reported value.
An empty cell means that the value was zero.

| ID | time | GC time | memory | allocations |
|----|------|---------|--------|-------------|
| ["cat", "base"] | 222.701 μs (5%) | | | |
| ["cat", "xf"] | 1.460 μs (5%) | | | |
| ["collect", "filter-missing"] | 85.400 μs (5%) | | 33.05 KiB (1%) | 20 |
| ["collect", "identity-float"] | 65.700 μs (5%) | | 256.91 KiB (1%) | 20 |
| ["collect", "identity-union"] | 311.002 μs (5%) | | 285.72 KiB (1%) | 6700 |
| ["dot", "blas"] | 2.289 μs (5%) | | | |
| ["dot", "man"] | 2.289 μs (5%) | | | |
| ["dot", "rf"] | 2.667 μs (5%) | | | |
| ["dot", "xf"] | 2.678 μs (5%) | | | |
| ["filter_map_map!", "man"] | 69.100 μs (5%) | | | |
| ["filter_map_map!", "xf"] | 70.701 μs (5%) | | 144 bytes (1%) | 8 |
| ["filter_map_reduce", "man"] | 195.001 μs (5%) | | | |
| ["filter_map_reduce", "xf"] | 194.901 μs (5%) | | | |
| ["gemm", "fusedmul", "blas", "16"] | 5.904 ms (5%) | | | |
| ["gemm", "fusedmul", "blas", "2"] | 4.109 ms (5%) | | | |
| ["gemm", "fusedmul", "blas", "32"] | 8.497 ms (5%) | | | |
| ["gemm", "fusedmul", "blas", "8"] | 4.365 ms (5%) | | | |
| ["gemm", "fusedmul", "xf", "16"] | 5.605 ms (5%) | | 160 bytes (1%) | 6 |
| ["gemm", "fusedmul", "xf", "2"] | 647.003 μs (5%) | | 160 bytes (1%) | 6 |
| ["gemm", "fusedmul", "xf", "32"] | 11.228 ms (5%) | | 160 bytes (1%) | 6 |
| ["gemm", "fusedmul", "xf", "8"] | 2.741 ms (5%) | | 160 bytes (1%) | 6 |
| ["gemm", "mul", "linalg", "256"] | 1.401 ms (5%) | | | |
| ["gemm", "mul", "linalg", "32"] | 3.800 μs (5%) | | | |
| ["gemm", "mul", "linalg", "8"] | 400.000 ns (5%) | | | |
| ["gemm", "mul", "man", "false", "256"] | 4.969 ms (5%) | | | |
| ["gemm", "mul", "man", "false", "32"] | 8.000 μs (5%) | | | |
| ["gemm", "mul", "man", "false", "8"] | 500.000 ns (5%) | | | |
| ["gemm", "mul", "man", "ivdep", "256"] | 4.914 ms (5%) | | | |
| ["gemm", "mul", "man", "ivdep", "32"] | 6.300 μs (5%) | | | |
| ["gemm", "mul", "man", "ivdep", "8"] | 400.000 ns (5%) | | | |
| ["gemm", "mul", "man", "true", "256"] | 4.923 ms (5%) | | | |
| ["gemm", "mul", "man", "true", "32"] | 7.400 μs (5%) | | | |
| ["gemm", "mul", "man", "true", "8"] | 300.000 ns (5%) | | | |
| ["gemm", "mul", "xf", "false", "256"] | 4.928 ms (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "false", "32"] | 7.000 μs (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "false", "8"] | 500.000 ns (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "ivdep", "256"] | 4.910 ms (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "ivdep", "32"] | 5.700 μs (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "ivdep", "8"] | 400.000 ns (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "true", "256"] | 4.912 ms (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "true", "32"] | 7.000 μs (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "true", "8"] | 300.000 ns (5%) | | 48 bytes (1%) | 2 |
| ["missing_argmax", "man"] | 955.000 ns (5%) | | 32 bytes (1%) | 1 |
| ["missing_argmax", "rf"] | 2.344 μs (5%) | | 32 bytes (1%) | 1 |
| ["missing_argmax", "xf"] | 2.367 μs (5%) | | 32 bytes (1%) | 1 |
| ["missing_dot", "equiv"] | 1.320 μs (5%) | | 16 bytes (1%) | 1 |
| ["missing_dot", "man"] | 1.088 μs (5%) | | 16 bytes (1%) | 1 |
| ["missing_dot", "naive"] | 4.429 μs (5%) | | 16 bytes (1%) | 1 |
| ["missing_dot", "rf"] | 980.808 ns (5%) | | 16 bytes (1%) | 1 |
| ["missing_dot", "rf_nota"] | 1.520 μs (5%) | | 16 bytes (1%) | 1 |
| ["missing_dot", "xf"] | 196.202 μs (5%) | | 74.08 KiB (1%) | 3865 |
| ["missing_dot", "xf_nota"] | 197.101 μs (5%) | | 73.77 KiB (1%) | 3857 |
| ["partition_by", "man"] | 2.092 ms (5%) | | 352 bytes (1%) | 4 |
| ["partition_by", "xf"] | 2.051 ms (5%) | | 576 bytes (1%) | 7 |

Runtime information

| Runtime Info | |
|---|---|
| BLAS #threads | 2 |
| BLAS.vendor() | openblas64 |
| Sys.CPU_THREADS | 2 |

lscpu output:

Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              2
On-line CPU(s) list: 0,1
Thread(s) per core:  1
Core(s) per socket:  2
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               79
Model name:          Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
Stepping:            1
CPU MHz:             2294.685
BogoMIPS:            4589.37
Hypervisor vendor:   Microsoft
Virtualization type: full
L1d cache:           32K
L1i cache:           32K
L2 cache:            256K
L3 cache:            51200K
NUMA node0 CPU(s):   0,1
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx smap xsaveopt md_clear

| Cpu Property | Value |
|---|---|
| Brand | Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz |
| Vendor | :Intel |
| Architecture | :Broadwell |
| Model | Family: 0x06, Model: 0x4f, Stepping: 0x01, Type: 0x00 |
| Cores | 2 physical cores, 2 logical cores (on executing CPU) |
| | No Hyperthreading detected |
| Clock Frequencies | Not supported by CPU |
| Data Cache | Level 1:3 : (32, 256, 51200) kbytes |
| | 64 byte cache line size |
| Address Size | 48 bits virtual, 44 bits physical |
| SIMD | 256 bit = 32 byte max. SIMD vector size |
| Time Stamp Counter | TSC is accessible via rdtsc |
| | TSC increased at every clock cycle (non-invariant TSC) |
| Perf. Monitoring | Performance Monitoring Counters (PMC) are not supported |
| Hypervisor | Yes, Microsoft |

@github-actions (Contributor) commented

Multi-thread benchmark result

Judge result

Benchmark Report for /home/runner/work/Transducers.jl/Transducers.jl

Job Properties

  • Time of benchmarks:
    • Target: 19 Jan 2020 - 23:40
    • Baseline: 19 Jan 2020 - 23:44
  • Package commits:
    • Target: 5cb3ae
    • Baseline: 149fe1
  • Julia commits:
    • Target: 2d5741
    • Baseline: 2d5741
  • Julia command flags:
    • Target: None
    • Baseline: None
  • Environment variables:
    • Target: JULIA_NUM_THREADS => 2
    • Baseline: JULIA_NUM_THREADS => 2

Results

A ratio greater than 1.0 denotes a possible regression (marked with ❌), while a ratio less
than 1.0 denotes a possible improvement (marked with ✅). Only significant results - results
that indicate possible regressions or improvements - are shown below (thus, an empty table means that all
benchmark results remained invariant between builds).

| ID | time ratio | memory ratio |
| --- | --- | --- |
| ["findfirst", "n=1000", "reduce", "basesize=128"] | 0.95 (5%) ✅ | 1.01 (1%) |
| ["findfirst", "n=1000", "reduce", "basesize=256"] | 1.07 (5%) ❌ | 0.98 (1%) ✅ |
| ["findfirst", "n=1000", "reduce", "basesize=512"] | 1.01 (5%) | 1.05 (1%) ❌ |
| ["findfirst", "n=400", "reduce", "basesize=128"] | 0.96 (5%) | 0.98 (1%) ✅ |
| ["findfirst", "n=400", "reduce", "basesize=512"] | 1.03 (5%) | 1.08 (1%) ❌ |
| ["findfirst", "n=500", "reduce", "basesize=128"] | 0.82 (5%) ✅ | 0.88 (1%) ✅ |
| ["findfirst", "n=500", "reduce", "basesize=256"] | 2.30 (5%) ❌ | 1.50 (1%) ❌ |
| ["findfirst", "n=500", "reduce", "basesize=512"] | 0.47 (5%) ✅ | 0.57 (1%) ✅ |
| ["parallel_histogram", "comm", "basesize=4096"] | 0.89 (5%) ✅ | 1.03 (1%) ❌ |
| ["parallel_histogram", "comm", "basesize=8192"] | 1.11 (5%) ❌ | 1.00 (1%) |
| ["unordered", "unordered", "basesize=1024"] | 0.92 (5%) ✅ | 0.92 (1%) ✅ |
| ["words", "nthreads=2"] | 1.07 (5%) ❌ | 0.99 (1%) ✅ |
| ["words", "nthreads=4"] | 1.05 (5%) ❌ | 0.99 (1%) |
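
The time ratios in the table above can be checked against the target and baseline times listed further down in this comment. The following is a minimal sketch, assuming the judgement divides the target time by the baseline time and applies a symmetric tolerance band around 1.0, as the (5%) annotations suggest. `classify` is a hypothetical helper written for this illustration; it is not a BenchmarkTools or PkgBenchmark function.

```julia
# Hypothetical helper (assumption): classify a target/baseline pair using a
# symmetric tolerance band around a ratio of 1.0.
function classify(target, baseline; tolerance = 0.05)
    ratio = target / baseline
    verdict = if ratio > 1 + tolerance
        :regression
    elseif ratio < 1 - tolerance
        :improvement
    else
        :invariant
    end
    return (ratio = round(ratio; digits = 2), verdict = verdict)
end

# Times in ms, copied from the target and baseline result tables
# for ["findfirst", "n=1000", "reduce", ...].
basesize128 = classify(796.143, 841.719)  # ratio 0.95, improvement
basesize256 = classify(599.055, 562.471)  # ratio 1.07, regression
basesize512 = classify(720.747, 711.928)  # ratio 1.01, invariant
```

Under this assumption the three verdicts agree with the ✅, ❌ and unmarked entries in the time ratio column for those rows.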

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["findfirst", "n=1000"]
  • ["findfirst", "n=1000", "reduce"]
  • ["findfirst", "n=400"]
  • ["findfirst", "n=400", "reduce"]
  • ["findfirst", "n=500"]
  • ["findfirst", "n=500", "reduce"]
  • ["parallel_histogram", "assoc"]
  • ["parallel_histogram", "comm"]
  • ["parallel_histogram"]
  • ["unordered"]
  • ["unordered", "unordered"]
  • ["words"]

Julia versioninfo

Target

Julia Version 1.3.1
Commit 2d5741174c (2019-12-30 21:36 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.3 LTS
  uname: Linux 5.0.0-1027-azure #29~18.04.1-Ubuntu SMP Mon Nov 25 21:18:57 UTC 2019 x86_64 x86_64
  CPU: Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz: 
              speed         user         nice          sys         idle          irq
       #1  2397 MHz      35822 s          0 s       1690 s      13041 s          0 s
       #2  2397 MHz      29354 s          0 s       1642 s      18759 s          0 s
       
  Memory: 6.782737731933594 GB (3653.85546875 MB free)
  Uptime: 557.0 sec
  Load Avg:  1.7587890625  1.4482421875  0.80029296875
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, haswell)

Baseline

Julia Version 1.3.1
Commit 2d5741174c (2019-12-30 21:36 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.3 LTS
  uname: Linux 5.0.0-1027-azure #29~18.04.1-Ubuntu SMP Mon Nov 25 21:18:57 UTC 2019 x86_64 x86_64
  CPU: Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz: 
              speed         user         nice          sys         idle          irq
       #1  2397 MHz      49808 s          0 s       2017 s      18500 s          0 s
       #2  2397 MHz      44622 s          0 s       1995 s      22853 s          0 s
       
  Memory: 6.782737731933594 GB (3666.640625 MB free)
  Uptime: 756.0 sec
  Load Avg:  1.56689453125  1.4853515625  0.9462890625
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, haswell)

Target result

Benchmark Report for /home/runner/work/Transducers.jl/Transducers.jl

Job Properties

  • Time of benchmark: 19 Jan 2020 - 23:40
  • Package commit: 5cb3ae
  • Julia commit: 2d5741
  • Julia command flags: None
  • Environment variables: JULIA_NUM_THREADS => 2

Results

Below is a table of this job's results, obtained by running the benchmarks.
The values listed in the ID column have the structure [parent_group, child_group, ..., key], and can be used to
index into the BaseBenchmarks suite to retrieve the corresponding benchmarks.
The percentages accompanying time and memory values in the below table are noise tolerances. The "true"
time/memory value for a given benchmark is expected to fall within this percentage of the reported value.
An empty cell means that the value was zero.

| ID | time | GC time | memory | allocations |
| --- | --- | --- | --- | --- |
| ["findfirst", "n=1000", "foldl"] | 773.461 ms (5%) | | | |
| ["findfirst", "n=1000", "reduce", "basesize=128"] | 796.143 ms (5%) | | 1.30 MiB (1%) | 26302 |
| ["findfirst", "n=1000", "reduce", "basesize=256"] | 599.055 ms (5%) | | 542.69 KiB (1%) | 10696 |
| ["findfirst", "n=1000", "reduce", "basesize=512"] | 720.747 ms (5%) | | 371.03 KiB (1%) | 7266 |
| ["findfirst", "n=400", "foldl"] | 578.641 ms (5%) | | | |
| ["findfirst", "n=400", "reduce", "basesize=128"] | 321.306 ms (5%) | | 1.19 MiB (1%) | 24296 |
| ["findfirst", "n=400", "reduce", "basesize=256"] | 328.212 ms (5%) | | 669.42 KiB (1%) | 13346 |
| ["findfirst", "n=400", "reduce", "basesize=512"] | 371.484 ms (5%) | | 419.14 KiB (1%) | 8303 |
| ["findfirst", "n=500", "foldl"] | 98.896 ms (5%) | | | |
| ["findfirst", "n=500", "reduce", "basesize=128"] | 201.712 ms (5%) | | 810.58 KiB (1%) | 15808 |
| ["findfirst", "n=500", "reduce", "basesize=256"] | 425.520 ms (5%) | | 615.33 KiB (1%) | 11956 |
| ["findfirst", "n=500", "reduce", "basesize=512"] | 163.080 ms (5%) | | 210.31 KiB (1%) | 4102 |
| ["parallel_histogram", "assoc", "basesize=16384"] | 5.601 ms (5%) | | 732.22 KiB (1%) | 109 |
| ["parallel_histogram", "assoc", "basesize=4096"] | 6.387 ms (5%) | | 1.80 MiB (1%) | 539 |
| ["parallel_histogram", "assoc", "basesize=8192"] | 5.993 ms (5%) | | 1.43 MiB (1%) | 261 |
| ["parallel_histogram", "comm", "basesize=16384"] | 13.339 ms (5%) | | 1.22 MiB (1%) | 285 |
| ["parallel_histogram", "comm", "basesize=4096"] | 18.106 ms (5%) | | 1.09 MiB (1%) | 2532 |
| ["parallel_histogram", "comm", "basesize=8192"] | 16.726 ms (5%) | | 1.23 MiB (1%) | 876 |
| ["parallel_histogram", "seq"] | 10.299 ms (5%) | | 364.63 KiB (1%) | 25 |
| ["unordered", "collect"] | 563.577 ms (5%) | | 513.00 KiB (1%) | 23 |
| ["unordered", "unordered", "basesize=1"] | 563.518 ms (5%) | 9.048 ms | 30.26 MiB (1%) | 475066 |
| ["unordered", "unordered", "basesize=1024"] | 387.970 ms (5%) | | 908.20 KiB (1%) | 11240 |
| ["unordered", "unordered", "basesize=32"] | 320.271 ms (5%) | | 1.57 MiB (1%) | 23341 |
| ["words", "nthreads=1"] | 45.016 ms (5%) | 8.607 ms | 64.49 MiB (1%) | 2087319 |
| ["words", "nthreads=2"] | 24.811 ms (5%) | | 64.85 MiB (1%) | 2087399 |
| ["words", "nthreads=4"] | 25.456 ms (5%) | | 65.57 MiB (1%) | 2087563 |

Baseline result

Benchmark Report for /home/runner/work/Transducers.jl/Transducers.jl

Job Properties

  • Time of benchmark: 19 Jan 2020 - 23:44
  • Package commit: 149fe1
  • Julia commit: 2d5741
  • Julia command flags: None
  • Environment variables: JULIA_NUM_THREADS => 2

Results

Below is a table of this job's results, obtained by running the benchmarks.
The values listed in the ID column have the structure [parent_group, child_group, ..., key], and can be used to
index into the BaseBenchmarks suite to retrieve the corresponding benchmarks.
The percentages accompanying time and memory values in the below table are noise tolerances. The "true"
time/memory value for a given benchmark is expected to fall within this percentage of the reported value.
An empty cell means that the value was zero.

| ID | time | GC time | memory | allocations |
| --- | --- | --- | --- | --- |
| ["findfirst", "n=1000", "foldl"] | 768.183 ms (5%) | | | |
| ["findfirst", "n=1000", "reduce", "basesize=128"] | 841.719 ms (5%) | | 1.29 MiB (1%) | 25854 |
| ["findfirst", "n=1000", "reduce", "basesize=256"] | 562.471 ms (5%) | | 555.25 KiB (1%) | 10823 |
| ["findfirst", "n=1000", "reduce", "basesize=512"] | 711.928 ms (5%) | | 352.50 KiB (1%) | 6865 |
| ["findfirst", "n=400", "foldl"] | 575.130 ms (5%) | | | |
| ["findfirst", "n=400", "reduce", "basesize=128"] | 334.161 ms (5%) | | 1.22 MiB (1%) | 24512 |
| ["findfirst", "n=400", "reduce", "basesize=256"] | 325.026 ms (5%) | | 668.33 KiB (1%) | 13163 |
| ["findfirst", "n=400", "reduce", "basesize=512"] | 360.256 ms (5%) | | 387.08 KiB (1%) | 7604 |
| ["findfirst", "n=500", "foldl"] | 98.470 ms (5%) | | | |
| ["findfirst", "n=500", "reduce", "basesize=128"] | 245.515 ms (5%) | | 923.08 KiB (1%) | 17868 |
| ["findfirst", "n=500", "reduce", "basesize=256"] | 185.327 ms (5%) | | 411.08 KiB (1%) | 7950 |
| ["findfirst", "n=500", "reduce", "basesize=512"] | 345.829 ms (5%) | | 366.48 KiB (1%) | 7088 |
| ["parallel_histogram", "assoc", "basesize=16384"] | 5.591 ms (5%) | | 732.25 KiB (1%) | 110 |
| ["parallel_histogram", "assoc", "basesize=4096"] | 6.380 ms (5%) | | 1.80 MiB (1%) | 541 |
| ["parallel_histogram", "assoc", "basesize=8192"] | 5.984 ms (5%) | | 1.43 MiB (1%) | 261 |
| ["parallel_histogram", "comm", "basesize=16384"] | 13.336 ms (5%) | | 1.22 MiB (1%) | 263 |
| ["parallel_histogram", "comm", "basesize=4096"] | 20.338 ms (5%) | | 1.06 MiB (1%) | 5511 |
| ["parallel_histogram", "comm", "basesize=8192"] | 15.042 ms (5%) | | 1.23 MiB (1%) | 645 |
| ["parallel_histogram", "seq"] | 10.388 ms (5%) | | 364.63 KiB (1%) | 25 |
| ["unordered", "collect"] | 558.886 ms (5%) | | 513.00 KiB (1%) | 23 |
| ["unordered", "unordered", "basesize=1"] | 561.537 ms (5%) | 9.716 ms | 30.26 MiB (1%) | 475224 |
| ["unordered", "unordered", "basesize=1024"] | 423.333 ms (5%) | | 989.63 KiB (1%) | 16406 |
| ["unordered", "unordered", "basesize=32"] | 315.741 ms (5%) | | 1.57 MiB (1%) | 23172 |
| ["words", "nthreads=1"] | 45.857 ms (5%) | 7.715 ms | 64.81 MiB (1%) | 2097422 |
| ["words", "nthreads=2"] | 23.253 ms (5%) | | 65.53 MiB (1%) | 2097584 |
| ["words", "nthreads=4"] | 24.164 ms (5%) | | 66.17 MiB (1%) | 2097893 |

Runtime information

| Runtime Info | |
| --- | --- |
| BLAS #threads | 2 |
| BLAS.vendor() | openblas64 |
| Sys.CPU_THREADS | 2 |

lscpu output:

Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              2
On-line CPU(s) list: 0,1
Thread(s) per core:  1
Core(s) per socket:  2
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               63
Model name:          Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
Stepping:            2
CPU MHz:             2397.227
BogoMIPS:            4794.45
Hypervisor vendor:   Microsoft
Virtualization type: full
L1d cache:           32K
L1i cache:           32K
L2 cache:            256K
L3 cache:            30720K
NUMA node0 CPU(s):   0,1
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm invpcid_single pti fsgsbase bmi1 avx2 smep bmi2 erms invpcid xsaveopt md_clear
| Cpu Property | Value |
| --- | --- |
| Brand | Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz |
| Vendor | :Intel |
| Architecture | :Haswell |
| Model | Family: 0x06, Model: 0x3f, Stepping: 0x02, Type: 0x00 |
| Cores | 2 physical cores, 2 logical cores (on executing CPU) |
| | No Hyperthreading detected |
| Clock Frequencies | Not supported by CPU |
| Data Cache | Level 1:3 : (32, 256, 30720) kbytes |
| | 64 byte cache line size |
| Address Size | 48 bits virtual, 44 bits physical |
| SIMD | 256 bit = 32 byte max. SIMD vector size |
| Time Stamp Counter | TSC is accessible via rdtsc |
| | TSC increased at every clock cycle (non-invariant TSC) |
| Perf. Monitoring | Performance Monitoring Counters (PMC) are not supported |
| Hypervisor | Yes, Microsoft |

@codecov-io

codecov-io commented Jan 19, 2020

Codecov Report

Merging #172 into master will increase coverage by 0.08%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master     #172      +/-   ##
==========================================
+ Coverage   93.67%   93.75%   +0.08%     
==========================================
  Files          19       19              
  Lines        1264     1265       +1     
==========================================
+ Hits         1184     1186       +2     
+ Misses         80       79       -1
| Impacted Files | Coverage Δ | |
| --- | --- | --- |
| src/reduce.jl | 93.67% <100%> (+1.77%) | ⬆️ |
| src/dreduce.jl | 100% <100%> (ø) | ⬆️ |

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 149fe18...7a6d8ad. Read the comment docs.

@github-actions
Contributor

Benchmark result

Judge result

Benchmark Report for /home/runner/work/Transducers.jl/Transducers.jl

Job Properties

  • Time of benchmarks:
    • Target: 19 Jan 2020 - 23:42
    • Baseline: 19 Jan 2020 - 23:46
  • Package commits:
    • Target: 5cb3ae
    • Baseline: 149fe1
  • Julia commits:
    • Target: 2d5741
    • Baseline: 2d5741
  • Julia command flags:
    • Target: None
    • Baseline: None
  • Environment variables:
    • Target: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 1
    • Baseline: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 1

Results

A ratio greater than 1.0 denotes a possible regression (marked with ❌), while a ratio less
than 1.0 denotes a possible improvement (marked with ✅). Only significant results - results
that indicate possible regressions or improvements - are shown below (thus, an empty table means that all
benchmark results remained invariant between builds).

| ID | time ratio | memory ratio |
| --- | --- | --- |
| ["cat", "xf"] | 1.10 (5%) ❌ | 1.00 (1%) |
| ["collect", "identity-float"] | 1.09 (5%) ❌ | 1.00 (1%) |
| ["gemm", "mul", "man", "false", "8"] | 0.73 (5%) ✅ | 1.00 (1%) |
| ["gemm", "mul", "man", "ivdep", "32"] | 0.92 (5%) ✅ | 1.00 (1%) |
| ["gemm", "mul", "man", "ivdep", "8"] | 0.92 (5%) ✅ | 1.00 (1%) |
| ["gemm", "mul", "xf", "false", "8"] | 0.84 (5%) ✅ | 1.00 (1%) |
| ["gemm", "mul", "xf", "ivdep", "32"] | 0.89 (5%) ✅ | 1.00 (1%) |
| ["gemm", "mul", "xf", "ivdep", "8"] | 0.85 (5%) ✅ | 1.00 (1%) |
| ["gemm", "mul", "xf", "true", "32"] | 0.95 (5%) ✅ | 1.00 (1%) |
| ["gemm", "mul", "xf", "true", "8"] | 0.93 (5%) ✅ | 1.00 (1%) |
| ["missing_dot", "equiv"] | 1.12 (5%) ❌ | 1.00 (1%) |
| ["missing_dot", "man"] | 0.93 (5%) ✅ | 1.00 (1%) |
| ["missing_dot", "xf"] | 1.05 (5%) ❌ | 1.00 (1%) |

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["cat"]
  • ["collect"]
  • ["dot"]
  • ["filter_map_map!"]
  • ["filter_map_reduce"]
  • ["gemm", "fusedmul", "blas"]
  • ["gemm", "fusedmul", "xf"]
  • ["gemm", "mul", "linalg"]
  • ["gemm", "mul", "man", "false"]
  • ["gemm", "mul", "man", "ivdep"]
  • ["gemm", "mul", "man", "true"]
  • ["gemm", "mul", "xf", "false"]
  • ["gemm", "mul", "xf", "ivdep"]
  • ["gemm", "mul", "xf", "true"]
  • ["missing_argmax"]
  • ["missing_dot"]
  • ["partition_by"]

Julia versioninfo

Target

Julia Version 1.3.1
Commit 2d5741174c (2019-12-30 21:36 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.3 LTS
  uname: Linux 5.0.0-1027-azure #29~18.04.1-Ubuntu SMP Mon Nov 25 21:18:57 UTC 2019 x86_64 x86_64
  CPU: Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz: 
              speed         user         nice          sys         idle          irq
       #1  2294 MHz      22732 s          0 s       1068 s      31617 s          0 s
       #2  2294 MHz      31514 s          0 s       1216 s      23798 s          0 s
       
  Memory: 6.782737731933594 GB (3525.578125 MB free)
  Uptime: 574.0 sec
  Load Avg:  1.1171875  1.01611328125  0.62109375
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, broadwell)

Baseline

Julia Version 1.3.1
Commit 2d5741174c (2019-12-30 21:36 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.3 LTS
  uname: Linux 5.0.0-1027-azure #29~18.04.1-Ubuntu SMP Mon Nov 25 21:18:57 UTC 2019 x86_64 x86_64
  CPU: Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz: 
              speed         user         nice          sys         idle          irq
       #1  2294 MHz      23495 s          0 s       1187 s      54428 s          0 s
       #2  2294 MHz      54633 s          0 s       1308 s      24364 s          0 s
       
  Memory: 6.782737731933594 GB (3530.71875 MB free)
  Uptime: 811.0 sec
  Load Avg:  1.0  1.0  0.71826171875
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, broadwell)

Target result

Benchmark Report for /home/runner/work/Transducers.jl/Transducers.jl

Job Properties

  • Time of benchmark: 19 Jan 2020 - 23:42
  • Package commit: 5cb3ae
  • Julia commit: 2d5741
  • Julia command flags: None
  • Environment variables: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 1

Results

Below is a table of this job's results, obtained by running the benchmarks.
The values listed in the ID column have the structure [parent_group, child_group, ..., key], and can be used to
index into the BaseBenchmarks suite to retrieve the corresponding benchmarks.
The percentages accompanying time and memory values in the below table are noise tolerances. The "true"
time/memory value for a given benchmark is expected to fall within this percentage of the reported value.
An empty cell means that the value was zero.

| ID | time | GC time | memory | allocations |
| --- | --- | --- | --- | --- |
| ["cat", "base"] | 210.299 μs (5%) | | | |
| ["cat", "xf"] | 1.600 μs (5%) | | | |
| ["collect", "filter-missing"] | 83.299 μs (5%) | | 33.05 KiB (1%) | 20 |
| ["collect", "identity-float"] | 71.300 μs (5%) | | 256.91 KiB (1%) | 20 |
| ["collect", "identity-union"] | 309.300 μs (5%) | | 285.45 KiB (1%) | 6678 |
| ["dot", "blas"] | 2.289 μs (5%) | | | |
| ["dot", "man"] | 2.267 μs (5%) | | | |
| ["dot", "rf"] | 2.656 μs (5%) | | | |
| ["dot", "xf"] | 2.667 μs (5%) | | | |
| ["filter_map_map!", "man"] | 68.200 μs (5%) | | | |
| ["filter_map_map!", "xf"] | 71.200 μs (5%) | | 144 bytes (1%) | 8 |
| ["filter_map_reduce", "man"] | 194.899 μs (5%) | | | |
| ["filter_map_reduce", "xf"] | 194.899 μs (5%) | | | |
| ["gemm", "fusedmul", "blas", "16"] | 5.807 ms (5%) | | | |
| ["gemm", "fusedmul", "blas", "2"] | 3.973 ms (5%) | | | |
| ["gemm", "fusedmul", "blas", "32"] | 8.326 ms (5%) | | | |
| ["gemm", "fusedmul", "blas", "8"] | 4.357 ms (5%) | | | |
| ["gemm", "fusedmul", "xf", "16"] | 5.561 ms (5%) | | 160 bytes (1%) | 6 |
| ["gemm", "fusedmul", "xf", "2"] | 645.000 μs (5%) | | 160 bytes (1%) | 6 |
| ["gemm", "fusedmul", "xf", "32"] | 11.157 ms (5%) | | 160 bytes (1%) | 6 |
| ["gemm", "fusedmul", "xf", "8"] | 2.733 ms (5%) | | 160 bytes (1%) | 6 |
| ["gemm", "mul", "linalg", "256"] | 1.388 ms (5%) | | | |
| ["gemm", "mul", "linalg", "32"] | 3.743 μs (5%) | | | |
| ["gemm", "mul", "linalg", "8"] | 289.855 ns (5%) | | | |
| ["gemm", "mul", "man", "false", "256"] | 4.911 ms (5%) | | | |
| ["gemm", "mul", "man", "false", "32"] | 7.440 μs (5%) | | | |
| ["gemm", "mul", "man", "false", "8"] | 365.500 ns (5%) | | | |
| ["gemm", "mul", "man", "ivdep", "256"] | 4.876 ms (5%) | | | |
| ["gemm", "mul", "man", "ivdep", "32"] | 6.280 μs (5%) | | | |
| ["gemm", "mul", "man", "ivdep", "8"] | 368.594 ns (5%) | | | |
| ["gemm", "mul", "man", "true", "256"] | 4.898 ms (5%) | | | |
| ["gemm", "mul", "man", "true", "32"] | 7.100 μs (5%) | | | |
| ["gemm", "mul", "man", "true", "8"] | 408.035 ns (5%) | | | |
| ["gemm", "mul", "xf", "false", "256"] | 4.859 ms (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "false", "32"] | 6.875 μs (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "false", "8"] | 421.827 ns (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "ivdep", "256"] | 4.846 ms (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "ivdep", "32"] | 5.700 μs (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "ivdep", "8"] | 340.094 ns (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "true", "256"] | 4.869 ms (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "true", "32"] | 6.625 μs (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "true", "8"] | 371.859 ns (5%) | | 48 bytes (1%) | 2 |
| ["missing_argmax", "man"] | 950.000 ns (5%) | | 32 bytes (1%) | 1 |
| ["missing_argmax", "rf"] | 2.200 μs (5%) | | 32 bytes (1%) | 1 |
| ["missing_argmax", "xf"] | 2.178 μs (5%) | | 32 bytes (1%) | 1 |
| ["missing_dot", "equiv"] | 1.390 μs (5%) | | 16 bytes (1%) | 1 |
| ["missing_dot", "man"] | 954.839 ns (5%) | | 16 bytes (1%) | 1 |
| ["missing_dot", "naive"] | 4.057 μs (5%) | | 16 bytes (1%) | 1 |
| ["missing_dot", "rf"] | 888.710 ns (5%) | | 16 bytes (1%) | 1 |
| ["missing_dot", "rf_nota"] | 1.390 μs (5%) | | 16 bytes (1%) | 1 |
| ["missing_dot", "xf"] | 193.301 μs (5%) | | 73.95 KiB (1%) | 3860 |
| ["missing_dot", "xf_nota"] | 190.600 μs (5%) | | 74.05 KiB (1%) | 3865 |
| ["partition_by", "man"] | 2.002 ms (5%) | | 352 bytes (1%) | 4 |
| ["partition_by", "xf"] | 1.865 ms (5%) | | 576 bytes (1%) | 7 |

Baseline result

Benchmark Report for /home/runner/work/Transducers.jl/Transducers.jl

Job Properties

  • Time of benchmark: 19 Jan 2020 - 23:46
  • Package commit: 149fe1
  • Julia commit: 2d5741
  • Julia command flags: None
  • Environment variables: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 1

Results

Below is a table of this job's results, obtained by running the benchmarks.
The values listed in the ID column have the structure [parent_group, child_group, ..., key], and can be used to
index into the BaseBenchmarks suite to retrieve the corresponding benchmarks.
The percentages accompanying time and memory values in the below table are noise tolerances. The "true"
time/memory value for a given benchmark is expected to fall within this percentage of the reported value.
An empty cell means that the value was zero.

| ID | time | GC time | memory | allocations |
| --- | --- | --- | --- | --- |
| ["cat", "base"] | 210.300 μs (5%) | | | |
| ["cat", "xf"] | 1.460 μs (5%) | | | |
| ["collect", "filter-missing"] | 83.800 μs (5%) | | 33.05 KiB (1%) | 20 |
| ["collect", "identity-float"] | 65.600 μs (5%) | | 256.91 KiB (1%) | 20 |
| ["collect", "identity-union"] | 305.000 μs (5%) | | 285.30 KiB (1%) | 6642 |
| ["dot", "blas"] | 2.289 μs (5%) | | | |
| ["dot", "man"] | 2.256 μs (5%) | | | |
| ["dot", "rf"] | 2.667 μs (5%) | | | |
| ["dot", "xf"] | 2.667 μs (5%) | | | |
| ["filter_map_map!", "man"] | 67.900 μs (5%) | | | |
| ["filter_map_map!", "xf"] | 70.500 μs (5%) | | 144 bytes (1%) | 8 |
| ["filter_map_reduce", "man"] | 194.900 μs (5%) | | | |
| ["filter_map_reduce", "xf"] | 194.900 μs (5%) | | | |
| ["gemm", "fusedmul", "blas", "16"] | 5.814 ms (5%) | | | |
| ["gemm", "fusedmul", "blas", "2"] | 4.048 ms (5%) | | | |
| ["gemm", "fusedmul", "blas", "32"] | 8.408 ms (5%) | | | |
| ["gemm", "fusedmul", "blas", "8"] | 4.383 ms (5%) | | | |
| ["gemm", "fusedmul", "xf", "16"] | 5.524 ms (5%) | | 160 bytes (1%) | 6 |
| ["gemm", "fusedmul", "xf", "2"] | 678.200 μs (5%) | | 160 bytes (1%) | 6 |
| ["gemm", "fusedmul", "xf", "32"] | 11.658 ms (5%) | | 160 bytes (1%) | 6 |
| ["gemm", "fusedmul", "xf", "8"] | 2.748 ms (5%) | | 160 bytes (1%) | 6 |
| ["gemm", "mul", "linalg", "256"] | 1.374 ms (5%) | | | |
| ["gemm", "mul", "linalg", "32"] | 3.800 μs (5%) | | | |
| ["gemm", "mul", "linalg", "8"] | 300.000 ns (5%) | | | |
| ["gemm", "mul", "man", "false", "256"] | 4.934 ms (5%) | | | |
| ["gemm", "mul", "man", "false", "32"] | 7.700 μs (5%) | | | |
| ["gemm", "mul", "man", "false", "8"] | 500.000 ns (5%) | | | |
| ["gemm", "mul", "man", "ivdep", "256"] | 4.901 ms (5%) | | | |
| ["gemm", "mul", "man", "ivdep", "32"] | 6.800 μs (5%) | | | |
| ["gemm", "mul", "man", "ivdep", "8"] | 400.000 ns (5%) | | | |
| ["gemm", "mul", "man", "true", "256"] | 4.892 ms (5%) | | | |
| ["gemm", "mul", "man", "true", "32"] | 7.400 μs (5%) | | | |
| ["gemm", "mul", "man", "true", "8"] | 400.000 ns (5%) | | | |
| ["gemm", "mul", "xf", "false", "256"] | 4.876 ms (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "false", "32"] | 7.000 μs (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "false", "8"] | 500.000 ns (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "ivdep", "256"] | 4.881 ms (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "ivdep", "32"] | 6.400 μs (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "ivdep", "8"] | 400.000 ns (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "true", "256"] | 4.910 ms (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "true", "32"] | 7.000 μs (5%) | | 48 bytes (1%) | 2 |
| ["gemm", "mul", "xf", "true", "8"] | 400.000 ns (5%) | | 48 bytes (1%) | 2 |
| ["missing_argmax", "man"] | 950.000 ns (5%) | | 32 bytes (1%) | 1 |
| ["missing_argmax", "rf"] | 2.200 μs (5%) | | 32 bytes (1%) | 1 |
| ["missing_argmax", "xf"] | 2.233 μs (5%) | | 32 bytes (1%) | 1 |
| ["missing_dot", "equiv"] | 1.240 μs (5%) | | 16 bytes (1%) | 1 |
| ["missing_dot", "man"] | 1.029 μs (5%) | | 16 bytes (1%) | 1 |
| ["missing_dot", "naive"] | 4.057 μs (5%) | | 16 bytes (1%) | 1 |
| ["missing_dot", "rf"] | 864.516 ns (5%) | | 16 bytes (1%) | 1 |
| ["missing_dot", "rf_nota"] | 1.360 μs (5%) | | 16 bytes (1%) | 1 |
| ["missing_dot", "xf"] | 183.800 μs (5%) | | 74.20 KiB (1%) | 3869 |
| ["missing_dot", "xf_nota"] | 189.300 μs (5%) | | 74.11 KiB (1%) | 3866 |
| ["partition_by", "man"] | 1.932 ms (5%) | | 352 bytes (1%) | 4 |
| ["partition_by", "xf"] | 1.893 ms (5%) | | 576 bytes (1%) | 7 |

Runtime information

| Runtime Info | |
| --- | --- |
| BLAS #threads | 2 |
| BLAS.vendor() | openblas64 |
| Sys.CPU_THREADS | 2 |

lscpu output:

Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              2
On-line CPU(s) list: 0,1
Thread(s) per core:  1
Core(s) per socket:  2
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               79
Model name:          Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
Stepping:            1
CPU MHz:             2294.689
BogoMIPS:            4589.37
Hypervisor vendor:   Microsoft
Virtualization type: full
L1d cache:           32K
L1i cache:           32K
L2 cache:            256K
L3 cache:            51200K
NUMA node0 CPU(s):   0,1
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx smap xsaveopt md_clear
| Cpu Property | Value |
| --- | --- |
| Brand | Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz |
| Vendor | :Intel |
| Architecture | :Broadwell |
| Model | Family: 0x06, Model: 0x4f, Stepping: 0x01, Type: 0x00 |
| Cores | 2 physical cores, 2 logical cores (on executing CPU) |
| | No Hyperthreading detected |
| Clock Frequencies | Not supported by CPU |
| Data Cache | Level 1:3 : (32, 256, 51200) kbytes |
| | 64 byte cache line size |
| Address Size | 48 bits virtual, 44 bits physical |
| SIMD | 256 bit = 32 byte max. SIMD vector size |
| Time Stamp Counter | TSC is accessible via rdtsc |
| | TSC increased at every clock cycle (non-invariant TSC) |
| Perf. Monitoring | Performance Monitoring Counters (PMC) are not supported |
| Hypervisor | Yes, Microsoft |

@tkf
Member Author

tkf commented Jan 20, 2020

> where !(Reduced <: T)

This condition is not always respected; for example, see reduced(c) in

https://github.com/tkf/Transducers.jl/blob/149fe189a171d38d962b36970d8640a0c875ceab/examples/tutorial_parallel.jl#L413-L421

Hence the commit 7a6d8ad. (But maybe it's worth reverting it in v0.5?)
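
The condition quoted above belongs to the augmented semigroup from the PR description. A minimal sketch of those three rules follows. It uses a hypothetical `Stop` wrapper and a hypothetical `augment` helper instead of the actual Transducers.jl `Reduced`, `reduced` and `unreduced`, so it illustrates the rules only and is not the implementation in this PR.

```julia
# Toy stand-in for `Reduced`; hypothetical, not the Transducers.jl type.
struct Stop{T}
    value::T
end

# Build the augmented operation *′ from a binary operation `op`,
# following the three rules in the PR description.
function augment(op)
    return function (a, b)
        a isa Stop && return a                     # *′(a::Reduced, _) = a
        b isa Stop && return Stop(op(a, b.value))  # *′(a::T, b::Reduced) = reduced(a * unreduced(b))
        return op(a, b)                            # *′(a::T, b::T) = a * b
    end
end

aug_plus = augment(+)

# Both groupings of 1 *′ Stop(2) *′ 3 give the same wrapped result.
left = aug_plus(aug_plus(1, Stop(2)), 3)
right = aug_plus(1, aug_plus(Stop(2), 3))
@assert left == right == Stop(3)
```

The sketch decides which rule applies purely by checking whether a value is wrapped. If a plain value of `T` could itself be a `Stop`, it would be indistinguishable from an early-termination signal, which is the situation the `!(Reduced <: T)` condition rules out.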

@github-actions
Contributor

Multi-thread benchmark result

Judge result

Benchmark Report for /home/runner/work/Transducers.jl/Transducers.jl

Job Properties

  • Time of benchmarks:
    • Target: 20 Jan 2020 - 00:03
    • Baseline: 20 Jan 2020 - 00:06
  • Package commits:
    • Target: 1606ff
    • Baseline: 149fe1
  • Julia commits:
    • Target: 2d5741
    • Baseline: 2d5741
  • Julia command flags:
    • Target: None
    • Baseline: None
  • Environment variables:
    • Target: JULIA_NUM_THREADS => 2
    • Baseline: JULIA_NUM_THREADS => 2

Results

A ratio greater than 1.0 denotes a possible regression (marked with ❌), while a ratio less
than 1.0 denotes a possible improvement (marked with ✅). Only significant results - results
that indicate possible regressions or improvements - are shown below (thus, an empty table means that all
benchmark results remained invariant between builds).

| ID | time ratio | memory ratio |
|---|---|---|
| `["findfirst", "n=1000", "reduce", "basesize=128"]` | 0.56 (5%) ✅ | 0.59 (1%) ✅ |
| `["findfirst", "n=1000", "reduce", "basesize=256"]` | 1.04 (5%) | 1.03 (1%) ❌ |
| `["findfirst", "n=1000", "reduce", "basesize=512"]` | 1.09 (5%) ❌ | 1.06 (1%) ❌ |
| `["findfirst", "n=400", "reduce", "basesize=128"]` | 1.20 (5%) ❌ | 1.12 (1%) ❌ |
| `["findfirst", "n=400", "reduce", "basesize=256"]` | 1.07 (5%) ❌ | 1.01 (1%) ❌ |
| `["findfirst", "n=400", "reduce", "basesize=512"]` | 1.04 (5%) | 1.08 (1%) ❌ |
| `["findfirst", "n=500", "reduce", "basesize=128"]` | 1.02 (5%) | 1.01 (1%) ❌ |
| `["findfirst", "n=500", "reduce", "basesize=256"]` | 0.66 (5%) ✅ | 0.71 (1%) ✅ |
| `["findfirst", "n=500", "reduce", "basesize=512"]` | 0.86 (5%) ✅ | 0.89 (1%) ✅ |
| `["parallel_histogram", "assoc", "basesize=4096"]` | 0.98 (5%) | 0.77 (1%) ✅ |
| `["parallel_histogram", "comm", "basesize=4096"]` | 1.17 (5%) ❌ | 0.98 (1%) ✅ |
| `["unordered", "unordered", "basesize=1"]` | 0.93 (5%) ✅ | 1.00 (1%) |
| `["unordered", "unordered", "basesize=1024"]` | 0.93 (5%) ✅ | 1.12 (1%) ❌ |
| `["words", "nthreads=2"]` | 0.93 (5%) ✅ | 0.99 (1%) |
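As a worked illustration of how a row in the judge table can be read, here is a minimal sketch that recomputes the time ratio from the target and baseline times reported further down in this comment. The helper `judge_ratio` is hypothetical, and the thresholds `1 ± tolerance` are an assumption that happens to be consistent with the rows shown (for example, a time ratio of 1.04 with a 5% tolerance is left unmarked); it is not taken from the benchmarking tool's source.

```julia
# Hypothetical helper: classify a target/baseline pair by its ratio.
# Assumption: a result counts as significant only when the ratio falls
# outside the band [1 - tolerance, 1 + tolerance].
function judge_ratio(target, baseline; tolerance)
    ratio = target / baseline
    verdict = if ratio > 1 + tolerance
        :regression
    elseif ratio < 1 - tolerance
        :improvement
    else
        :invariant
    end
    return (ratio = round(ratio; digits = 2), verdict = verdict)
end

# Times (in ms) for ["findfirst", "n=1000", "reduce", ...] from the
# target and baseline result tables of this report.
r128 = judge_ratio(510.591, 914.560; tolerance = 0.05)  # basesize=128
r256 = judge_ratio(548.149, 528.524; tolerance = 0.05)  # basesize=256
r512 = judge_ratio(773.637, 708.315; tolerance = 0.05)  # basesize=512
```

Under these assumptions the three calls reproduce the 0.56, 1.04, and 1.09 time ratios listed above, classified as an improvement, an invariant result, and a regression respectively.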

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["findfirst", "n=1000"]
  • ["findfirst", "n=1000", "reduce"]
  • ["findfirst", "n=400"]
  • ["findfirst", "n=400", "reduce"]
  • ["findfirst", "n=500"]
  • ["findfirst", "n=500", "reduce"]
  • ["parallel_histogram", "assoc"]
  • ["parallel_histogram", "comm"]
  • ["parallel_histogram"]
  • ["unordered"]
  • ["unordered", "unordered"]
  • ["words"]

Julia versioninfo

Target

Julia Version 1.3.1
Commit 2d5741174c (2019-12-30 21:36 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.3 LTS
  uname: Linux 5.0.0-1027-azure #29~18.04.1-Ubuntu SMP Mon Nov 25 21:18:57 UTC 2019 x86_64 x86_64
  CPU: Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz: 
              speed         user         nice          sys         idle          irq
       #1  2397 MHz      31048 s          0 s       1538 s      29400 s          0 s
       #2  2397 MHz      32392 s          0 s       1791 s      28873 s          0 s
       
  Memory: 6.782737731933594 GB (3643.703125 MB free)
  Uptime: 669.0 sec
  Load Avg:  1.69091796875  1.287109375  0.68994140625
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, haswell)

Baseline

Julia Version 1.3.1
Commit 2d5741174c (2019-12-30 21:36 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.3 LTS
  uname: Linux 5.0.0-1027-azure #29~18.04.1-Ubuntu SMP Mon Nov 25 21:18:57 UTC 2019 x86_64 x86_64
  CPU: Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz: 
              speed         user         nice          sys         idle          irq
       #1  2397 MHz      47960 s          0 s       1930 s      31540 s          0 s
       #2  2397 MHz      44212 s          0 s       2149 s      36123 s          0 s
       
  Memory: 6.782737731933594 GB (3684.73828125 MB free)
  Uptime: 865.0 sec
  Load Avg:  1.6806640625  1.42724609375  0.86181640625
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, haswell)

Target result

Benchmark Report for /home/runner/work/Transducers.jl/Transducers.jl

Job Properties

  • Time of benchmark: 20 Jan 2020 - 0:3
  • Package commit: 1606ff
  • Julia commit: 2d5741
  • Julia command flags: None
  • Environment variables: JULIA_NUM_THREADS => 2

Results

Below is a table of this job's results, obtained by running the benchmarks.
The values listed in the ID column have the structure [parent_group, child_group, ..., key], and can be used to
index into the BaseBenchmarks suite to retrieve the corresponding benchmarks.
The percentages accompanying time and memory values in the below table are noise tolerances. The "true"
time/memory value for a given benchmark is expected to fall within this percentage of the reported value.
An empty cell means that the value was zero.

| ID | time | GC time | memory | allocations |
|---|---|---|---|---|
| `["findfirst", "n=1000", "foldl"]` | 763.758 ms (5%) | | | |
| `["findfirst", "n=1000", "reduce", "basesize=128"]` | 510.591 ms (5%) | | 894.38 KiB (1%) | 17612 |
| `["findfirst", "n=1000", "reduce", "basesize=256"]` | 548.149 ms (5%) | | 529.17 KiB (1%) | 10400 |
| `["findfirst", "n=1000", "reduce", "basesize=512"]` | 773.637 ms (5%) | | 372.36 KiB (1%) | 7345 |
| `["findfirst", "n=400", "foldl"]` | 571.520 ms (5%) | | | |
| `["findfirst", "n=400", "reduce", "basesize=128"]` | 354.451 ms (5%) | | 1.32 MiB (1%) | 26933 |
| `["findfirst", "n=400", "reduce", "basesize=256"]` | 337.066 ms (5%) | | 669.36 KiB (1%) | 13343 |
| `["findfirst", "n=400", "reduce", "basesize=512"]` | 346.378 ms (5%) | | 418.97 KiB (1%) | 8296 |
| `["findfirst", "n=500", "foldl"]` | 97.773 ms (5%) | | | |
| `["findfirst", "n=500", "reduce", "basesize=128"]` | 151.221 ms (5%) | | 576.64 KiB (1%) | 11301 |
| `["findfirst", "n=500", "reduce", "basesize=256"]` | 231.760 ms (5%) | | 490.41 KiB (1%) | 9617 |
| `["findfirst", "n=500", "reduce", "basesize=512"]` | 289.197 ms (5%) | | 366.13 KiB (1%) | 7067 |
| `["parallel_histogram", "assoc", "basesize=16384"]` | 5.489 ms (5%) | | 732.22 KiB (1%) | 109 |
| `["parallel_histogram", "assoc", "basesize=4096"]` | 6.249 ms (5%) | | 1.80 MiB (1%) | 539 |
| `["parallel_histogram", "assoc", "basesize=8192"]` | 5.866 ms (5%) | | 1.43 MiB (1%) | 260 |
| `["parallel_histogram", "comm", "basesize=16384"]` | 13.233 ms (5%) | | 1.22 MiB (1%) | 251 |
| `["parallel_histogram", "comm", "basesize=4096"]` | 22.603 ms (5%) | | 1.04 MiB (1%) | 4244 |
| `["parallel_histogram", "comm", "basesize=8192"]` | 15.090 ms (5%) | | 1.23 MiB (1%) | 594 |
| `["parallel_histogram", "seq"]` | 10.101 ms (5%) | | 364.63 KiB (1%) | 25 |
| `["unordered", "collect"]` | 551.144 ms (5%) | | 513.00 KiB (1%) | 23 |
| `["unordered", "unordered", "basesize=1"]` | 540.841 ms (5%) | 8.969 ms | 30.26 MiB (1%) | 475224 |
| `["unordered", "unordered", "basesize=1024"]` | 429.188 ms (5%) | | 1.20 MiB (1%) | 31614 |
| `["unordered", "unordered", "basesize=32"]` | 312.862 ms (5%) | | 1.57 MiB (1%) | 23281 |
| `["words", "nthreads=1"]` | 43.462 ms (5%) | 7.660 ms | 64.37 MiB (1%) | 2083164 |
| `["words", "nthreads=2"]` | 23.730 ms (5%) | | 65.09 MiB (1%) | 2083325 |
| `["words", "nthreads=4"]` | 24.824 ms (5%) | | 65.54 MiB (1%) | 2083481 |

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["findfirst", "n=1000"]
  • ["findfirst", "n=1000", "reduce"]
  • ["findfirst", "n=400"]
  • ["findfirst", "n=400", "reduce"]
  • ["findfirst", "n=500"]
  • ["findfirst", "n=500", "reduce"]
  • ["parallel_histogram", "assoc"]
  • ["parallel_histogram", "comm"]
  • ["parallel_histogram"]
  • ["unordered"]
  • ["unordered", "unordered"]
  • ["words"]

Julia versioninfo

Julia Version 1.3.1
Commit 2d5741174c (2019-12-30 21:36 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.3 LTS
  uname: Linux 5.0.0-1027-azure #29~18.04.1-Ubuntu SMP Mon Nov 25 21:18:57 UTC 2019 x86_64 x86_64
  CPU: Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz: 
              speed         user         nice          sys         idle          irq
       #1  2397 MHz      31048 s          0 s       1538 s      29400 s          0 s
       #2  2397 MHz      32392 s          0 s       1791 s      28873 s          0 s
       
  Memory: 6.782737731933594 GB (3643.703125 MB free)
  Uptime: 669.0 sec
  Load Avg:  1.69091796875  1.287109375  0.68994140625
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, haswell)

Baseline result

Benchmark Report for /home/runner/work/Transducers.jl/Transducers.jl

Job Properties

  • Time of benchmark: 20 Jan 2020 - 0:6
  • Package commit: 149fe1
  • Julia commit: 2d5741
  • Julia command flags: None
  • Environment variables: JULIA_NUM_THREADS => 2

Results

Below is a table of this job's results, obtained by running the benchmarks.
The values listed in the ID column have the structure [parent_group, child_group, ..., key], and can be used to
index into the BaseBenchmarks suite to retrieve the corresponding benchmarks.
The percentages accompanying time and memory values in the below table are noise tolerances. The "true"
time/memory value for a given benchmark is expected to fall within this percentage of the reported value.
An empty cell means that the value was zero.

| ID | time | GC time | memory | allocations |
|---|---|---|---|---|
| `["findfirst", "n=1000", "foldl"]` | 773.702 ms (5%) | | | |
| `["findfirst", "n=1000", "reduce", "basesize=128"]` | 914.560 ms (5%) | | 1.47 MiB (1%) | 29424 |
| `["findfirst", "n=1000", "reduce", "basesize=256"]` | 528.524 ms (5%) | | 513.30 KiB (1%) | 10026 |
| `["findfirst", "n=1000", "reduce", "basesize=512"]` | 708.315 ms (5%) | | 350.14 KiB (1%) | 6828 |
| `["findfirst", "n=400", "foldl"]` | 570.837 ms (5%) | | | |
| `["findfirst", "n=400", "reduce", "basesize=128"]` | 294.667 ms (5%) | | 1.18 MiB (1%) | 23793 |
| `["findfirst", "n=400", "reduce", "basesize=256"]` | 316.106 ms (5%) | | 662.56 KiB (1%) | 13030 |
| `["findfirst", "n=400", "reduce", "basesize=512"]` | 332.504 ms (5%) | | 387.09 KiB (1%) | 7605 |
| `["findfirst", "n=500", "foldl"]` | 96.747 ms (5%) | | | |
| `["findfirst", "n=500", "reduce", "basesize=128"]` | 147.719 ms (5%) | | 568.75 KiB (1%) | 11024 |
| `["findfirst", "n=500", "reduce", "basesize=256"]` | 353.812 ms (5%) | | 686.48 KiB (1%) | 13333 |
| `["findfirst", "n=500", "reduce", "basesize=512"]` | 336.172 ms (5%) | | 409.33 KiB (1%) | 7911 |
| `["parallel_histogram", "assoc", "basesize=16384"]` | 5.503 ms (5%) | | 732.25 KiB (1%) | 110 |
| `["parallel_histogram", "assoc", "basesize=4096"]` | 6.371 ms (5%) | | 2.33 MiB (1%) | 552 |
| `["parallel_histogram", "assoc", "basesize=8192"]` | 5.867 ms (5%) | | 1.43 MiB (1%) | 261 |
| `["parallel_histogram", "comm", "basesize=16384"]` | 13.114 ms (5%) | | 1.22 MiB (1%) | 241 |
| `["parallel_histogram", "comm", "basesize=4096"]` | 19.284 ms (5%) | | 1.05 MiB (1%) | 4916 |
| `["parallel_histogram", "comm", "basesize=8192"]` | 14.889 ms (5%) | | 1.23 MiB (1%) | 546 |
| `["parallel_histogram", "seq"]` | 10.108 ms (5%) | | 364.63 KiB (1%) | 25 |
| `["unordered", "collect"]` | 551.253 ms (5%) | | 513.00 KiB (1%) | 23 |
| `["unordered", "unordered", "basesize=1"]` | 579.887 ms (5%) | | 30.26 MiB (1%) | 475235 |
| `["unordered", "unordered", "basesize=1024"]` | 462.617 ms (5%) | | 1.07 MiB (1%) | 23145 |
| `["unordered", "unordered", "basesize=32"]` | 310.735 ms (5%) | | 1.57 MiB (1%) | 23568 |
| `["words", "nthreads=1"]` | 44.702 ms (5%) | 7.357 ms | 64.85 MiB (1%) | 2098438 |
| `["words", "nthreads=2"]` | 25.548 ms (5%) | | 65.57 MiB (1%) | 2098599 |
| `["words", "nthreads=4"]` | 25.841 ms (5%) | | 66.02 MiB (1%) | 2098756 |

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["findfirst", "n=1000"]
  • ["findfirst", "n=1000", "reduce"]
  • ["findfirst", "n=400"]
  • ["findfirst", "n=400", "reduce"]
  • ["findfirst", "n=500"]
  • ["findfirst", "n=500", "reduce"]
  • ["parallel_histogram", "assoc"]
  • ["parallel_histogram", "comm"]
  • ["parallel_histogram"]
  • ["unordered"]
  • ["unordered", "unordered"]
  • ["words"]

Julia versioninfo

Julia Version 1.3.1
Commit 2d5741174c (2019-12-30 21:36 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.3 LTS
  uname: Linux 5.0.0-1027-azure #29~18.04.1-Ubuntu SMP Mon Nov 25 21:18:57 UTC 2019 x86_64 x86_64
  CPU: Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz: 
              speed         user         nice          sys         idle          irq
       #1  2397 MHz      47960 s          0 s       1930 s      31540 s          0 s
       #2  2397 MHz      44212 s          0 s       2149 s      36123 s          0 s
       
  Memory: 6.782737731933594 GB (3684.73828125 MB free)
  Uptime: 865.0 sec
  Load Avg:  1.6806640625  1.42724609375  0.86181640625
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, haswell)

Runtime information

Runtime Info
BLAS #threads 2
BLAS.vendor() openblas64
Sys.CPU_THREADS 2

lscpu output:

Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              2
On-line CPU(s) list: 0,1
Thread(s) per core:  1
Core(s) per socket:  2
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               63
Model name:          Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
Stepping:            2
CPU MHz:             2397.226
BogoMIPS:            4794.45
Hypervisor vendor:   Microsoft
Virtualization type: full
L1d cache:           32K
L1i cache:           32K
L2 cache:            256K
L3 cache:            30720K
NUMA node0 CPU(s):   0,1
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm invpcid_single pti fsgsbase bmi1 avx2 smep bmi2 erms invpcid xsaveopt md_clear
Cpu Property Value
Brand Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
Vendor :Intel
Architecture :Haswell
Model Family: 0x06, Model: 0x3f, Stepping: 0x02, Type: 0x00
Cores 2 physical cores, 2 logical cores (on executing CPU)
No Hyperthreading detected
Clock Frequencies Not supported by CPU
Data Cache Level 1:3 : (32, 256, 30720) kbytes
64 byte cache line size
Address Size 48 bits virtual, 44 bits physical
SIMD 256 bit = 32 byte max. SIMD vector size
Time Stamp Counter TSC is accessible via rdtsc
TSC increased at every clock cycle (non-invariant TSC)
Perf. Monitoring Performance Monitoring Counters (PMC) are not supported
Hypervisor Yes, Microsoft

@github-actions
Contributor

Benchmark result

Judge result

Benchmark Report for /home/runner/work/Transducers.jl/Transducers.jl

Job Properties

  • Time of benchmarks:
    • Target: 20 Jan 2020 - 00:04
    • Baseline: 20 Jan 2020 - 00:08
  • Package commits:
    • Target: 1606ff
    • Baseline: 149fe1
  • Julia commits:
    • Target: 2d5741
    • Baseline: 2d5741
  • Julia command flags:
    • Target: None
    • Baseline: None
  • Environment variables:
    • Target: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 1
    • Baseline: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 1

Results

A ratio greater than 1.0 denotes a possible regression (marked with ❌), while a ratio less
than 1.0 denotes a possible improvement (marked with ✅). Only significant results - results
that indicate possible regressions or improvements - are shown below (thus, an empty table means that all
benchmark results remained invariant between builds).

| ID | time ratio | memory ratio |
|---|---|---|
| `["gemm", "fusedmul", "blas", "16"]` | 0.91 (5%) ✅ | 1.00 (1%) |
| `["gemm", "fusedmul", "blas", "8"]` | 0.89 (5%) ✅ | 1.00 (1%) |
| `["gemm", "fusedmul", "xf", "16"]` | 0.94 (5%) ✅ | 1.00 (1%) |
| `["gemm", "fusedmul", "xf", "8"]` | 0.89 (5%) ✅ | 1.00 (1%) |
| `["gemm", "mul", "man", "false", "8"]` | 0.93 (5%) ✅ | 1.00 (1%) |
| `["gemm", "mul", "man", "ivdep", "8"]` | 1.23 (5%) ❌ | 1.00 (1%) |
| `["gemm", "mul", "man", "true", "8"]` | 1.36 (5%) ❌ | 1.00 (1%) |
| `["gemm", "mul", "xf", "false", "8"]` | 1.05 (5%) ❌ | 1.00 (1%) |
| `["gemm", "mul", "xf", "ivdep", "8"]` | 0.91 (5%) ✅ | 1.00 (1%) |
| `["gemm", "mul", "xf", "true", "8"]` | 1.16 (5%) ❌ | 1.00 (1%) |
| `["missing_dot", "equiv"]` | 1.11 (5%) ❌ | 1.00 (1%) |
| `["missing_dot", "man"]` | 0.88 (5%) ✅ | 1.00 (1%) |
| `["partition_by", "xf"]` | 0.90 (5%) ✅ | 1.00 (1%) |

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["cat"]
  • ["collect"]
  • ["dot"]
  • ["filter_map_map!"]
  • ["filter_map_reduce"]
  • ["gemm", "fusedmul", "blas"]
  • ["gemm", "fusedmul", "xf"]
  • ["gemm", "mul", "linalg"]
  • ["gemm", "mul", "man", "false"]
  • ["gemm", "mul", "man", "ivdep"]
  • ["gemm", "mul", "man", "true"]
  • ["gemm", "mul", "xf", "false"]
  • ["gemm", "mul", "xf", "ivdep"]
  • ["gemm", "mul", "xf", "true"]
  • ["missing_argmax"]
  • ["missing_dot"]
  • ["partition_by"]

Julia versioninfo

Target

Julia Version 1.3.1
Commit 2d5741174c (2019-12-30 21:36 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.3 LTS
  uname: Linux 5.0.0-1027-azure #29~18.04.1-Ubuntu SMP Mon Nov 25 21:18:57 UTC 2019 x86_64 x86_64
  CPU: Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz: 
              speed         user         nice          sys         idle          irq
       #1  2294 MHz      32782 s          0 s       1022 s      32691 s          0 s
       #2  2294 MHz      19012 s          0 s       1166 s      47401 s          0 s
       
  Memory: 6.782737731933594 GB (3507.2890625 MB free)
  Uptime: 686.0 sec
  Load Avg:  1.02880859375  0.91552734375  0.51953125
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, broadwell)

Baseline

Julia Version 1.3.1
Commit 2d5741174c (2019-12-30 21:36 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.3 LTS
  uname: Linux 5.0.0-1027-azure #29~18.04.1-Ubuntu SMP Mon Nov 25 21:18:57 UTC 2019 x86_64 x86_64
  CPU: Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz: 
              speed         user         nice          sys         idle          irq
       #1  2294 MHz      34419 s          0 s       1117 s      53792 s          0 s
       #2  2294 MHz      40342 s          0 s       1278 s      48860 s          0 s
       
  Memory: 6.782737731933594 GB (3567.83203125 MB free)
  Uptime: 915.0 sec
  Load Avg:  1.10888671875  1.0126953125  0.65380859375
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, broadwell)

Target result

Benchmark Report for /home/runner/work/Transducers.jl/Transducers.jl

Job Properties

  • Time of benchmark: 20 Jan 2020 - 0:4
  • Package commit: 1606ff
  • Julia commit: 2d5741
  • Julia command flags: None
  • Environment variables: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 1

Results

Below is a table of this job's results, obtained by running the benchmarks.
The values listed in the ID column have the structure [parent_group, child_group, ..., key], and can be used to
index into the BaseBenchmarks suite to retrieve the corresponding benchmarks.
The percentages accompanying time and memory values in the below table are noise tolerances. The "true"
time/memory value for a given benchmark is expected to fall within this percentage of the reported value.
An empty cell means that the value was zero.

| ID | time | GC time | memory | allocations |
|---|---|---|---|---|
| `["cat", "base"]` | 210.299 μs (5%) | | | |
| `["cat", "xf"]` | 1.470 μs (5%) | | | |
| `["collect", "filter-missing"]` | 81.700 μs (5%) | | 33.05 KiB (1%) | 20 |
| `["collect", "identity-float"]` | 61.400 μs (5%) | | 256.91 KiB (1%) | 20 |
| `["collect", "identity-union"]` | 308.700 μs (5%) | | 285.06 KiB (1%) | 6675 |
| `["dot", "blas"]` | 2.267 μs (5%) | | | |
| `["dot", "man"]` | 2.256 μs (5%) | | | |
| `["dot", "rf"]` | 2.656 μs (5%) | | | |
| `["dot", "xf"]` | 2.667 μs (5%) | | | |
| `["filter_map_map!", "man"]` | 66.300 μs (5%) | | | |
| `["filter_map_map!", "xf"]` | 69.400 μs (5%) | | 144 bytes (1%) | 8 |
| `["filter_map_reduce", "man"]` | 194.900 μs (5%) | | | |
| `["filter_map_reduce", "xf"]` | 194.899 μs (5%) | | | |
| `["gemm", "fusedmul", "blas", "16"]` | 5.263 ms (5%) | | | |
| `["gemm", "fusedmul", "blas", "2"]` | 3.995 ms (5%) | | | |
| `["gemm", "fusedmul", "blas", "32"]` | 8.321 ms (5%) | | | |
| `["gemm", "fusedmul", "blas", "8"]` | 3.823 ms (5%) | | | |
| `["gemm", "fusedmul", "xf", "16"]` | 5.075 ms (5%) | | 160 bytes (1%) | 6 |
| `["gemm", "fusedmul", "xf", "2"]` | 600.800 μs (5%) | | 160 bytes (1%) | 6 |
| `["gemm", "fusedmul", "xf", "32"]` | 10.891 ms (5%) | | 160 bytes (1%) | 6 |
| `["gemm", "fusedmul", "xf", "8"]` | 2.412 ms (5%) | | 160 bytes (1%) | 6 |
| `["gemm", "mul", "linalg", "256"]` | 1.342 ms (5%) | | | |
| `["gemm", "mul", "linalg", "32"]` | 3.729 μs (5%) | | | |
| `["gemm", "mul", "linalg", "8"]` | 289.702 ns (5%) | | | |
| `["gemm", "mul", "man", "false", "256"]` | 4.752 ms (5%) | | | |
| `["gemm", "mul", "man", "false", "32"]` | 6.960 μs (5%) | | | |
| `["gemm", "mul", "man", "false", "8"]` | 369.231 ns (5%) | | | |
| `["gemm", "mul", "man", "ivdep", "256"]` | 4.756 ms (5%) | | | |
| `["gemm", "mul", "man", "ivdep", "32"]` | 6.240 μs (5%) | | | |
| `["gemm", "mul", "man", "ivdep", "8"]` | 368.594 ns (5%) | | | |
| `["gemm", "mul", "man", "true", "256"]` | 4.706 ms (5%) | | | |
| `["gemm", "mul", "man", "true", "32"]` | 7.150 μs (5%) | | | |
| `["gemm", "mul", "man", "true", "8"]` | 408.495 ns (5%) | | | |
| `["gemm", "mul", "xf", "false", "256"]` | 4.677 ms (5%) | | 48 bytes (1%) | 2 |
| `["gemm", "mul", "xf", "false", "32"]` | 6.840 μs (5%) | | 48 bytes (1%) | 2 |
| `["gemm", "mul", "xf", "false", "8"]` | 421.106 ns (5%) | | 48 bytes (1%) | 2 |
| `["gemm", "mul", "xf", "ivdep", "256"]` | 4.636 ms (5%) | | 48 bytes (1%) | 2 |
| `["gemm", "mul", "xf", "ivdep", "32"]` | 5.667 μs (5%) | | 48 bytes (1%) | 2 |
| `["gemm", "mul", "xf", "ivdep", "8"]` | 363.158 ns (5%) | | 48 bytes (1%) | 2 |
| `["gemm", "mul", "xf", "true", "256"]` | 4.703 ms (5%) | | 48 bytes (1%) | 2 |
| `["gemm", "mul", "xf", "true", "32"]` | 6.575 μs (5%) | | 48 bytes (1%) | 2 |
| `["gemm", "mul", "xf", "true", "8"]` | 348.309 ns (5%) | | 48 bytes (1%) | 2 |
| `["missing_argmax", "man"]` | 921.739 ns (5%) | | 32 bytes (1%) | 1 |
| `["missing_argmax", "rf"]` | 2.156 μs (5%) | | 32 bytes (1%) | 1 |
| `["missing_argmax", "xf"]` | 2.211 μs (5%) | | 32 bytes (1%) | 1 |
| `["missing_dot", "equiv"]` | 1.360 μs (5%) | | 16 bytes (1%) | 1 |
| `["missing_dot", "man"]` | 894.737 ns (5%) | | 16 bytes (1%) | 1 |
| `["missing_dot", "naive"]` | 4.057 μs (5%) | | 16 bytes (1%) | 1 |
| `["missing_dot", "rf"]` | 907.692 ns (5%) | | 16 bytes (1%) | 1 |
| `["missing_dot", "rf_nota"]` | 1.350 μs (5%) | | 16 bytes (1%) | 1 |
| `["missing_dot", "xf"]` | 187.400 μs (5%) | | 74.02 KiB (1%) | 3860 |
| `["missing_dot", "xf_nota"]` | 187.100 μs (5%) | | 74.09 KiB (1%) | 3863 |
| `["partition_by", "man"]` | 1.807 ms (5%) | | 352 bytes (1%) | 4 |
| `["partition_by", "xf"]` | 1.628 ms (5%) | | 576 bytes (1%) | 7 |

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["cat"]
  • ["collect"]
  • ["dot"]
  • ["filter_map_map!"]
  • ["filter_map_reduce"]
  • ["gemm", "fusedmul", "blas"]
  • ["gemm", "fusedmul", "xf"]
  • ["gemm", "mul", "linalg"]
  • ["gemm", "mul", "man", "false"]
  • ["gemm", "mul", "man", "ivdep"]
  • ["gemm", "mul", "man", "true"]
  • ["gemm", "mul", "xf", "false"]
  • ["gemm", "mul", "xf", "ivdep"]
  • ["gemm", "mul", "xf", "true"]
  • ["missing_argmax"]
  • ["missing_dot"]
  • ["partition_by"]

Julia versioninfo

Julia Version 1.3.1
Commit 2d5741174c (2019-12-30 21:36 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.3 LTS
  uname: Linux 5.0.0-1027-azure #29~18.04.1-Ubuntu SMP Mon Nov 25 21:18:57 UTC 2019 x86_64 x86_64
  CPU: Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz: 
              speed         user         nice          sys         idle          irq
       #1  2294 MHz      32782 s          0 s       1022 s      32691 s          0 s
       #2  2294 MHz      19012 s          0 s       1166 s      47401 s          0 s
       
  Memory: 6.782737731933594 GB (3507.2890625 MB free)
  Uptime: 686.0 sec
  Load Avg:  1.02880859375  0.91552734375  0.51953125
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, broadwell)

Baseline result

Benchmark Report for /home/runner/work/Transducers.jl/Transducers.jl

Job Properties

  • Time of benchmark: 20 Jan 2020 - 0:8
  • Package commit: 149fe1
  • Julia commit: 2d5741
  • Julia command flags: None
  • Environment variables: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 1

Results

Below is a table of this job's results, obtained by running the benchmarks.
The values listed in the ID column have the structure [parent_group, child_group, ..., key], and can be used to
index into the BaseBenchmarks suite to retrieve the corresponding benchmarks.
The percentages accompanying time and memory values in the below table are noise tolerances. The "true"
time/memory value for a given benchmark is expected to fall within this percentage of the reported value.
An empty cell means that the value was zero.

| ID | time | GC time | memory | allocations |
|---|---|---|---|---|
| `["cat", "base"]` | 210.200 μs (5%) | | | |
| `["cat", "xf"]` | 1.460 μs (5%) | | | |
| `["collect", "filter-missing"]` | 81.700 μs (5%) | | 33.05 KiB (1%) | 20 |
| `["collect", "identity-float"]` | 60.300 μs (5%) | | 256.91 KiB (1%) | 20 |
| `["collect", "identity-union"]` | 296.200 μs (5%) | | 285.81 KiB (1%) | 6702 |
| `["dot", "blas"]` | 2.300 μs (5%) | | | |
| `["dot", "man"]` | 2.278 μs (5%) | | | |
| `["dot", "rf"]` | 2.667 μs (5%) | | | |
| `["dot", "xf"]` | 2.667 μs (5%) | | | |
| `["filter_map_map!", "man"]` | 67.200 μs (5%) | | | |
| `["filter_map_map!", "xf"]` | 70.500 μs (5%) | | 144 bytes (1%) | 8 |
| `["filter_map_reduce", "man"]` | 194.899 μs (5%) | | | |
| `["filter_map_reduce", "xf"]` | 194.899 μs (5%) | | | |
| `["gemm", "fusedmul", "blas", "16"]` | 5.758 ms (5%) | | | |
| `["gemm", "fusedmul", "blas", "2"]` | 4.008 ms (5%) | | | |
| `["gemm", "fusedmul", "blas", "32"]` | 8.321 ms (5%) | | | |
| `["gemm", "fusedmul", "blas", "8"]` | 4.298 ms (5%) | | | |
| `["gemm", "fusedmul", "xf", "16"]` | 5.411 ms (5%) | | 160 bytes (1%) | 6 |
| `["gemm", "fusedmul", "xf", "2"]` | 621.500 μs (5%) | | 160 bytes (1%) | 6 |
| `["gemm", "fusedmul", "xf", "32"]` | 11.159 ms (5%) | | 160 bytes (1%) | 6 |
| `["gemm", "fusedmul", "xf", "8"]` | 2.710 ms (5%) | | 160 bytes (1%) | 6 |
| `["gemm", "mul", "linalg", "256"]` | 1.314 ms (5%) | | | |
| `["gemm", "mul", "linalg", "32"]` | 3.800 μs (5%) | | | |
| `["gemm", "mul", "linalg", "8"]` | 300.000 ns (5%) | | | |
| `["gemm", "mul", "man", "false", "256"]` | 4.903 ms (5%) | | | |
| `["gemm", "mul", "man", "false", "32"]` | 7.000 μs (5%) | | | |
| `["gemm", "mul", "man", "false", "8"]` | 399.000 ns (5%) | | | |
| `["gemm", "mul", "man", "ivdep", "256"]` | 4.853 ms (5%) | | | |
| `["gemm", "mul", "man", "ivdep", "32"]` | 6.299 μs (5%) | | | |
| `["gemm", "mul", "man", "ivdep", "8"]` | 300.000 ns (5%) | | | |
| `["gemm", "mul", "man", "true", "256"]` | 4.897 ms (5%) | | | |
| `["gemm", "mul", "man", "true", "32"]` | 7.200 μs (5%) | | | |
| `["gemm", "mul", "man", "true", "8"]` | 300.000 ns (5%) | | | |
| `["gemm", "mul", "xf", "false", "256"]` | 4.888 ms (5%) | | 48 bytes (1%) | 2 |
| `["gemm", "mul", "xf", "false", "32"]` | 6.900 μs (5%) | | 48 bytes (1%) | 2 |
| `["gemm", "mul", "xf", "false", "8"]` | 400.000 ns (5%) | | 48 bytes (1%) | 2 |
| `["gemm", "mul", "xf", "ivdep", "256"]` | 4.815 ms (5%) | | 48 bytes (1%) | 2 |
| `["gemm", "mul", "xf", "ivdep", "32"]` | 5.700 μs (5%) | | 48 bytes (1%) | 2 |
| `["gemm", "mul", "xf", "ivdep", "8"]` | 400.000 ns (5%) | | 48 bytes (1%) | 2 |
| `["gemm", "mul", "xf", "true", "256"]` | 4.857 ms (5%) | | 48 bytes (1%) | 2 |
| `["gemm", "mul", "xf", "true", "32"]` | 6.700 μs (5%) | | 48 bytes (1%) | 2 |
| `["gemm", "mul", "xf", "true", "8"]` | 300.000 ns (5%) | | 48 bytes (1%) | 2 |
| `["missing_argmax", "man"]` | 908.696 ns (5%) | | 32 bytes (1%) | 1 |
| `["missing_argmax", "rf"]` | 2.211 μs (5%) | | 32 bytes (1%) | 1 |
| `["missing_argmax", "xf"]` | 2.200 μs (5%) | | 32 bytes (1%) | 1 |
| `["missing_dot", "equiv"]` | 1.230 μs (5%) | | 16 bytes (1%) | 1 |
| `["missing_dot", "man"]` | 1.012 μs (5%) | | 16 bytes (1%) | 1 |
| `["missing_dot", "naive"]` | 4.043 μs (5%) | | 16 bytes (1%) | 1 |
| `["missing_dot", "rf"]` | 884.615 ns (5%) | | 16 bytes (1%) | 1 |
| `["missing_dot", "rf_nota"]` | 1.330 μs (5%) | | 16 bytes (1%) | 1 |
| `["missing_dot", "xf"]` | 183.400 μs (5%) | | 74.05 KiB (1%) | 3865 |
| `["missing_dot", "xf_nota"]` | 193.500 μs (5%) | | 74.02 KiB (1%) | 3865 |
| `["partition_by", "man"]` | 1.832 ms (5%) | | 352 bytes (1%) | 4 |
| `["partition_by", "xf"]` | 1.799 ms (5%) | | 576 bytes (1%) | 7 |

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["cat"]
  • ["collect"]
  • ["dot"]
  • ["filter_map_map!"]
  • ["filter_map_reduce"]
  • ["gemm", "fusedmul", "blas"]
  • ["gemm", "fusedmul", "xf"]
  • ["gemm", "mul", "linalg"]
  • ["gemm", "mul", "man", "false"]
  • ["gemm", "mul", "man", "ivdep"]
  • ["gemm", "mul", "man", "true"]
  • ["gemm", "mul", "xf", "false"]
  • ["gemm", "mul", "xf", "ivdep"]
  • ["gemm", "mul", "xf", "true"]
  • ["missing_argmax"]
  • ["missing_dot"]
  • ["partition_by"]

Julia versioninfo

Julia Version 1.3.1
Commit 2d5741174c (2019-12-30 21:36 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.3 LTS
  uname: Linux 5.0.0-1027-azure #29~18.04.1-Ubuntu SMP Mon Nov 25 21:18:57 UTC 2019 x86_64 x86_64
  CPU: Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz: 
              speed         user         nice          sys         idle          irq
       #1  2294 MHz      34419 s          0 s       1117 s      53792 s          0 s
       #2  2294 MHz      40342 s          0 s       1278 s      48860 s          0 s
       
  Memory: 6.782737731933594 GB (3567.83203125 MB free)
  Uptime: 915.0 sec
  Load Avg:  1.10888671875  1.0126953125  0.65380859375
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, broadwell)

Runtime information

Runtime Info
BLAS #threads 2
BLAS.vendor() openblas64
Sys.CPU_THREADS 2

lscpu output:

Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              2
On-line CPU(s) list: 0,1
Thread(s) per core:  1
Core(s) per socket:  2
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               79
Model name:          Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
Stepping:            1
CPU MHz:             2294.685
BogoMIPS:            4589.37
Hypervisor vendor:   Microsoft
Virtualization type: full
L1d cache:           32K
L1i cache:           32K
L2 cache:            256K
L3 cache:            51200K
NUMA node0 CPU(s):   0,1
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx smap xsaveopt md_clear
Cpu Property Value
Brand Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
Vendor :Intel
Architecture :Broadwell
Model Family: 0x06, Model: 0x4f, Stepping: 0x01, Type: 0x00
Cores 2 physical cores, 2 logical cores (on executing CPU)
No Hyperthreading detected
Clock Frequencies Not supported by CPU
Data Cache Level 1:3 : (32, 256, 51200) kbytes
64 byte cache line size
Address Size 48 bits virtual, 44 bits physical
SIMD 256 bit = 32 byte max. SIMD vector size
Time Stamp Counter TSC is accessible via rdtsc
TSC increased at every clock cycle (non-invariant TSC)
Perf. Monitoring Performance Monitoring Counters (PMC) are not supported
Hypervisor Yes, Microsoft

@tkf tkf changed the title More useful Reduced handling in parallel reduce Generic Reduced handling in parallel reduce Jan 20, 2020
@mergify mergify bot merged commit 9a84bb3 into master Jan 20, 2020
@delete-merged-branch delete-merged-branch bot deleted the parallel-reduced branch January 20, 2020 01:40
@tkf
Member Author

tkf commented Jan 20, 2020

@jw3126 Hi, just pinging you as I have a hunch that you might enjoy this PR :) Also, let me know if you notice something I missed.


FYI, an implementation of the "augmentation" I mentioned in the OP is this:

https://github.com/tkf/Transducers.jl/blob/7a6d8ad0b29701cae05ac5ed1302cf743b0075ac/src/reduce.jl#L157-L163
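For readers who do not want to follow the link, the augmentation rules from the OP can be sketched in a few self-contained lines. This is an illustration under stated assumptions, not the code in `src/reduce.jl`: the `Reduced` struct, `reduced`, and `unreduced` below are local stand-ins for the Transducers.jl names, and `augment`, `fold_chunk`, and `takewhile_collect` are hypothetical helpers that imitate a chunked `TakeWhile` reduction without any threading.

```julia
# Local stand-ins for the Transducers.jl names (assumptions for this sketch).
struct Reduced{T}
    value::T
end
reduced(x) = Reduced(x)
reduced(x::Reduced) = x
unreduced(x::Reduced) = x.value
unreduced(x) = x

# Hypothetical helper: build the augmented operation *′ from *.
#   *′(a::Reduced, _) = a
#   *′(a::T, b::Reduced) = reduced(a * unreduced(b))
#   *′(a::T, b::T) = a * b
function augment(op)
    return function (a, b)
        a isa Reduced && return a
        b isa Reduced && return reduced(op(a, unreduced(b)))
        return op(a, b)
    end
end

# Sequential fold over one chunk that stops (returns `Reduced`) at the
# first element failing `pred`, like a TakeWhile feeding a collector.
function fold_chunk(pred, chunk)
    acc = eltype(chunk)[]
    for x in chunk
        pred(x) || return reduced(acc)
        push!(acc, x)
    end
    return acc
end

# Hypothetical driver: fold each chunk, then combine the partial results
# left to right with the augmented `vcat`.
function takewhile_collect(pred, xs; basesize)
    partials = [fold_chunk(pred, c) for c in Iterators.partition(xs, basesize)]
    return unreduced(foldl(augment(vcat), partials))
end

results = [takewhile_collect(x -> x < 5, 1:10; basesize = b) for b in (1, 2, 4, 5)]
```

With these definitions every entry of `results` is `[1, 2, 3, 4]`, independent of `basesize`, which is the behavior the OP asks for. Because `augment(vcat)` stays associative, combining the partial results with `foldr` instead of `foldl` gives the same answer.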

tkf added a commit that referenced this pull request Jan 20, 2020
mergify bot pushed a commit that referenced this pull request Jan 20, 2020
@jw3126
Contributor

jw3126 commented Jan 20, 2020

@tkf your hunch was right, this looks pretty cool! BTW, are you happy with the benchmark action? Are these benchmarks "stable" or is there a lot of noise? When I tried last time (using Travis), the benchmark results were not really useful.

@tkf
Member Author

tkf commented Jan 20, 2020

I think single-thread benchmarks are somewhat useful (though I agree they are still noisy). For example, they helped me find a regression here #153 (comment) and confirm that an improvement works here JuliaFolds/BangBang.jl#96 (comment). Multi-thread benchmarks are much noisier, especially the "findfirst" ones, as their computation time is non-deterministic. They are at least useful as additional smoke tests that keep the benchmark code runnable.

If you are thinking about the perf tests in Setfield.jl, I think the noisiness of that one was due to the coverage and noinline flags.

If you are interested in the setup of the benchmarks, see:
JuliaCI/PkgBenchmark.jl#92 (comment)
https://github.com/tkf/BenchmarkCI.jl

mergify bot pushed a commit that referenced this pull request Jan 21, 2020
This is a bug introduced while implementing generic `Reduced` handling
#172.  If `b` is `Reduced`, all the "private" states of transducers
are stripped off.  So, `combine` should be called only for the bottom
reducing function.  This is implemented in `combine_right_reduced`.
@jw3126
Contributor

jw3126 commented Jan 22, 2020

Mhh, from what I remember, Setfield.jl benchmarks were less noisy locally than on Travis. But it is a good point that flags can make benchmarks useless anyway. I am setting things up for ImageFiltering, following your instructions: JuliaImages/ImageFiltering.jl#148. Thanks a lot!
