Odd performance characteristics of using inputs vs. not using inputs #282

PragTob · 2019-03-23T15:48:54Z

Reworking the README I came upon an interesting performance difference. So let's do our standard example first:

list = Enum.to_list(1..10_000)
map_fun = fn i -> [i, i * i] end


Benchee.run(
  %{
    "flat_map" => fn -> Enum.flat_map(list, map_fun) end,
    "map.flatten" => fn -> list |> Enum.map(map_fun) |> List.flatten() end
},
# inputs: %{"lol" => Enum.to_list(1..10_000) },
formatters: [{Benchee.Formatters.Console, extended_statistics: true}]
)

results in:

(...)

Name                  ips        average  deviation         median         99th %
flat_map           2.37 K      421.36 μs    ±14.80%      406.32 μs      725.85 μs
map.flatten        1.23 K      812.20 μs    ±18.43%      770.32 μs     1244.07 μs

Comparison: 
flat_map           2.37 K
map.flatten        1.23 K - 1.93x slower +390.84 μs

Extended statistics: 

Name                minimum        maximum    sample size                     mode
flat_map          345.36 μs     1199.50 μs        11.84 K                406.20 μs
map.flatten       520.54 μs     1552.89 μs         6.14 K     765.82 μs, 765.75 μs

Now let's switch it up slightly and hand the list in through inputs:

# list = Enum.to_list(1..10_000)
map_fun = fn i -> [i, i * i] end

Benchee.run(
  %{
    "flat_map" => fn list -> Enum.flat_map(list, map_fun) end,
    "map.flatten" => fn list -> list |> Enum.map(map_fun) |> List.flatten() end
},
inputs: %{"lol" => Enum.to_list(1..10_000) },
formatters: [{Benchee.Formatters.Console, extended_statistics: true}]
)

results in:

Operating System: Linux
CPU Information: Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz
Number of Available Cores: 8
Available memory: 15.61 GB
Elixir 1.8.1
Erlang 21.2.7

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 0 ns
parallel: 1
inputs: lol
Estimated total run time: 14 s

Benchmarking flat_map with input lol...
Benchmarking map.flatten with input lol...

##### With input lol #####
Name                  ips        average  deviation         median         99th %
flat_map           2.37 K      421.27 μs     ±9.58%      410.08 μs      663.05 μs
map.flatten        1.63 K      612.90 μs    ±12.73%      626.79 μs      900.98 μs

Comparison: 
flat_map           2.37 K
map.flatten        1.63 K - 1.45x slower +191.63 μs

Extended statistics: 

Name                minimum        maximum    sample size                     mode
flat_map          353.47 μs     1068.95 μs        11.84 K                409.76 μs
map.flatten       519.13 μs     1235.24 μs         8.13 K                519.95 μs

All of a sudden average and median for map.flatten are a lot better (~200μs better). The performance of flat_map stays about the same 🤷‍♀️

I tried this multiple times. It's no coincedence, the question is why - if anything I'd have thought that the version without inputs would be faster not the other way around.

Trying to summon @michalmuskala who maybe has some compiler input 🤷‍♂️

The text was updated successfully, but these errors were encountered:

PragTob · 2019-03-23T16:18:29Z

Run time histograms, with input:

without input:

Full HTML reports:

with input:
http://www.pragtob.info/benchee/input-no-input/map_inputs_with_inputs_lol_map_flatten.html

without input:
http://www.pragtob.info/benchee/input-no-input/map_inputs_without_inputs_map_flatten.html

PragTob · 2019-03-23T16:22:44Z

To summarize really quickly, without input performance seems a lot less stable and frequently jumping between top notch performance and bad performance:

whereas with input it still seems jumpy, but at a much smaller/consistent scale:

PragTob · 2019-03-23T17:23:57Z

If anyone looks at this and says "SHOW ME THE CODE" - it's really not that much different, you can check https://github.com/bencheeorg/benchee/blob/master/lib/benchee/benchmark/runner.ex

The relevant parts that take the run time measurement:

  def collect(
        scenario = %Scenario{function: function},
        scenario_context = %ScenarioContext{
          num_iterations: 1
        },
        collector
      ) do
    new_input = Hooks.run_before_each(scenario, scenario_context)
    function = main_function(function, new_input)

    {measurement, return_value} = collector.collect(function)

    Hooks.run_after_each(return_value, scenario, scenario_context)
    measurement
  end

  def main_function(function, @no_input), do: function
  def main_function(function, input), do: fn -> function.(input) end

So code wise the only difference that I can think of is main_function what the function that is used looks like. Somewhat ironically, it is wrapped in another function which should make it slower... or so one would guess.

Maybe it's that the input isn't local to the function and hence other optimizations can kick in? @michalmuskala

PragTob · 2019-03-23T18:13:25Z

I know I have my own level of fun talking to myself here, something that just came to my mind: With inputs the input/data is already "inside" the process we use for benchmarking, no idea how access is handled in the other case? 🤷‍♂️

devonestes · 2019-03-24T09:02:54Z

Based on those performance characteristics, it looks like maybe garbage collection is triggered with the inputs version?

Want to measure memory usage as well on there and see what the difference is? It should be the exact same, but if it’s not then that’s a good clue.

PragTob · 2019-03-24T09:15:37Z

memory consumption is measured in the HTML samples I posted, and its exactly the same - was also my hope that this would show something. Maybe amount of GCs could still be different? Dunno.

PragTob added the question label Mar 23, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Odd performance characteristics of using inputs vs. not using inputs #282

Odd performance characteristics of using inputs vs. not using inputs #282

PragTob commented Mar 23, 2019 •

edited

Loading

PragTob commented Mar 23, 2019

PragTob commented Mar 23, 2019

PragTob commented Mar 23, 2019

PragTob commented Mar 23, 2019

devonestes commented Mar 24, 2019

PragTob commented Mar 24, 2019

Odd performance characteristics of using inputs vs. not using inputs #282

Odd performance characteristics of using inputs vs. not using inputs #282

Comments

PragTob commented Mar 23, 2019 • edited Loading

PragTob commented Mar 23, 2019

PragTob commented Mar 23, 2019

PragTob commented Mar 23, 2019

PragTob commented Mar 23, 2019

devonestes commented Mar 24, 2019

PragTob commented Mar 24, 2019

PragTob commented Mar 23, 2019 •

edited

Loading