Benchee output.

Below is the output from various benchee runs with descriptions of what we did for each.

xmerl -> data_schema VS Saxy -> DataSchema

This experiment compared the current approach of XML -> xmerl -> data_schema (by way of SweetXml queries) with going straight from Saxy events to data_schema structs.

There are actually a few ways we can approach the latter. This first attempt took the simplest idea and created a SaxyDataAccessor. The upshot is that we "Saxy" the entire XML document once per field in the schema. Each pass looks for one very specific value (the value pointed to by the field in the schema), which means we stop as soon as we find it, and we ignore the vast majority of events completely. This approach would get faster if we could skip subtrees as early as possible: if we know a node doesn't appear in our path, the sooner we can skip it the better.
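
For illustration, here is a minimal sketch of that idea (the module name and state shape are hypothetical, not the actual SaxyDataAccessor): one Saxy pass per field, tracking the current path and stopping at the first match.

```elixir
defmodule SaxyFieldAccessor do
  @moduledoc "Sketch only: fetch one field's value in a single Saxy pass."
  @behaviour Saxy.Handler

  # target is a list of element names, e.g. ["SteamedHam", "ReadyDate"]
  def get_field(xml, target) do
    {:ok, {_target, _stack, value}} =
      Saxy.parse_string(xml, __MODULE__, {target, [], nil})

    value
  end

  @impl true
  def handle_event(:start_element, {name, _attrs}, {target, stack, value}) do
    {:ok, {target, [name | stack], value}}
  end

  def handle_event(:characters, chars, {target, stack, _value}) do
    # Stop parsing the moment the current path matches the field's path.
    if Enum.reverse(stack) == target do
      {:stop, {target, stack, chars}}
    else
      {:ok, {target, stack, nil}}
    end
  end

  def handle_event(:end_element, _name, {target, [_ | rest], value}) do
    {:ok, {target, rest, value}}
  end

  # All other events (start/end document etc.) are ignored completely.
  def handle_event(_event, _data, state), do: {:ok, state}
end
```

Returning `{:stop, state}` makes Saxy abandon the rest of the document, which is where the early-exit saving comes from.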

The big problem with this is the has_many's. They become very tricky because you need to be able to iterate through a subset of the XML, which is actually harder than it sounds. Either you "parse" the list of things by reconstructing the string from the Saxy events, which seems mental, or you parse the has_manys into a different representation and add clauses to all of the accessor functions for that representation, handling it by querying it in some way.

For this experiment we hacked together something which isn't completely accurate, but it gives us enough information to know whether we should continue the effort to make has_many work. So:

(Note this one is harder to bench with large XML because it requires a lot more work).

❯ mix run bench.exs

Operating System: macOS
CPU Information: Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
Number of Available Cores: 16
Available memory: 32 GB
Elixir 1.13.1
Erlang 24.1.7

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 1 s
reduction time: 1 s
parallel: 1
inputs: none specified
Estimated total run time: 18 s

Benchmarking experiment parse many times ...
Benchmarking xmerl -> data_schema ...

Name                                  ips        average  deviation         median         99th %
xmerl -> data_schema              12.93 K       77.35 μs    ±23.90%          74 μs         165 μs
experiment parse many times       11.53 K       86.71 μs    ±15.81%          85 μs         148 μs

Comparison:
xmerl -> data_schema              12.93 K
experiment parse many times       11.53 K - 1.12x slower +9.37 μs

Memory usage statistics:

Name                           Memory usage
xmerl -> data_schema              107.41 KB
experiment parse many times        60.83 KB - 0.57x memory usage -46.58594 KB

**All measurements for memory usage were the same**

Reduction count statistics:

Name                        Reduction count
xmerl -> data_schema                60.43 K
experiment parse many times          6.27 K - 0.10x reduction count -54.15800 K

**All measurements for reduction count were the same**

What's super interesting is that although it is slower, it uses about half the memory, and that's on a very small input. We should really try a larger input to explore that further, but doing so requires a solution for the "has_many" problem...

Larger Input

We chose a Jetstar seat map as an example because it's fairly easy to copy the schemas over and the XML is a decent enough size: the file is 1.8 MB.

This is a benchmark of just the xmerl -> DataSchema approach:

Operating System: macOS
CPU Information: Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
Number of Available Cores: 16
Available memory: 32 GB
Elixir 1.13.1
Erlang 24.1.7

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 1 s
reduction time: 1 s
parallel: 1
inputs: none specified
Estimated total run time: 9 s

Benchmarking xmerl -> data_schema ...

Name                           ips        average  deviation         median         99th %
xmerl -> data_schema          3.59      278.83 ms    ±15.18%      299.79 ms      347.19 ms

Memory usage statistics:

Name                    Memory usage
xmerl -> data_schema       153.61 MB

**All measurements for memory usage were the same**

Reduction count statistics:

Name                         average  deviation         median         99th %
xmerl -> data_schema         27.45 M     ±0.03%        27.46 M        27.46 M

What's nuts is that memory: 153 MB! That's like 10x what our original uses! This is probably because there is a lot of nesting, meaning a lot of parents; really a lot of tags inside tags. In fact, I think I know why easyJet is so fast: they use attributes for everything, so there are basically no parents! They also return way fewer options, but yeah...

Using current approach but maps instead of erlang records:

This uses almost exactly the same approach as now, but with maps for the XML elements rather than xmerl records. It used the large response fixture for request_id FuOOEnVBFfAVGy8AeRED.
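
As a hedged sketch of what "maps instead of records" means here (the real module will differ in detail), the handler can accumulate a stack of plain maps from Saxy events:

```elixir
defmodule MapBuilder do
  @behaviour Saxy.Handler
  # Sketch only: build a nested map per element instead of an #xmlElement{} record.

  @impl true
  def handle_event(:start_element, {name, attributes}, stack) do
    node = %{name: String.to_atom(name), attributes: attributes, content: []}
    {:ok, [node | stack]}
  end

  def handle_event(:characters, chars, [node | rest]) do
    {:ok, [%{node | content: [chars | node.content]} | rest]}
  end

  def handle_event(:end_element, _name, [node | rest]) do
    node = %{node | content: Enum.reverse(node.content)}

    case rest do
      [parent | rest] -> {:ok, [%{parent | content: [node | parent.content]} | rest]}
      [] -> {:ok, [node]}
    end
  end

  def handle_event(_event, _data, state), do: {:ok, state}
end

# {:ok, [root]} = Saxy.parse_string(xml, MapBuilder, [])
```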

❯ mix run bench.exs
Compiling 1 file (.ex)
Operating System: macOS
CPU Information: Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
Number of Available Cores: 16
Available memory: 32 GB
Elixir 1.13.1
Erlang 24.1.7

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 1 s
reduction time: 1 s
parallel: 1
inputs: none specified
Estimated total run time: 18 s

Benchmarking current to map ...
Benchmarking current to xmerl (records) ...

Name                                 ips        average  deviation         median         99th %
current to map                      1.13         0.88 s    ±10.04%         0.85 s         1.06 s
current to xmerl (records)          0.98         1.02 s     ±8.22%         0.98 s         1.17 s

Comparison:
current to map                      1.13
current to xmerl (records)          0.98 - 1.15x slower +0.133 s

Memory usage statistics:

Name                          Memory usage
current to map                   253.50 MB
current to xmerl (records)       290.61 MB - 1.15x memory usage +37.10 MB

**All measurements for memory usage were the same**

Reduction count statistics:

Name                       Reduction count
current to map                    388.46 M
current to xmerl (records)        389.00 M - 1.00x reduction count +0.54 M

**All measurements for reduction count were the same**

What's crazy about this is the memory used by both: around 253 MB, for a ~9.1 MB file!

Using current approach but maps instead of erlang records - Maps have no atoms

This uses almost exactly the same approach as above, with maps for the XML elements rather than xmerl records, on the same large FuOOEnVBFfAVGy8AeRED fixture.

BUT this time we keep the element names as strings rather than converting them to atoms.
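
Judging by the unused `atom_fun` warning in the output below, the two variants presumably differ only in how `make_name` keys the element; something like this hypothetical toggle:

```elixir
# Hypothetical: the only difference between the two variants.
make_name = fn
  name, :atom -> String.to_atom(name)  # interned once; reusable across parses
  name, :string -> name                # a fresh binary carried around per document
end

make_name.("SteamedHam", :atom)    #=> :SteamedHam
make_name.("SteamedHam", :string)  #=> "SteamedHam"
```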

❯ mix run bench.exs
Compiling 1 file (.ex)
warning: variable "atom_fun" is unused (if the variable is not meant to be used, prefix it with an underscore)
  lib/saxy/experiment/xmerl_map.ex:101: Saxy.XmerlMap.make_name/2

Operating System: macOS
CPU Information: Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
Number of Available Cores: 16
Available memory: 32 GB
Elixir 1.13.1
Erlang 24.1.7

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 1 s
reduction time: 1 s
parallel: 1
inputs: none specified
Estimated total run time: 18 s

Benchmarking current to map ...
Benchmarking current to xmerl (records) ...

Name                                 ips        average  deviation         median         99th %
current to map                      1.08         0.93 s     ±6.77%         0.90 s         1.05 s
current to xmerl (records)          0.93         1.08 s    ±12.15%         1.03 s         1.31 s

Comparison:
current to map                      1.08
current to xmerl (records)          0.93 - 1.16x slower +0.148 s

Memory usage statistics:

Name                          Memory usage
current to map                   253.16 MB
current to xmerl (records)       290.61 MB - 1.15x memory usage +37.45 MB

**All measurements for memory usage were the same**

Reduction count statistics:

Name                       Reduction count
current to map                    388.73 M
current to xmerl (records)        389.19 M - 1.00x reduction count +0.46 M

**All measurements for reduction count were the same**

What's crazy about this is that the memory is basically the same as with atoms... which I guess makes sense: the real saving would show up if we parsed a second XML document with the same names, where we'd be able to re-use the atoms but not the binaries.

A simpler map

This attempts to slim down the map that gets created, to make it simpler to query. We will need to bench the query performance too, because it might be that SweetXml is just way quicker than Access anyway. But, for example, the current "xmerl" approach saves all of the parents on every node. That's a HUGE amount of data, and I cannot think why we'd need it. (Though I am a simpleton.)

Similarly, we do things like saving the namespace separately; I'm not sure why we can't just put the namespace in the element name. That's actually what we want, really.

And finally, the "count" seems absolutely useless, because the elements already appear to be ordered by where they occur in the XML, so surely we can just index off of that. In fact that's better, because I'm pretty sure xpath is 1-indexed, which is confusing.
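
To make the trimming concrete, a hedged before/after of the node shape (field names are illustrative, not the module's actual keys):

```elixir
# Before: every node drags its full ancestry, a separate namespace and a position.
fat = %{
  name: :Cheese,
  namespace: [],
  parents: [Salad: 2, Salads: 5, SteamedHam: 1],
  pos: 1,
  attributes: [%{name: :Mouldy, value: "true", pos: 1}],
  content: ["Blue"]
}

# After: namespace folded into the name, no parents, and order implied by the
# node's position in its parent's content list (0-indexed, unlike xpath).
slim = %{
  name: :"ns:Cheese",
  attributes: [{"Mouldy", "true"}],
  content: ["Blue"]
}

{map_size(fat), map_size(slim)}
#=> {6, 3}
```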

❯ mix run bench.exs
Operating System: macOS
CPU Information: Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
Number of Available Cores: 16
Available memory: 32 GB
Elixir 1.13.1
Erlang 24.1.7

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 1 s
reduction time: 1 s
parallel: 1
inputs: none specified
Estimated total run time: 18 s

Benchmarking current to map ...
Benchmarking current to xmerl (records) ...

Name                                 ips        average  deviation         median         99th %
current to map                      2.68         0.37 s     ±3.34%         0.37 s         0.40 s
current to xmerl (records)          0.96         1.04 s     ±9.10%         0.99 s         1.20 s

Comparison:
current to map                      2.68
current to xmerl (records)          0.96 - 2.78x slower +0.66 s

Memory usage statistics:

Name                          Memory usage
current to map                   131.41 MB
current to xmerl (records)       290.61 MB - 2.21x memory usage +159.20 MB

**All measurements for memory usage were the same**

Reduction count statistics:

Name                               average  deviation         median         99th %
current to map                     16.36 M     ±0.03%        16.36 M        16.36 M
current to xmerl (records)        389.22 M     ±0.00%       389.22 M       389.22 M

Comparison:
current to map                     16.36 M
current to xmerl (records)        389.22 M - 23.80x reduction count +372.86 M

We have halved the memory usage here, though it is still suspiciously high!

Same again, but this time with no count computed at all:

Here we do the same but remove the "count" from the accumulator as we don't use it:

❯ mix run bench.exs
Compiling 1 file (.ex)
Operating System: macOS
CPU Information: Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
Number of Available Cores: 16
Available memory: 32 GB
Elixir 1.13.1
Erlang 24.1.7

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 1 s
reduction time: 1 s
parallel: 1
inputs: none specified
Estimated total run time: 18 s

Benchmarking current to map ...
Benchmarking current to xmerl (records) ...

Name                                 ips        average  deviation         median         99th %
current to map                      2.51         0.40 s     ±2.71%         0.40 s         0.42 s
current to xmerl (records)          0.95         1.06 s     ±7.47%         1.03 s         1.19 s

Comparison:
current to map                      2.51
current to xmerl (records)          0.95 - 2.65x slower +0.66 s

Memory usage statistics:

Name                          Memory usage
current to map                   127.25 MB
current to xmerl (records)       290.61 MB - 2.28x memory usage +163.36 MB

**All measurements for memory usage were the same**

Reduction count statistics:

Name                               average  deviation         median         99th %
current to map                     16.43 M     ±0.07%        16.43 M        16.43 M
current to xmerl (records)        389.29 M     ±0.00%       389.29 M       389.29 M

Comparison:
current to map                     16.43 M
current to xmerl (records)        389.29 M - 23.70x reduction count +372.87 M

Even better memory impact, which is nice. Still high, but whatcha gonna do; I'll take a halving for sure!

But now we need to see if we can query efficiently...

Saxy Maps - Ready for xpath

This approach compared the current xmerl handler, the newer "slimmed down" map, and a new version of the slimmed-down map that aims to make it really easy to query for the data inside it. Check out the Saxy.XmerlMapDynamic module: the keys are the names of the nodes in the XML. The hope is that we can then take a normal xpath and translate it into what is essentially a lens into the data. We can now test which is quicker / smaller on memory overall, including querying the result of Saxy.
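
To make the "lens" idea concrete, here is a hedged sketch (the real Saxy.XmerlMapDynamic shape may well differ): with children keyed by element name, an xpath like /SteamedHam/Salads/Salad collapses into a plain key path.

```elixir
# Hypothetical nested shape for the SteamedHam sample shown further down.
doc = %{
  "SteamedHam" => %{
    "Salads" => %{
      "Salad" => [
        %{"Cheese" => %{text: "Blue"}, attributes: %{"Name" => "ceasar"}},
        %{"Leaf" => %{text: "washed"}, attributes: %{"Name" => "cob"}}
      ]
    }
  }
}

# "/SteamedHam/Salads/Salad" becomes:
get_in(doc, ["SteamedHam", "Salads", "Salad"])
#=> both <Salad> nodes, with no tree walk or xpath engine involved
```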

JUST SAXY BIT

The run below is just the Saxy bit, not the querying for data inside the result. It uses request ID FuOOEnVBFfAVGy8AeRED.

❯ mix run bench.exs
Operating System: macOS
CPU Information: Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
Number of Available Cores: 16
Available memory: 32 GB
Elixir 1.13.4
Erlang 24.1.7

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 1 s
reduction time: 1 s
parallel: 1
inputs: none specified
Estimated total run time: 27 s

Benchmarking XML to PATHED UP map ...
Benchmarking XML to map (trimmed) ...
Benchmarking current to xmerl (records) ...

Name                                 ips        average  deviation         median         99th %
XML to map (trimmed)                2.69      371.50 ms     ±4.59%      368.88 ms      417.18 ms
XML to PATHED UP map                1.95      512.49 ms     ±5.76%      521.68 ms      554.67 ms
current to xmerl (records)          0.71     1413.57 ms    ±14.84%     1492.59 ms     1558.00 ms

Comparison:
XML to map (trimmed)                2.69
XML to PATHED UP map                1.95 - 1.38x slower +140.99 ms
current to xmerl (records)          0.71 - 3.80x slower +1042.06 ms

Memory usage statistics:

Name                          Memory usage
XML to map (trimmed)             127.25 MB
XML to PATHED UP map             190.24 MB - 1.50x memory usage +62.99 MB
current to xmerl (records)       290.61 MB - 2.28x memory usage +163.36 MB

**All measurements for memory usage were the same**

Reduction count statistics:

Name                               average  deviation         median         99th %
XML to map (trimmed)               16.02 M     ±0.04%        16.02 M        16.02 M
XML to PATHED UP map               20.64 M     ±0.06%        20.64 M        20.65 M
current to xmerl (records)        389.36 M     ±0.00%       389.36 M       389.36 M

Comparison:
XML to map (trimmed)               16.02 M
XML to PATHED UP map               20.64 M - 1.29x reduction count +4.62 M
current to xmerl (records)        389.36 M - 24.31x reduction count +373.34 M

That. Is. Dope. We use less memory and it's faster, even for a large input, so either map approach is promising; we just need to see which fares better when it comes to querying data.

The memory is still huge though.

Interestingly, Saxy.XmerlMap uses less memory... I think it would use even less if we used a struct, and we can do that for this approach because it doesn't require dynamic keys.

Let's add two more approaches into the mix (sketched after this list):

  • XML to map (trimmed) with Struct instead of a map
  • Using tuples for XmerlMapDynamic
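
Hedged sketches of what those two could look like (names and exact shapes are hypothetical):

```elixir
# The trimmed map has a fixed set of keys, so it can become a struct:
defmodule TrimmedNode do
  defstruct name: nil, attributes: [], content: []
end

# The dynamic variant can't (its keys are the element names), but each node's
# wrapper could become a plain tuple of {attributes, children, text}:
sauce = {%{"Name" => "burger sauce"}, %{}, "spicy"}
elem(sauce, 2)
#=> "spicy"
```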

Saxy With Tuple Map

This example compares the current approach with a "dynamic map tuple" approach.

❯ mix run bench.exs
Operating System: macOS
CPU Information: Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
Number of Available Cores: 16
Available memory: 32 GB
Elixir 1.13.4
Erlang 24.1.7

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 1 s
reduction time: 1 s
parallel: 1
inputs: none specified
Estimated total run time: 18 s

Benchmarking XmerlMapDynamicTuple ...
Benchmarking current to xmerl (records) ...

Name                                 ips        average  deviation         median         99th %
XmerlMapDynamicTuple                2.81         0.36 s     ±2.35%         0.36 s         0.37 s
current to xmerl (records)          0.98         1.02 s     ±9.58%         0.98 s         1.20 s

Comparison:
XmerlMapDynamicTuple                2.81
current to xmerl (records)          0.98 - 2.88x slower +0.67 s

Memory usage statistics:

Name                          Memory usage
XmerlMapDynamicTuple             122.48 MB
current to xmerl (records)       290.61 MB - 2.37x memory usage +168.13 MB

**All measurements for memory usage were the same**

Reduction count statistics:

Name                               average  deviation         median         99th %
XmerlMapDynamicTuple               14.89 M     ±0.04%        14.89 M        14.89 M
current to xmerl (records)        389.15 M     ±0.00%       389.15 M       389.15 M

Comparison:
XmerlMapDynamicTuple               14.89 M
current to xmerl (records)        389.15 M - 26.14x reduction count +374.26 M

This compares all of the approaches tried so far.

❯ mix run bench.exs
Operating System: macOS
CPU Information: Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
Number of Available Cores: 16
Available memory: 32 GB
Elixir 1.13.4
Erlang 24.1.7

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 1 s
reduction time: 1 s
parallel: 1
inputs: none specified
Estimated total run time: 36 s

Benchmarking XML to PATHED UP map ...
Benchmarking XML to map (trimmed) ...
Benchmarking XmerlMapDynamicTuple ...
Benchmarking current to xmerl (records) ...

Name                                 ips        average  deviation         median         99th %
XML to map (trimmed)                2.82      354.16 ms     ±1.63%      353.17 ms      365.30 ms
XmerlMapDynamicTuple                2.80      357.30 ms     ±2.14%      360.02 ms      370.90 ms
XML to PATHED UP map                1.99      501.74 ms     ±3.58%      504.36 ms      528.71 ms
current to xmerl (records)          0.97     1033.55 ms     ±8.28%      995.71 ms     1183.84 ms

Comparison:
XML to map (trimmed)                2.82
XmerlMapDynamicTuple                2.80 - 1.01x slower +3.14 ms
XML to PATHED UP map                1.99 - 1.42x slower +147.57 ms
current to xmerl (records)          0.97 - 2.92x slower +679.39 ms

Memory usage statistics:

Name                          Memory usage
XML to map (trimmed)             127.25 MB
XmerlMapDynamicTuple             122.48 MB - 0.96x memory usage -4.77155 MB
XML to PATHED UP map             190.24 MB - 1.50x memory usage +62.99 MB
current to xmerl (records)       290.61 MB - 2.28x memory usage +163.36 MB

**All measurements for memory usage were the same**

Reduction count statistics:

Name                               average  deviation         median         99th %
XML to map (trimmed)               16.26 M     ±0.08%        16.27 M        16.27 M
XmerlMapDynamicTuple               14.89 M     ±0.05%        14.89 M        14.89 M
XML to PATHED UP map               20.22 M     ±0.03%        20.22 M        20.22 M
current to xmerl (records)        389.12 M     ±0.00%       389.12 M       389.12 M

Comparison:
XML to map (trimmed)               16.27 M
XmerlMapDynamicTuple               14.89 M - 0.92x reduction count -1.37824 M
XML to PATHED UP map               20.22 M - 1.24x reduction count +3.95 M
current to xmerl (records)        389.12 M - 23.92x reduction count +372.86 M

This is with DynamicTuple using an atom for the node names:

Operating System: macOS
CPU Information: Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
Number of Available Cores: 16
Available memory: 32 GB
Elixir 1.13.4
Erlang 24.1.7

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 1 s
reduction time: 1 s
parallel: 1
inputs: none specified
Estimated total run time: 36 s

Benchmarking XML to PATHED UP map ...
Benchmarking XML to map (trimmed) ...
Benchmarking XmerlMapDynamicTuple ...
Benchmarking current to xmerl (records) ...

Name                                 ips        average  deviation         median         99th %
XmerlMapDynamicTuple                2.71      368.36 ms     ±5.70%      364.50 ms      427.84 ms
XML to map (trimmed)                2.66      375.69 ms     ±5.08%      372.45 ms      413.53 ms
XML to PATHED UP map                1.95      512.60 ms     ±5.92%      504.64 ms      582.14 ms
current to xmerl (records)          0.90     1116.80 ms    ±10.40%     1071.30 ms     1318.31 ms

Comparison:
XmerlMapDynamicTuple                2.71
XML to map (trimmed)                2.66 - 1.02x slower +7.34 ms
XML to PATHED UP map                1.95 - 1.39x slower +144.24 ms
current to xmerl (records)          0.90 - 3.03x slower +748.45 ms

Memory usage statistics:

Name                          Memory usage
XmerlMapDynamicTuple             122.48 MB
XML to map (trimmed)             127.25 MB - 1.04x memory usage +4.77 MB
XML to PATHED UP map             190.24 MB - 1.55x memory usage +67.77 MB
current to xmerl (records)       290.61 MB - 2.37x memory usage +168.13 MB

**All measurements for memory usage were the same**

Reduction count statistics:

Name                               average  deviation         median         99th %
XmerlMapDynamicTuple               15.13 M     ±0.07%        15.13 M        15.14 M
XML to map (trimmed)               16.34 M     ±0.30%        16.36 M        16.37 M
XML to PATHED UP map               20.24 M     ±0.11%        20.24 M        20.26 M
current to xmerl (records)        389.40 M     ±0.00%       389.40 M       389.40 M

Comparison:
XmerlMapDynamicTuple               15.13 M
XML to map (trimmed)               16.34 M - 1.08x reduction count +1.20 M
XML to PATHED UP map               20.24 M - 1.34x reduction count +5.11 M
current to xmerl (records)        389.40 M - 25.73x reduction count +374.27 M

Interesting that the tuple approach is faster. NOW LET'S ADD QUERYING.

XmerlMapDynamicTuple -> DataSchema VS xmerl -> DataSchema

This test run was done on this very small XML sample:

<SteamedHam price="1">
  <ReadyDate>2021-09-11</ReadyDate>
  <ReadyTime>15:50:07.123Z</ReadyTime>
  <Sauce Name="burger sauce">spicy</Sauce>
  <Type>medium rare</Type>
  <Salads>
    <Salad Name="ceasar">
      <Cheese Mouldy="true">Blue</Cheese>
    </Salad>
    <Salad Name="cob">
      <Leaf type="lambs lettuce">washed</Leaf>
    </Salad>
  </Salads>
</SteamedHam>
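
For context, the schemas under test would look roughly like this (hedged: the accessor module and cast functions here are illustrative, following DataSchema's documented field syntax with xpath-style paths):

```elixir
defmodule Salad do
  import DataSchema, only: [data_schema: 1]

  @data_accessor XpathAccessor
  data_schema(
    field: {:name, "./@Name", &{:ok, to_string(&1)}}
  )
end

defmodule SteamedHam do
  import DataSchema, only: [data_schema: 1]

  @data_accessor XpathAccessor
  data_schema(
    field: {:price, "./@price", &{:ok, String.to_integer(&1)}},
    field: {:type, "./Type/text()", &{:ok, to_string(&1)}},
    has_many: {:salads, "./Salads/Salad", Salad}
  )
end

# DataSchema.to_struct(xml, SteamedHam)
```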

The results are pretty astonishing!

❯ mix run bench.exs
Operating System: macOS
CPU Information: Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
Number of Available Cores: 16
Available memory: 32 GB
Elixir 1.13.4
Erlang 24.1.7

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 1 s
reduction time: 1 s
parallel: 1
inputs: none specified
Estimated total run time: 18 s

Benchmarking XmerlMapDynamicTuple -> DataSchema ...
Benchmarking xmerl -> data_schema ...

Name                                         ips        average  deviation         median         99th %
XmerlMapDynamicTuple -> DataSchema       59.57 K       16.79 μs    ±45.60%          15 μs          51 μs
xmerl -> data_schema                     13.19 K       75.80 μs    ±19.63%          74 μs         149 μs

Comparison:
XmerlMapDynamicTuple -> DataSchema       59.57 K
xmerl -> data_schema                     13.19 K - 4.52x slower +59.01 μs

Memory usage statistics:

Name                                  Memory usage
XmerlMapDynamicTuple -> DataSchema        19.59 KB
xmerl -> data_schema                     107.33 KB - 5.48x memory usage +87.73 KB

**All measurements for memory usage were the same**

Reduction count statistics:

Name                               Reduction count
XmerlMapDynamicTuple -> DataSchema          1.43 K
xmerl -> data_schema                       60.43 K - 42.14x reduction count +58.99 K

**All measurements for reduction count were the same**

Now there might not be complete feature parity between the two approaches here... But currently my approach is about 4.5 times faster and uses about 5.5 times less memory.... WILD.

It is still an insane amount of memory for the size of XML.... but yea.