Don't use cstruct - use string/bytes! #223

Closed
wants to merge 8 commits into from

Conversation

reynir
Contributor

@reynir reynir commented Apr 4, 2024

No description provided.

Co-authored-by: Hannes Mehnert <[email protected]>
Co-authored-by: Reynir Björnsson <[email protected]>
@reynir
Contributor Author

reynir commented Apr 5, 2024

With this branch and the no-cstruct opam repository I see a significant speedup:

cc628d5 (this branch + benchmarking):
Screenshot from 2024-04-05 12-33-36

fe92b0a (main):
Screenshot from 2024-04-05 12-34-01

There is also a higher number of reported allocations. It is unclear how much of this is due to cstruct allocations being underreported and how much is a real increase in allocations.

@reynir
Contributor Author

reynir commented Apr 8, 2024

After @hannesm fixed the bug in ocaml-tls#no-cstruct that caused an invalid MAC, I can now test against a VM on my machine running OpenVPN.?? and iperf3:

Main branch

$ iperf3 -c 10.8.0.1
Connecting to host 10.8.0.1, port 5201
[  5] local 10.8.0.3 port 32914 connected to 10.8.0.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  88.4 MBytes   741 Mbits/sec   27    480 KBytes       
[  5]   1.00-2.00   sec  84.2 MBytes   707 Mbits/sec    0    596 KBytes       
[  5]   2.00-3.00   sec  54.4 MBytes   456 Mbits/sec    0    659 KBytes       
[  5]   3.00-4.00   sec  57.2 MBytes   479 Mbits/sec    0    721 KBytes       
[  5]   4.00-5.00   sec  57.1 MBytes   479 Mbits/sec    0    776 KBytes       
[  5]   5.00-6.00   sec  65.1 MBytes   546 Mbits/sec    0    835 KBytes       
[  5]   6.00-7.00   sec  63.9 MBytes   536 Mbits/sec    0    889 KBytes       
[  5]   7.00-8.00   sec  64.4 MBytes   540 Mbits/sec    0    941 KBytes       
[  5]   8.00-9.00   sec  64.1 MBytes   538 Mbits/sec    0    990 KBytes       
[  5]   9.00-10.00  sec  65.0 MBytes   545 Mbits/sec    0   1.01 MBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec   664 MBytes   557 Mbits/sec   27             sender
[  5]   0.00-10.01  sec   662 MBytes   555 Mbits/sec                  receiver

iperf Done.
$ iperf3 -c 10.8.0.1 -R
Connecting to host 10.8.0.1, port 5201
Reverse mode, remote host 10.8.0.1 is sending
[  5] local 10.8.0.3 port 44052 connected to 10.8.0.1 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec   108 MBytes   908 Mbits/sec                  
[  5]   1.00-2.00   sec   112 MBytes   937 Mbits/sec                  
[  5]   2.00-3.00   sec   113 MBytes   951 Mbits/sec                  
[  5]   3.00-4.00   sec   130 MBytes  1.09 Gbits/sec                  
[  5]   4.00-5.00   sec   149 MBytes  1.25 Gbits/sec                  
[  5]   5.00-6.00   sec   151 MBytes  1.27 Gbits/sec                  
[  5]   6.00-7.00   sec   153 MBytes  1.28 Gbits/sec                  
[  5]   7.00-8.00   sec   153 MBytes  1.29 Gbits/sec                  
[  5]   8.00-9.00   sec   146 MBytes  1.23 Gbits/sec                  
[  5]   9.00-10.00  sec   151 MBytes  1.27 Gbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  1.34 GBytes  1.15 Gbits/sec   18             sender
[  5]   0.00-10.00  sec  1.33 GBytes  1.15 Gbits/sec                  receiver

iperf Done.

This branch & no-cstruct repo (with TLS fix)

$ iperf3 -c 10.8.0.1
Connecting to host 10.8.0.1, port 5201
[  5] local 10.8.0.6 port 33588 connected to 10.8.0.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  87.6 MBytes   735 Mbits/sec  105    664 KBytes       
[  5]   1.00-2.00   sec  88.0 MBytes   738 Mbits/sec    1    584 KBytes       
[  5]   2.00-3.00   sec  88.2 MBytes   740 Mbits/sec    3    491 KBytes       
[  5]   3.00-4.00   sec  78.2 MBytes   656 Mbits/sec    0    596 KBytes       
[  5]   4.00-5.00   sec  78.8 MBytes   661 Mbits/sec    0    686 KBytes       
[  5]   5.00-6.00   sec  79.2 MBytes   664 Mbits/sec    5    584 KBytes       
[  5]   6.00-7.00   sec  77.5 MBytes   650 Mbits/sec    0    676 KBytes       
[  5]   7.00-8.00   sec  74.1 MBytes   622 Mbits/sec   34    553 KBytes       
[  5]   8.00-9.00   sec  77.6 MBytes   651 Mbits/sec    0    647 KBytes       
[  5]   9.00-10.00  sec  78.3 MBytes   657 Mbits/sec   19    534 KBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec   807 MBytes   677 Mbits/sec  167             sender
[  5]   0.00-10.05  sec   806 MBytes   673 Mbits/sec                  receiver

iperf Done.
$ iperf3 -c 10.8.0.1 -R
Connecting to host 10.8.0.1, port 5201
Reverse mode, remote host 10.8.0.1 is sending
[  5] local 10.8.0.6 port 45438 connected to 10.8.0.1 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec   111 MBytes   934 Mbits/sec                  
[  5]   1.00-2.00   sec   112 MBytes   941 Mbits/sec                  
[  5]   2.00-3.00   sec   111 MBytes   930 Mbits/sec                  
[  5]   3.00-4.00   sec   111 MBytes   929 Mbits/sec                  
[  5]   4.00-5.00   sec   111 MBytes   930 Mbits/sec                  
[  5]   5.00-6.00   sec   108 MBytes   903 Mbits/sec                  
[  5]   6.00-7.00   sec   109 MBytes   915 Mbits/sec                  
[  5]   7.00-8.00   sec   109 MBytes   912 Mbits/sec                  
[  5]   8.00-9.00   sec   111 MBytes   931 Mbits/sec                  
[  5]   9.00-10.00  sec   112 MBytes   942 Mbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  1.08 GBytes   930 Mbits/sec   24             sender
[  5]   0.00-10.00  sec  1.08 GBytes   927 Mbits/sec                  receiver

iperf Done.

There is a performance improvement in sending (upload) and a performance regression in receiving (download). I think this is promising, as the cstruct->string change in Miragevpn was somewhat naïve and there is room for some low-hanging improvements. It is also interesting to me that the asymmetry in upload/download speeds goes from roughly 1:2 to roughly 2:3.

Note: the numbers were collected on a different machine than in #206, so unfortunately they are not comparable. If desired I can measure on that machine (which will require compiling for a different Debian release).

@reynir
Contributor Author

reynir commented Apr 8, 2024

Below is a flamegraph of miragevpn while I run iperf3 -R (i.e. testing download)
perf-flamegraph-no-cstruct-down

And while running iperf3 (i.e. testing upload)
perf-flamegraph-no-cstruct-up

reynir added 2 commits April 8, 2024 14:39
When decoding the operation, return the offset and length of the packet. We
avoid copying the packet only to copy it again without the first byte.
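
A minimal sketch of the idea in this commit message, not the actual Miragevpn code: the type `decoded` and the function `decode_operation` are hypothetical names, and the header layout (one leading byte with the opcode in the high 5 bits and the key id in the low 3 bits) is an assumption made for illustration. Instead of returning a fresh `String.sub` of the payload, the decoder returns where the payload starts and how long it is, so the caller can read it in place.

```ocaml
(* Hypothetical names; the one-byte header layout is an assumption. *)
type decoded = {
  opcode : int;  (* high 5 bits of the first byte *)
  key_id : int;  (* low 3 bits of the first byte *)
  off : int;     (* offset of the payload within the input string *)
  len : int;     (* length of the payload *)
}

(* Decode the leading byte and describe the payload by offset and length,
   without copying it. Returns None on empty input. *)
let decode_operation (buf : string) : decoded option =
  if String.length buf < 1 then None
  else
    let b = Char.code buf.[0] in
    Some { opcode = b lsr 3; key_id = b land 0x07;
           off = 1; len = String.length buf - 1 }

(* Copying variant for comparison: String.sub allocates a new string and
   copies every payload byte. *)
let decode_operation_copy (buf : string) : (int * int * string) option =
  if String.length buf < 1 then None
  else
    let b = Char.code buf.[0] in
    Some (b lsr 3, b land 0x07, String.sub buf 1 (String.length buf - 1))
```

A caller that needs the payload bytes can index into the original string at `off` and only copy at the point where a separate string is really required.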
@reynir
Contributor Author

reynir commented Apr 8, 2024

I pushed some commits that reduce the number of allocations and the amount of copying, and I can observe a small improvement in the bechamel benchmark. I struggled to observe improvements in the openvpn/iperf3 VM benchmark, so I ran that benchmark against main and observed that there is too much variance to make precise comparisons:

reynir@solsort:~$ iperf3 -c 10.8.0.1 -R
Connecting to host 10.8.0.1, port 5201
Reverse mode, remote host 10.8.0.1 is sending
[  5] local 10.8.0.10 port 43560 connected to 10.8.0.1 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec  82.7 MBytes   694 Mbits/sec                  
[  5]   1.00-2.00   sec  88.9 MBytes   746 Mbits/sec                  
[  5]   2.00-3.00   sec  88.9 MBytes   746 Mbits/sec                  
[  5]   3.00-4.00   sec  92.0 MBytes   771 Mbits/sec                  
[  5]   4.00-5.00   sec  90.7 MBytes   761 Mbits/sec                  
[  5]   5.00-6.00   sec  89.5 MBytes   751 Mbits/sec                  
[  5]   6.00-7.00   sec  90.4 MBytes   758 Mbits/sec                  
[  5]   7.00-8.00   sec  88.7 MBytes   744 Mbits/sec                  
[  5]   8.00-9.00   sec  88.2 MBytes   740 Mbits/sec                  
[  5]   9.00-10.00  sec  89.1 MBytes   748 Mbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.01  sec   892 MBytes   747 Mbits/sec   16             sender
[  5]   0.00-10.00  sec   889 MBytes   746 Mbits/sec                  receiver

iperf Done.
reynir@solsort:~$ iperf3 -i 0 -c 10.8.0.1 -R
Connecting to host 10.8.0.1, port 5201
Reverse mode, remote host 10.8.0.1 is sending
[  5] local 10.8.0.10 port 57322 connected to 10.8.0.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-10.00  sec  1010 MBytes   847 Mbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.02  sec  1013 MBytes   849 Mbits/sec   18             sender
[  5]   0.00-10.00  sec  1010 MBytes   847 Mbits/sec                  receiver

iperf Done.
reynir@solsort:~$ iperf3 -i 0 -c 10.8.0.1 -R
Connecting to host 10.8.0.1, port 5201
Reverse mode, remote host 10.8.0.1 is sending
[  5] local 10.8.0.10 port 44992 connected to 10.8.0.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-10.00  sec  1.19 GBytes  1.02 Gbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  1.19 GBytes  1.03 Gbits/sec    7             sender
[  5]   0.00-10.00  sec  1.19 GBytes  1.02 Gbits/sec                  receiver

iperf Done.
reynir@solsort:~$ iperf3 -i 0 -c 10.8.0.1 -R
Connecting to host 10.8.0.1, port 5201
Reverse mode, remote host 10.8.0.1 is sending
[  5] local 10.8.0.10 port 60600 connected to 10.8.0.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-10.00  sec  1.23 GBytes  1.05 Gbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  1.23 GBytes  1.06 Gbits/sec    9             sender
[  5]   0.00-10.00  sec  1.23 GBytes  1.05 Gbits/sec                  receiver

iperf Done.
reynir@solsort:~$ iperf3 -i 0 -c 10.8.0.1 -R
Connecting to host 10.8.0.1, port 5201
Reverse mode, remote host 10.8.0.1 is sending
[  5] local 10.8.0.10 port 54538 connected to 10.8.0.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-10.00  sec  1.15 GBytes   990 Mbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  1.15 GBytes   992 Mbits/sec   11             sender
[  5]   0.00-10.00  sec  1.15 GBytes   990 Mbits/sec                  receiver

iperf Done.

@reynir
Contributor Author

reynir commented Apr 8, 2024

For comparison, here are 4 runs of iperf3 with this branch:

reynir@solsort:~$ iperf3 -i 0 -c 10.8.0.1 -R
Connecting to host 10.8.0.1, port 5201
Reverse mode, remote host 10.8.0.1 is sending
[  5] local 10.8.0.11 port 52876 connected to 10.8.0.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-10.00  sec  1.13 GBytes   973 Mbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  1.14 GBytes   976 Mbits/sec   12             sender
[  5]   0.00-10.00  sec  1.13 GBytes   973 Mbits/sec                  receiver

iperf Done.
reynir@solsort:~$ iperf3 -i 0 -c 10.8.0.1 -R
Connecting to host 10.8.0.1, port 5201
Reverse mode, remote host 10.8.0.1 is sending
[  5] local 10.8.0.11 port 44456 connected to 10.8.0.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-10.00  sec  1.24 GBytes  1.07 Gbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  1.24 GBytes  1.07 Gbits/sec   21             sender
[  5]   0.00-10.00  sec  1.24 GBytes  1.07 Gbits/sec                  receiver

iperf Done.
reynir@solsort:~$ iperf3 -i 0 -c 10.8.0.1 -R
Connecting to host 10.8.0.1, port 5201
Reverse mode, remote host 10.8.0.1 is sending
[  5] local 10.8.0.11 port 35176 connected to 10.8.0.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-10.00  sec   888 MBytes   745 Mbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec   890 MBytes   747 Mbits/sec   11             sender
[  5]   0.00-10.00  sec   888 MBytes   745 Mbits/sec                  receiver

iperf Done.
reynir@solsort:~$ iperf3 -i 0 -c 10.8.0.1 -R
Connecting to host 10.8.0.1, port 5201
Reverse mode, remote host 10.8.0.1 is sending
[  5] local 10.8.0.11 port 45060 connected to 10.8.0.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-10.00  sec  1.23 GBytes  1.06 Gbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  1.23 GBytes  1.06 Gbits/sec   20             sender
[  5]   0.00-10.00  sec  1.23 GBytes  1.06 Gbits/sec                  receiver

iperf Done.
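
To put a number on the variance, here is a small sketch that computes the mean and sample standard deviation of the receiver bitrates from the `iperf3 -R` runs quoted above (five against main in the previous comment, four with this branch). The helper names `mean` and `sample_stddev` are illustrative only, and the Gbits/sec figures have been converted to Mbits/sec by hand.

```ocaml
(* Arithmetic mean of a non-empty list of floats. *)
let mean xs = List.fold_left ( +. ) 0. xs /. float_of_int (List.length xs)

(* Sample standard deviation (n - 1 in the denominator); needs >= 2 values. *)
let sample_stddev xs =
  let m = mean xs in
  let ss = List.fold_left (fun acc x -> acc +. ((x -. m) ** 2.)) 0. xs in
  sqrt (ss /. float_of_int (List.length xs - 1))

(* Receiver bitrates in Mbits/sec, copied from the runs quoted above. *)
let main_runs = [ 746.; 847.; 1020.; 1050.; 990. ]
let branch_runs = [ 973.; 1070.; 745.; 1060. ]

let () =
  Printf.printf "main:   mean %.0f, stddev %.0f\n"
    (mean main_runs) (sample_stddev main_runs);
  Printf.printf "branch: mean %.0f, stddev %.0f\n"
    (mean branch_runs) (sample_stddev branch_runs)
```

With these inputs the means come out at roughly 931 and 962 Mbits/sec, with standard deviations of roughly 129 and 151 Mbits/sec, so the difference between the two means is well within the run-to-run spread.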

Unfortunately, appending the empty string to another string results in a
potentially expensive copy.
@reynir
Contributor Author

reynir commented Apr 9, 2024

I discovered that "" ^ long_string is not cheap as it results in copying long_string, so in 0de5ad0 I check if state.linger is the empty string before prepending it to the input string.
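
A minimal sketch of that check, assuming a hypothetical helper `prepend_linger` where `linger` stands in for `state.linger` (the real code in 0de5ad0 may be structured differently). `Stdlib.( ^ )` always allocates a result of the combined length and copies both operands into it, so the empty case is handled before calling it.

```ocaml
(* Hypothetical helper: [linger] stands in for state.linger.
   Return [input] unchanged when there are no lingering bytes, so the
   copy performed by ( ^ ) is avoided in that case. *)
let prepend_linger ~(linger : string) (input : string) : string =
  if String.length linger = 0 then input else linger ^ input

let () =
  let long_string = String.make 4096 'x' in
  (* No lingering bytes: the very same string is returned, no copy. *)
  assert (prepend_linger ~linger:"" long_string == long_string);
  (* Lingering bytes: concatenation is still needed. *)
  assert (prepend_linger ~linger:"ab" "cd" = "abcd")
```

The first assertion uses physical equality (`==`) to show that no new string is allocated when `linger` is empty.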

This is a flamegraph before that commit. As you can see in Mirage.Engine.incoming_inner_* the function Stdlib.^ is taking a decent amount of time.
perf-flamegraph-no-cstruct-down2 5

Here is the flamegraph with that commit. The Stdlib.^ called in Miragevpn.Engine.incoming spends only slightly less time.
perf-flamegraph-no-cstruct-down3

So it seems that for TCP, at least under iperf3 load, we often have lingering bytes. Maybe it is worth looking into a better-suited data structure then.

@reynir
Contributor Author

reynir commented Apr 9, 2024

Oops, I inadvertently pushed the octets commit.

@reynir
Contributor Author

reynir commented Sep 30, 2024

Superseded by #279

@reynir reynir closed this Sep 30, 2024

3 participants