
iodine performance improvements and features #16

Open
wants to merge 171 commits into base: master
Conversation


@frekky frekky commented Nov 11, 2015

Overview

This fork of iodine was intended primarily to improve performance by using a TCP-like sliding window protocol to keep multiple "in flight" fragments both upstream and downstream. This allows greatly increased throughput on high-latency connections. To support this, the whole data/ping structure has been changed (details available in doc/proto_00000800.txt).

Some limited testing has been conducted, the results of which can be found in the updated man page.

This has been almost fully tested on Linux amd64 and compiles without warnings; no other platforms have been tested yet. Due to some hacks needed to get millisecond timer precision on Windows (see windows.h for the gettimeofday() and struct timeval macros), various functionality may not work as expected there.

Unit tests have been updated to suit changes to the main code base, and a basic sliding window test was created which tests some of the essential functions.

Issues

This fork is still in development, and I plan to keep it up to date with the main iodine repository as much as possible. There are probably lots of currently undiscovered bugs and certainly lots of problems with intolerant DNS servers which cause performance and connectivity issues.

To help diagnose these problems, I strongly recommend that you try -V 5 to print connection statistics such as the number of queries per second, fragments lost, failures, timeouts, round-trip time etc.

  • High query rates (such as >50/sec) will probably result in DNS servers dropping queries or responding with errors or invalid replies.
  • If you get total connection failures, try changing the encoding to something that uses fewer unusual characters (such as base64 or base32), or try a different DNS query type.
  • Reduce the upstream and downstream window sizes (using the -w and -W options) from the default values to something more suitable for your connection: with a lower round-trip time, the window size does not have to be as large to get the same throughput.
  • If the connection succeeds but data stops flowing while DNS queries are still being answered correctly (check the stats printout for this information), rebuild iodine and iodined with make debug. Turn on more debugging with -DDDDDD (use fewer Ds if your terminal lags from the excessive output) and capture the debug output on both iodined and iodine around the time the problem started.
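The debugging workflow above might look like the following in practice. This is a sketch, not an exact recipe: the bin/ paths follow iodine's usual build layout, and tunnel.example.com and 10.0.0.1 are placeholders for your own domain and tunnel IP.

```shell
# Rebuild both binaries with debugging enabled
make debug

# Server side: -f runs in the foreground; extra Ds increase debug verbosity
./bin/iodined -f -DDDD 10.0.0.1 tunnel.example.com

# Client side: -V 5 prints connection statistics every 5 seconds
./bin/iodine -f -V 5 -DDDD tunnel.example.com
```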

Features

Most of the important feature additions are listed here.

  • Guaranteed data arrival (no protection from corruption, but if a DNS query fails, iodine[d] will re-send fragments as required)
  • Command line options have been added to adjust timeouts and sliding window behaviour.
  • Multiple nameservers can be specified to reduce load on a single DNS server.
  • Lazy mode now supports any number (within reasonable limits) of pending queries waiting at the server, adjusted using the downstream window timeout option
  • Client-side statistics are reported at a regular interval in seconds (specified with the -V option)
  • More fine-grained client control over data compression, server query timeout and other important connection parameters
  • Client side minimum send interval as an attempt to rate-limit connections if using DNS servers which drop queries under high volume
  • Server timeout is adjusted automatically based on target timeout and the round-trip time
  • More fine-grained automatic adjustment of target timeout and immediate mode switching.

I may have forgotten to mention some features here, but this should cover most of them.

Protocol Overview

Due to the nature of the sliding window protocol, the entire data transfer protocol needed to be rewritten. The new protocol (800) is detailed in the docs, and although the basic DNS encapsulation is the same, the headers have been more or less completely rearranged. Upstream and downstream are functionally equivalent at the sliding window layer, where new data packets (i.e. from the tun device on either client or server) are treated as follows:

  1. Data is optionally compressed (depending on the user-specified upstream/downstream compression flags).
  2. The raw or compressed data is then split into a number of fragments depending on the user's maximum fragsize (calculated beforehand during the handshake process).
  3. Each fragment is added to the outgoing window buffer (the same for both downstream and upstream) and assigned a sequence ID from 0 to 255. The window buffer maintains a pointer to the fragment at the "start" of the sending window, and only windowsize fragments, counted from the start of the window, may be in flight at once.
  4. The fragments are sent in order from the start of the window as described above.
  5. When the fragments are received at the other end, they are placed in the receiving window buffer at an offset determined by their sequence ID. This way, out-of-order fragments (very common with load-balanced DNS servers) can be handled without dropping them.
  6. The receiving end checks whether it has received the starting fragment, the final fragment, and all fragments in between; if it has, the full data packet is reassembled and the pointer to the start of the next expected chunk is moved forward by the number of fragments.
  7. The received packet is optionally uncompressed and written to the tun device.
  8. The receiving end immediately ACKs each fragment by its sequence ID using either a ping or a data packet (both have space for an ACK).
  9. When the sending side receives the ACK for a fragment, that fragment is marked in the sending buffer as successfully received; based on this, the window can be moved forward and the next few fragments sent.
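The sender-side bookkeeping in steps 3, 4 and 9 can be sketched roughly as below. This is a hypothetical illustration: the names (win_t, win_ack, win_may_send) are made up for this sketch and are not iodine's actual API.

```c
#include <stdbool.h>
#include <stdint.h>

#define SEQ_MAX 256 /* sequence IDs wrap at 0..255 */

typedef struct {
    bool acked[SEQ_MAX]; /* has the fragment with this seq ID been ACKed? */
    uint8_t start;       /* seq ID at the start of the sending window */
    uint8_t windowsize;  /* max fragments in flight (must be < 256) */
} win_t;

/* Mark a fragment as received by the other end, then slide the window
 * start past any contiguous run of ACKed fragments. Because windowsize
 * is < 256, at most windowsize slots can be ACKed, so the loop ends. */
static void win_ack(win_t *w, uint8_t seq) {
    w->acked[seq] = true;
    while (w->acked[w->start]) {
        w->acked[w->start] = false;          /* free slot for a future seq ID */
        w->start = (uint8_t)(w->start + 1);  /* wraps naturally at 256 */
    }
}

/* May fragment `seq` be (re)sent right now? Only fragments within
 * windowsize of the window start are eligible. */
static bool win_may_send(const win_t *w, uint8_t seq) {
    uint8_t off = (uint8_t)(seq - w->start); /* distance with wraparound */
    return off < w->windowsize;
}
```

Note how an out-of-order ACK (step 5) is simply recorded without moving the window; the window only slides once the fragment at its start has been ACKed.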

Other Information

Any other information is available in the code (I've put in a reasonable amount of hopefully helpful comments so it shouldn't be too hard to understand).

Feel free to ask any questions or make comments on any of the changes. I've done quite a lot of refactoring to clarify various parts of the code or make things simpler.

Thanks for all the great work in making something like iodine, and thanks again for making it open source. It's truly been a pleasure working with it and I hope to be able to contribute something to this project.

Barak A. Pearlmutter and others added 30 commits April 23, 2014 09:06
Fixed up client side compile issues
Removed old packet handling code - TODO: use sliding window buffer instead

jiquera commented Mar 28, 2017

I haven't tried your code yet, but I think it would be beneficial to relax the "guaranteed arrival" constraint. In general it is a bad idea to do guaranteed delivery in tunnels: http://sites.inka.de/sites/bigred/devel/tcp-tcp.html

Long story short:

  • either a protocol has recovery of its own (TCP) and doesn't need the guarantee
  • or it doesn't have recovery (UDP), but users won't count on it anyway

#21 confirms this as well

Would it work to put a timeout on the window packets and just move on after it expires?

@Anime4000 are you using the 9.9 NDIS5 TAP driver? I'd like to make iodine work with the newer one... started debugging this weekend, do you have more knowledge/info on that?

@Anime4000

@jiquera I use the older OpenVPN TAP driver (10Mbps).
Second, can you change -R to a SOCKS tunnel, like:
plink -ssh [email protected] -P 22 -pw 123456 -C -T -D 127.0.0.1:8888 -N


frekky commented Apr 2, 2017

@jiquera Currently there is a timeout on the window fragments (each IP packet that is received on the TUN/TAP device is split into several "fragments" which are sent with the window), after which they are resent (rather than being dropped). It would be quite simple to just drop the fragments if they time out, however this would mean that the entire IP packet that the lost fragment is a part of would also be dropped (since in most cases the IP packet must be reconstructed from several fragments).

Since the sliding-window protocol does not detect fragment truncation or corruption (assuming the iodine data header is valid), it doesn't actually suffice to tunnel TCP connections reliably (as can be seen by testing the TCP forwarding feature), but it would be suitable for UDP. In that case the recommended approach would be to use iodine to tunnel UDP "connections" over DNS and then run something like OpenVPN on top to provide encryption and the IP-level TUN/TAP device (which might also solve the Windows compatibility problem, rather than modifying iodine to use the newer TUN/TAP driver).

I'm not an expert on the operation of TCP, but given what you have pointed out, it seems that TCP copes better (i.e. higher throughput/lower latency) with some packet loss than with high latency and high jitter.


jiquera commented Apr 3, 2017

@frekky I can imagine two timeouts: one for resending and a bigger one for dropping. If you keep trying to resend, TCP will detect a packet drop (it's taking too long) and start resending itself, filling up the whole queue with redundant packets. Also, tools that use UDP can have their own packet-resend systems, which basically leads to the same result.

Due to iodine's heavy fragmentation, it's indeed likely that packets will be dropped when the connection gets slow. I'm not sure what will work better, drop or no drop... but I think it's worth playing with.
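The two-timeout idea could be sketched as follows. The names and thresholds here are hypothetical, not iodine's actual code: a shorter timeout triggers a resend, while a longer one gives up and drops the fragment so the tunnelled protocol's own recovery (e.g. TCP retransmission) takes over.

```c
/* Decide what to do with an unacknowledged fragment, given how long
 * ago it was sent. All times are in milliseconds. */
typedef enum { FRAG_WAIT, FRAG_RESEND, FRAG_DROP } frag_action;

static frag_action frag_check(long now_ms, long sent_ms,
                              long resend_ms, long drop_ms) {
    long age = now_ms - sent_ms;
    if (age >= drop_ms)
        return FRAG_DROP;    /* give up: let the inner protocol recover */
    if (age >= resend_ms)
        return FRAG_RESEND;  /* still worth another try over DNS */
    return FRAG_WAIT;        /* ACK may yet arrive; do nothing */
}
```

Dropping a fragment would of course discard the whole IP packet it belongs to, as noted above, so drop_ms would need to be comfortably larger than resend_ms.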

As for the TAP driver: iodine needs the old TAP driver, while OpenVPN nowadays uses a newer one, requiring you to install two different driver versions (which is a bit of a hassle on Windows). It would be nice if we could just use one driver :) but since you're not a Windows user I suggest you do yourself a big favor and don't worry about it ;-)


Masaq- commented Apr 4, 2017

I proposed #21 because it improves legacy mode performance on every jittery network I have tried; however, I doubt the method is beneficial here.

See 9425181 which is an adaptation of #21 to drop packets instead of filling the upstream window.

The -W0 option eliminates bufferbloat, but the performance of -W0 and -W2 is the same.

@traverseda

Wow, this is an old pull request. Still, some better performance would be nice...

@megapro17 megapro17 mentioned this pull request Oct 24, 2020

siraben commented Mar 8, 2022

What's the status of this PR?

@traverseda

Not quite 10 years yet Yarrick, want to take another look?
