Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move parts of the TunnelCommunity to Rust #1119

Closed
qstokkink opened this issue Jan 26, 2023 · 8 comments
Closed

Move parts of the TunnelCommunity to Rust #1119

qstokkink opened this issue Jan 26, 2023 · 8 comments
Assignees
Labels
priority: low Should be addressed at some point in the future

Comments

@qstokkink
Copy link
Collaborator

qstokkink commented Jan 26, 2023

Original discussion: Tribler/tribler#6679 (partially reposted here to make it a formal IPv8 issue)

Summary:
The throughput of anonymized Tribler downloads can be almost doubled by implementing TunnelCrypto.encrypt_str and TunnelCrypto.decrypt_str in C.

A proof of concept - credit to @hbiyik - exists here:
https://github.com/qstokkink/py-ipv8/tree/compiled_evp

One change still needs to be made to the proof of concept: in case compilation fails the exisiting pure-Python implementation should be used as a fallback. Secondarily, the concept code could use a cleanup.

[EDIT] The above no longer works and there are no plans to fix the code.

@qstokkink qstokkink added the priority: low Should be addressed at some point in the future label Jan 26, 2023
@egbertbouman
Copy link
Member

egbertbouman commented Aug 18, 2023

Considering how promising changing the crypto sounded, I did some testing as well. I looked at the difference between the current crypto (ChaCha20-Poly1305 using libnacl) and the proof of concept mentioned here. Unfortunately, I can't reproduce any significant increase in tunnel throughput. In fact, even if I skip the crypto altogether, it doesn't really do all that much (the increase in throughput varies between 8-18%).

The discussion mentioned that testing was done in the wild, but I'm wondering whether this testing was done with an exit node dedicated to this test. If that's the case, I'm not sure how fair this comparison is considering our real-world exit nodes are under much heavier load.

Having said that, I'm a big proponent of improving the tunnel performance, although I prefer a more drastic approach where we stop using Python altogether when processing tunnel data. I'm looking into moving some of the tunnel stuff from Python to Rust, while keeping the control logic in Python. The basic idea is that a separate Rust thread would own the IPv8 socket and tunnel data would never leave Rust. Normal community traffic (including tunnel control messages) would be handed off to communities in Python by acquiring the GIL from Rust and calling a Python callback. This does mean that we're on the wrong thread when calling the packet handler, so we can't call any asyncio function or stuff will break. I'm still working on a solution for this by only switching threads when the packet handler is async.

I've implemented a simple Rust library that just transfers everything between the Rust socket and Python. Overall it's looking that the Rust socket will be about as fast as our current Python socket (although CPU is a bit higher). That means that we can do tunnel crypto in Rust without taking a performance hit on our pure Python communities. We will have to change the EVA code a bit and stop defining message handlers as async where it isn't needed (that tends to hurt performance normally, but with the Rust module it will be significantly worse). Implementing a more advanced prototype will take a while, since it's a pretty big job.

@qstokkink
Copy link
Collaborator Author

I think that - at some point in time - the O.P. code worked in the wild without any experimental servers. However, I also can't get it to work anymore. Then again, if we're switching away from Python anyway: I guess it's not worth investigating 🤷 I'll update the post to reflect that.

I'm all for moving performance-critical code out of Python. I believe it's really the only reasonable way to still significantly improve performance (though I'm not sure what Python 3.12 will bring to the table with subinterpreters). Personally, I'm more a fan of C because it is slightly faster and the Rust community is filled with drama and marketing B.S. (which I see as a long-term risk to the programming language's survival). However, if I'm not the one doing the implementation, I'm not complaining.

I also experimented with handing off data from different endpoint implementations (in C and multi-processed) in the past and my results were the same as yours: about equal - not worth it (on its own) to introduce external languages.

Regarding async handling: @kozlovsky recently uncovered database calls in the (Tribler) message handlers that were blocking the socket communication for ~20 seconds. It would be nice if we could do handling in dedicated I/O threads with the Rust implementation (somehow?). That would solve two issues at once - if it is indeed possible. What do you think?

Ending on an administrative notice: if you've started work on this, I'll assign you to the issue.

@egbertbouman
Copy link
Member

I think that - at some point in time - the O.P. code worked in the wild without any experimental servers. However, I also can't get it to work anymore. Then again, if we're switching away from Python anyway: I guess it's not worth investigating 🤷 I'll update the post to reflect that.

Yes, but the dramatic performance improvement (double the speed) was still measured with a dedicated exit node, right? According to the discussion, testing on localhost only showed a 10% speed inprovement (that I can't reproduce). Just trying to manage the expectations of how fast a Rust implementation would be. If the very least that is expected here is doubling the throughput, I feel like I'm setting myself up for failure.

Personally, I'm more a fan of C because it is slightly faster and the Rust community is filled with drama and marketing B.S. (which I see as a long-term risk to the programming language's survival).

If I didn't have to program the module myself, I would completely agree. I simply chose Rust because I think it's the easiest way to go about it. In the past I've implemented a C++ Python module, and my experience was... not great. And I won't even mention the libswift maintenance disaster. In any case, my approach would be to have the least amount of code as possible, and keep things in Python wherever we can. If we decide to move away from Rust in the future, I don't think that would be such a big deal. At this point -with choosing Rust- I'm just trying to do the most amount of work in the smallest amount of time, so I can have some idea of what any performance benefit would be.

I also experimented with handing off data from different endpoint implementations (in C and multi-processed) in the past and my results were the same as yours: about equal - not worth it (on its own) to introduce external languages.

Yes, and that's not a bad thing. I'm just trying to "steal" some IPv8 packets without causing performance issues in other communities. Now that I have the tunnel data in Rust, I can hopefully hand them over to libtorrent as fast as possible.

It would be nice if we could do handling in dedicated I/O threads with the Rust implementation (somehow?). That would solve two issues at once - if it is indeed possible. What do you think?

Well, sure. Although if you still do Python stuff (i.e., require the GIL) I wouldn't bother with Rust. Besides, that could just as easily be done separately. Even in a different language, like C if you will 😄
BTW, we used to have a dedicated database thread for this (a normal Python thread). I actually thought that's what we were doing: moving calls like this to a separate thread (using run_in_executor).

@qstokkink
Copy link
Collaborator Author

Yes, but the dramatic performance improvement (double the speed) was still measured with a dedicated exit node, right?

To be completely honest: I can't remember. However, that seems likely, given that it doesn't work anymore.

If the very least that is expected here is doubling the throughput, I feel like I'm setting myself up for failure.

If we don't achieve roughly double the speed, we might have to reconsider maintaining non-Python code. As you mentioned, the libswift disaster is also in my memory and that is my main driving force to keep non-Python code out of the repo. If there's no real benefit (speed or otherwise), we shouldn't do this.

I'm just trying to "steal" some IPv8 packets without causing performance issues in other communities.

You could also consider assigning a separate socket for tunnel community traffic to avoid that problem. Perhaps that's too much for the initial prototype though.

I actually thought that's what we were doing: moving calls like this to a separate thread (using run_in_executor).

I think it's only used in two places but many more parts of the Tribler code are accessing the database on the main thread. It would be nice to have some design pattern for threaded db access provided by IPv8.

@egbertbouman
Copy link
Member

egbertbouman commented Aug 18, 2023

If we don't achieve roughly double the speed, we might have to reconsider maintaining non-Python code.

Like I said before I'm trying to manage expectations because I don't think it's fair to suggest that the speed can be doubled by a simple crypto library replacement: I'm just not seeing that. And believe me I tried.

You could also consider assigning a separate socket for tunnel community traffic to avoid that problem. Perhaps that's too much for the initial prototype though.

I already tried that, but the performance was worse 😢

@synctext
Copy link
Member

Difficult issue...
Currently it's taking 8 months to release a new stable Tribler version. Performance is great, but stability might be even more important for our users today.
Currently the roadmap is simply fixing stability and search. A list of next ToDo for our users would be great.

@egbertbouman
Copy link
Member

@synctext Fair enough. I'll put a pin in this for now.

@egbertbouman egbertbouman removed their assignment Aug 18, 2023
@egbertbouman egbertbouman self-assigned this Oct 9, 2023
@egbertbouman egbertbouman changed the title Provide (partial) C implementation for TunnelCrypto Move parts of the TunnelCommunity to Rust Oct 18, 2023
@egbertbouman
Copy link
Member

With #1266, IPv8 now optionally uses the ipv8-rust-tunnels module while running an exit node. Now that the IPv8 part is done, I'll move on to updating Tribler (Tribler/tribler#7684).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
priority: low Should be addressed at some point in the future
Development

No branches or pull requests

3 participants