-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move parts of the TunnelCommunity
to Rust
#1119
Comments
Considering how promising changing the crypto sounded, I did some testing as well. I looked at the difference between the current crypto (ChaCha20-Poly1305 using libnacl) and the proof of concept mentioned here. Unfortunately, I can't reproduce any significant increase in tunnel throughput. In fact, even if I skip the crypto altogether, it doesn't really do all that much (the increase in throughput varies between 8-18%). The discussion mentioned that testing was done in the wild, but I'm wondering whether this testing was done with an exit node dedicated to this test. If that's the case, I'm not sure how fair this comparison is considering our real-world exit nodes are under much heavier load. Having said that, I'm a big proponent of improving the tunnel performance, although I prefer a more drastic approach where we stop using Python altogether when processing tunnel data. I'm looking into moving some of the tunnel stuff from Python to Rust, while keeping the control logic in Python. The basic idea is that a separate Rust thread would own the IPv8 socket and tunnel data would never leave Rust. Normal community traffic (including tunnel control messages) would be handed off to communities in Python by acquiring the GIL from Rust and calling a Python callback. This does mean that we're on the wrong thread when calling the packet handler, so we can't call any asyncio function or stuff will break. I'm still working on a solution for this by only switching threads when the packet handler is async. I've implemented a simple Rust library that just transfers everything between the Rust socket and Python. Overall it's looking that the Rust socket will be about as fast as our current Python socket (although CPU is a bit higher). That means that we can do tunnel crypto in Rust without taking a performance hit on our pure Python communities. We will have to change the EVA code a bit and stop defining message handlers as async where it isn't needed (that tends to hurt performance normally, but with the Rust module it will be significantly worse). Implementing a more advanced prototype will take a while, since it's a pretty big job. |
I think that - at some point in time - the O.P. code worked in the wild without any experimental servers. However, I also can't get it to work anymore. Then again, if we're switching away from Python anyway: I guess it's not worth investigating 🤷 I'll update the post to reflect that. I'm all for moving performance-critical code out of Python. I believe it's really the only reasonable way to still significantly improve performance (though I'm not sure what Python 3.12 will bring to the table with subinterpreters). Personally, I'm more a fan of C because it is slightly faster and the Rust community is filled with drama and marketing B.S. (which I see as a long-term risk to the programming language's survival). However, if I'm not the one doing the implementation, I'm not complaining. I also experimented with handing off data from different endpoint implementations (in C and multi-processed) in the past and my results were the same as yours: about equal - not worth it (on its own) to introduce external languages. Regarding async handling: @kozlovsky recently uncovered database calls in the (Tribler) message handlers that were blocking the socket communication for ~20 seconds. It would be nice if we could do handling in dedicated I/O threads with the Rust implementation (somehow?). That would solve two issues at once - if it is indeed possible. What do you think? Ending on an administrative notice: if you've started work on this, I'll assign you to the issue. |
Yes, but the dramatic performance improvement (double the speed) was still measured with a dedicated exit node, right? According to the discussion, testing on localhost only showed a 10% speed inprovement (that I can't reproduce). Just trying to manage the expectations of how fast a Rust implementation would be. If the very least that is expected here is doubling the throughput, I feel like I'm setting myself up for failure.
If I didn't have to program the module myself, I would completely agree. I simply chose Rust because I think it's the easiest way to go about it. In the past I've implemented a C++ Python module, and my experience was... not great. And I won't even mention the libswift maintenance disaster. In any case, my approach would be to have the least amount of code as possible, and keep things in Python wherever we can. If we decide to move away from Rust in the future, I don't think that would be such a big deal. At this point -with choosing Rust- I'm just trying to do the most amount of work in the smallest amount of time, so I can have some idea of what any performance benefit would be.
Yes, and that's not a bad thing. I'm just trying to "steal" some IPv8 packets without causing performance issues in other communities. Now that I have the tunnel data in Rust, I can hopefully hand them over to libtorrent as fast as possible.
Well, sure. Although if you still do Python stuff (i.e., require the GIL) I wouldn't bother with Rust. Besides, that could just as easily be done separately. Even in a different language, like C if you will 😄 |
To be completely honest: I can't remember. However, that seems likely, given that it doesn't work anymore.
If we don't achieve roughly double the speed, we might have to reconsider maintaining non-Python code. As you mentioned, the
You could also consider assigning a separate socket for tunnel community traffic to avoid that problem. Perhaps that's too much for the initial prototype though.
I think it's only used in two places but many more parts of the Tribler code are accessing the database on the main thread. It would be nice to have some design pattern for threaded db access provided by IPv8. |
Like I said before I'm trying to manage expectations because I don't think it's fair to suggest that the speed can be doubled by a simple crypto library replacement: I'm just not seeing that. And believe me I tried.
I already tried that, but the performance was worse 😢 |
Difficult issue... |
@synctext Fair enough. I'll put a pin in this for now. |
TunnelCrypto
TunnelCommunity
to Rust
With #1266, IPv8 now optionally uses the |
Original discussion: Tribler/tribler#6679 (partially reposted here to make it a formal IPv8 issue)
Summary:
The throughput of anonymized Tribler downloads can be almost doubled by implementing
TunnelCrypto.encrypt_str
andTunnelCrypto.decrypt_str
in C.A proof of concept - credit to @hbiyik - exists here:https://github.com/qstokkink/py-ipv8/tree/compiled_evpOne change still needs to be made to the proof of concept: in case compilation fails the exisiting pure-Python implementation should be used as a fallback. Secondarily, the concept code could use a cleanup.[EDIT] The above no longer works and there are no plans to fix the code.
The text was updated successfully, but these errors were encountered: