Replies: 7 comments 5 replies
-
If there were a way to maintain the protocol logic separately from the IO code, it would remove one of the major cons of option 1.
An angle for further exploration: it may be possible to use this pattern to preserve the flexibility of doing synchronous IO inside the async API, if you decide that's preferred for performance, since async vs sync IO is abstracted at a layer below the public API.
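As a rough sketch of that idea (every name here is hypothetical, not part of nats.rs), the protocol logic can be written without any knowledge of sockets, so that either a sync or an async transport can drive it:

```rust
// Hypothetical, IO-free protocol state machine; names invented for illustration.
struct ProtoMachine {
    outbound: Vec<u8>, // bytes waiting to be written to the wire
}

enum ProtoEvent {
    Ping,
    Msg { subject: String, payload: Vec<u8> },
}

impl ProtoMachine {
    fn new() -> Self {
        Self { outbound: Vec::new() }
    }

    // Feed bytes read from the socket; returns parsed protocol events.
    fn feed(&mut self, _incoming: &[u8]) -> Vec<ProtoEvent> {
        Vec::new() // parsing elided in this sketch
    }

    // Bytes the IO layer should write next (blocking write or `.await`, its choice).
    fn take_outbound(&mut self) -> Vec<u8> {
        std::mem::take(&mut self.outbound)
    }
}
```

The async client owns the socket and awaits reads/writes, while a sync client does blocking reads/writes; the protocol code in between is shared, which is what keeps the sync-IO-under-async option open.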
-
Another aspect of the API that is affected by async is callbacks (anywhere a handler or closure is used as a parameter).
If you need to send responses in the callback, it needs to be an async closure:
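For illustration only (this is not the actual nats.rs API), the usual shape is a closure that returns a future:

```rust
use std::future::Future;

// Hypothetical types, invented for this sketch.
struct Message {
    payload: Vec<u8>,
}

struct Subscription;

impl Subscription {
    // An async handler is accepted as a closure returning a Future.
    async fn with_handler<F, Fut>(&self, mut handler: F)
    where
        F: FnMut(Message) -> Fut,
        Fut: Future<Output = ()>,
    {
        // A real implementation would loop over received messages.
        let msg = Message { payload: Vec::new() };
        handler(msg).await;
    }
}
```

The call site then becomes something like `sub.with_handler(|msg| async move { /* publish a response here */ }).await`, which is where the extra ceremony compared to a plain closure shows up.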
(Of course, a public async API could support both sync and async callbacks.) Error callbacks and other callbacks in Options also need to be addressed.
I'm sure there are ways to make this more ergonomic. Perhaps macros and additional library support can help. The Subscription preprocessor can also be done with a trait:
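A minimal sketch of the trait-based version, with invented names, under the assumption that the preprocessor runs before a message is delivered:

```rust
// Hypothetical trait standing in for the Subscription preprocessor idea.
trait Preprocess {
    // Return None to drop the message, Some to deliver it (possibly transformed).
    fn preprocess(&mut self, msg: Vec<u8>) -> Option<Vec<u8>>;
}

struct Subscription<P> {
    preprocessor: P,
}

impl<P> Subscription<P>
where
    P: Preprocess + Send + 'static,
{
    fn handle_incoming(&mut self, raw: Vec<u8>) -> Option<Vec<u8>> {
        // Run the user-supplied preprocessing step before handing the message out.
        self.preprocessor.preprocess(raw)
    }
}
```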
In this case I don't think the ergonomics are a problem.
-
^ some thoughts above @derekcollison @Jarema

Oops, buried the lead: thank you @Jarema for kicking off the discussion. I'm glad your team is making this issue a priority.

My humble contribution to the discussion (Edit: same as the link in the initial post).

One of the known unknowns in this discussion is the performance of switching to async. If you already have a multi-threaded, multi-connection benchmark and you want a quick idea of the performance implications of a hypothetical top-to-bottom async Rust client, you should be able to plug that library into the benchmark with very little effort. I haven't done any performance benchmarking or performance tuning. As far as I know, it's very close to functionally complete (for an async-only and tokio-only NATS client), but it hasn't had real-world usage yet.
-
I'm with @stevelr on the point that a stronger separation of protocol and IO will make maintaining a sync and an async NATS client way easier. But I would suggest separating the current crate into several for this purpose:
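Purely as an illustration of the idea (the crate names below are invented, not a concrete proposal), the split could look like a small protocol crate with thin runtime-specific crates on top:

```rust
// Hypothetical workspace layout, names invented for illustration:
//
//   nats-proto  - NATS protocol parsing/serialization, no IO, no runtime
//   nats        - blocking client built on nats-proto and std::net
//   nats-tokio  - async client built on nats-proto and tokio
//
// A runtime-specific crate then stays thin, e.g. in the hypothetical nats-tokio:
async fn connect(addr: &str) -> std::io::Result<tokio::net::TcpStream> {
    // Only the IO and runtime glue lives here; the protocol logic is shared.
    tokio::net::TcpStream::connect(addr).await
}
```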
This comes with several benefits, but of course there are negative points as well.
-
Some basic benchmarks:

@stevelr async rewrite (binary size of a basic app using Steve's fork: 3.1M):
- subscribe stats for 10 000 000 messages, published with `nats bench --msgs 10000000 --pub 10 "events.>"`
- publish stats for 10 000 000 messages: total time 40.756654458s
- two instances of async nats, one pub, one sub, 10 000 000 messages

nats.rs (current sync client):
- subscribe stats for 10 000 000 messages, published with `nats bench --msgs 10000000 --pub 10 "events.>"`
- publish stats for 10 000 000 messages: total time 887.570083ms
- two instances of sync nats, one pub, one sub, 10 000 000 messages

The above shows that subscriptions are pretty close. Publishes, on the other hand, show some evident slowdowns.

NOTE: the sync publisher was so fast that the subscriber was dropping some messages (less than 1%).
-
Looking forward to the results of your architecture investigation and any info you can share about the availability of an async client library. IMHO the JetStream APIs could come later if that makes it easier to get an alpha release out for testing. It would be useful even with just basic pub/sub and client auth (JWT/seed).
-
Async Refactor
The current async code for the NATS client, contained in the `asynk` module, was a simple and easy way to enable async code in the Rust NATS client. It was a compromise in a few regards that affects our users.
This discussion was created to align on what the options are and how we want to push the topic forward.
Current approach and its limits
The `asynk` module is based on the `blocking` crate, which is a simple thread pool providing `async` semantics. Our way of using it leads to a few issues:
The above is not only inefficient but can also lead to unexpected blocking: if the limit of the thread pool is reached, new subscriptions will hang until other subscriptions are shut down.
This has raised quite a few tickets (#226).
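For context, the pattern in question looks roughly like this, a simplified sketch (not the actual `asynk` code) built on the `blocking` crate's `unblock`:

```rust
use blocking::unblock;
use std::sync::Arc;

// Simplified stand-in for a sync subscription whose `next()` blocks the thread.
struct SyncSubscription;

impl SyncSubscription {
    fn next(&self) -> Option<Vec<u8>> {
        // blocks the calling thread until a message arrives (elided here)
        Some(Vec::new())
    }
}

async fn next_async(sub: Arc<SyncSubscription>) -> Option<Vec<u8>> {
    // Each pending call occupies one thread from the shared `blocking` pool;
    // once the pool limit is reached, further calls queue up and appear to hang.
    unblock(move || sub.next()).await
}
```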
Requirements for the solution
As this is a NATS client, it has a few requirements to satisfy:
- a `sync` implementation for no-runtime, purely sync scenarios

API for flavors
There are different ways to provide users with a way to choose which flavor to use (sync, tokio, async-std).
From all of the above, the `feature` flavor seems to be the most idiomatic. It is used by major Rust crates.
`async-std` example (used to set up compatibility mode):
Unless better options are proposed or blockers arise during implementation, this direction will be taken.
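Roughly, the feature-flag approach gates the runtime-specific pieces at compile time. A minimal sketch, with invented feature names (and assuming the flavors are mutually exclusive):

```rust
// Hypothetical feature names; the real ones would be decided during implementation.
// async-std's compatibility mode would additionally be enabled through a cargo
// feature on the async-std dependency (e.g. its `tokio1` feature).

#[cfg(feature = "tokio_runtime")]
pub async fn connect(addr: &str) -> std::io::Result<tokio::net::TcpStream> {
    tokio::net::TcpStream::connect(addr).await
}

#[cfg(feature = "async_std_runtime")]
pub async fn connect(addr: &str) -> std::io::Result<async_std::net::TcpStream> {
    async_std::net::TcpStream::connect(addr).await
}

#[cfg(feature = "sync")]
pub fn connect(addr: &str) -> std::io::Result<std::net::TcpStream> {
    std::net::TcpStream::connect(addr)
}
```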
Solution options
There are a few ways we can implement the new API.
1. Async rewrite
Rewrite the whole library to `async`. That simplifies the architecture of the client and also makes it "native" for users that are async (is it safe to assume that most use cases of this lib are async?).
pros
cons
`async` (benchmarks needed)
For the latter, a compatibility mode of the runtime could probably be used, but that often creates two runtime executors (at least it seems so in the case of async-std; to be confirmed), which is something we want to avoid. Another approach is to have a separate executor for tasks and make it swappable, though that's not so straightforward, as the TCPSocket has to be set up per runtime (if compat mode spawns a second runtime); still, it's viable.
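A swappable executor usually comes down to a tiny spawn abstraction along these lines (a sketch with invented names, ignoring the per-runtime TCPSocket setup mentioned above):

```rust
use std::future::Future;
use std::pin::Pin;

type BoxFuture = Pin<Box<dyn Future<Output = ()> + Send + 'static>>;

// The client would only ever spawn tasks through this trait, so the executor
// behind it can be swapped per runtime flavor.
trait Spawner: Send + Sync {
    fn spawn(&self, fut: BoxFuture);
}

struct TokioSpawner;

impl Spawner for TokioSpawner {
    fn spawn(&self, fut: BoxFuture) {
        // Requires a tokio runtime to be running; the task is detached.
        let _ = tokio::spawn(fut);
    }
}
```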
2. Sync core, async/sync API
This approach leverages the current sync API for the TCPSocket and NATS proto handling but exposes both a sync and an async API to users.
pros
cons
(`block_on`)
One of the unwanted issues with this approach is the potential temptation to refactor the proto code to enable cleaner sync-async boundaries and better performance.
That is because the current codebase has a potential for lock contention (especially after introducing client-side slow consumers detection) and also has multiple communication patterns between API and proto - messages are sent via channels, everything else is function calls.
Leveraging a simple actor model and embracing Go's "share memory by communicating" style could address both. Such an approach might be needed to get better sync performance if the amount of locking is an actual factor (to be shown by benchmarks).
Nonetheless, this approach should have a higher performance ceiling.
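A rough sketch of the actor-style direction (simplified, names invented): one loop owns the connection state, and both the sync and async APIs talk to it over channels instead of sharing locks.

```rust
use std::sync::mpsc;
use std::thread;

// Commands sent from the public API (sync or async) to the core actor.
enum Command {
    Publish { subject: String, payload: Vec<u8> },
    Shutdown,
}

fn spawn_core() -> mpsc::Sender<Command> {
    let (tx, rx) = mpsc::channel::<Command>();
    thread::spawn(move || {
        // Single owner of the connection state: no locks, just message passing.
        for cmd in rx {
            match cmd {
                Command::Publish { subject, payload } => {
                    // a real core would write the PUB frame to the socket here
                    let _ = (subject, payload);
                }
                Command::Shutdown => break,
            }
        }
    });
    tx
}

fn main() {
    let core = spawn_core();
    // A sync API calls send() directly; an async API would reach the same core
    // through an async channel (or an async-aware wrapper around this one).
    core.send(Command::Publish {
        subject: "events.demo".into(),
        payload: b"hello".to_vec(),
    })
    .unwrap();
    core.send(Command::Shutdown).unwrap();
}
```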
3. Implementation per flavor
This approach is the simplest of all, but it introduces so much code duplication (which would cripple the velocity of the Rust NATS clients) that it's not worth considering unless all of the above run into so many issues that it becomes the last-resort option.
Conclusion
This is just a starting point for the discussion. If you know of better approaches, don't hesitate to share them. The same applies to feedback and corrections to everything said above.
@caspervonb @derekcollison