-
Hi all, I am running the latest published crates and attempting to use the AWS Route53 SDK with Rust. In general, the functionality is excellent. However, I have been facing a very frustrating issue when operating on an unstable connection, such as in a hotel room or coffee shop. Simply put, a request such as ListHostedZones or ListResourceRecordSets will hang for no apparent reason, seemingly indefinitely. I have tried all combinations of stalled stream protection and connect/operation/operation attempt/read timeout settings, to no avail. No error is thrown; it just hangs forever! The debugger shows all tokio threads parked waiting on a futex. Frustratingly, even wrapping it in tokio::time::timeout still hangs!
The behaviour seems to be very nondeterministic. The number of requests I can successfully issue changes every time. Delaying my requests by any number of milliseconds to reduce traffic doesn't seem to help, either. I have set RUST_LOG=trace, and no output is emitted after the hang occurs. I tried analysing with Wireshark and discovered that the failure always occurs after the client sends the server a TCP reset. This leads me to believe that this is caused by flaky or unstable network conditions. Unfortunately this is a complete show-stopper for me, as I expect to be on the road for a while longer. Can you advise on how to configure the AWS SDK for these conditions? If my conclusions are incorrect, what else could be causing this? Thank you and best regards.
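Note on the timeout observation above: tokio::time::timeout is driven by the tokio runtime's own timer, so it can only fire while the runtime is still making progress. One way to check whether the runtime itself is stuck, rather than the request, is a deadline enforced from a plain OS thread. The following is only a minimal sketch under that assumption, using the standard library alone; `run_with_watchdog` is a hypothetical helper name, not part of tokio or the AWS SDK, and it swaps in a thread plus channel in place of an async timeout.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Hypothetical helper: runs `op` on its own OS thread and waits at most
/// `limit` for the result. Returns `None` if the deadline passes first
/// (or if `op` panics, which drops the sender).
/// The worker thread is NOT cancelled on timeout; it is simply abandoned.
fn run_with_watchdog<T, F>(limit: Duration, op: F) -> Option<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // Ignore a send error: the receiver may already have given up.
        let _ = tx.send(op());
    });
    // recv_timeout blocks this thread only, independent of any async runtime.
    rx.recv_timeout(limit).ok()
}

fn main() {
    // An operation that finishes well within the deadline.
    let fast = run_with_watchdog(Duration::from_millis(500), || 42);
    println!("fast: {:?}", fast);

    // An operation that outlives the deadline (stand-in for a hung call).
    let slow = run_with_watchdog(Duration::from_millis(50), || {
        thread::sleep(Duration::from_secs(2));
        "done"
    });
    println!("slow: {:?}", slow);
}
```

If a deadline like this fires while an equivalent tokio::time::timeout does not, that would point at the runtime or the surrounding application rather than at the network or the SDK.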
-
Thank you for reporting this! Before we proceed, I have clarifying questions:
-
Ok, so I think this might be related to this: So I'm trying to test that hypothesis by setting #448 Meanwhile this: ...doesn't expose the pool_max_idle_per_host() or much of anything else in the builder. Nor does this: ... in fact the method HyperClientBuilder::new().build() no longer even exists. So I'm not sure how one is even supposed to use that. And this: ...complains that So now my question becomes, how do I set up a custom http_client and set pool_max_idle_per_host on the latest version of the SDK?
-
Hi all, my apologies for the noise. It turns out it was a complete wild goose chase. After going slightly mad, I found that the issue still persisted even when I replaced calls to the AWS Route53 SDK with a simple tokio::time::sleep().await. I want to thank you all sincerely for your attention on this matter. My hypotheses about any kind of network issue or hyper pool parameters causing this appear to have been completely false. Best wishes for this excellent project going forward.
-
Hello! Reopening this discussion to make it searchable.
Hi all, my apologies for the noise. It turns out it was a complete wild goose chase. After going slightly mad, I found that the issue still persisted even when I replaced calls to the AWS Route53 SDK with a simple tokio::time::sleep().await.
Every future was being awaited correctly, and yet, the hang didn't happen with a synchronous sleep - only when there was an await point in a certain "leaf" function!
I thought this was so ridiculous that, as a Hail Mary, I migrated the app from axum to actix-web. And... the bug disappeared! I am puzzled as to why. However, everything is now working correctly!
I want to thank you all sincerely for your attention on this matter. My hypotheses about any kind…