-
Notifications
You must be signed in to change notification settings - Fork 19
TimeoutException on SetAsync() #12
Comments
Looks like there's a new PR out that should help you narrow down the issue; We moved from Set to SetAsync() just a couple days ago have now experienced service failures ("Partition is in Quorum Loss") and the app's exit code appears to have been swallowed (""System.Hosting' reported Error for property 'CodePackageActivation:Code:EntryPoint:132315329573570487'. Did you find the source of the issue or are you still working? |
I was hoping someone from the SoCreate team would chime in with their opinion as to why this might happen and/or whether they can add support for using the CancellationToken to work around. |
I haven't gotten the patch yet either, I'll mention I recently was able to watch SetAsync() run through. When performing multiple set operations in a row, Async was hundreds/thousands of times slower than non-async set. Perhaps there are speed benefits when you queue up these cache sets? I don't have any more information besides the fact I could sit there and watch how slow it was to set individual cache buckets asynchronously. |
Have you run this against the latest version of the of the library? How many items are you adding to the cache and how large are the items? Do you have a load test example I could run to duplicate the issue? |
We’ve been running on 2.0.1. I will upgrade to the latest and try tonight. We start seeing this when running around 100 RPS or higher against a 5 machine cluster (VMs are 4 core, 14GB RAM). We have set HasPersistedState="false". We do a single Read & Write on each call. Typical Read/Write size is ~ 250 bytes.
The load tests I have are specific to our service, so I’m not sure how much value that would be for you. They are just built on VS 2019 Load Test framework so pretty easy to get going with. Happy to help with seeing how we could debug against our service if that’s what you’re thinking.
From: Jami Lurock <[email protected]>
Sent: Thursday, July 2, 2020 12:08 PM
To: SoCreate/service-fabric-distributed-cache <[email protected]>
Cc: Ryan Waller <[email protected]>; Author <[email protected]>
Subject: Re: [SoCreate/service-fabric-distributed-cache] TimeoutException on SetAsync() (#12)
Have you run this against the latest version of the of the library? How many items are you adding to the cache and how large are the items? Do you have a load test example I could run to duplicate the issue?
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub<https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2FSoCreate%2Fservice-fabric-distributed-cache%2Fissues%2F12%23issuecomment-653175467&data=02%7C01%7Crwaller%40microsoft.com%7Cb6bfe23e3549453366f608d81ebb4143%7C72f988bf86f141af91ab2d7cd011db47%7C0%7C0%7C637293136895396822&sdata=BVuzsnaVqMBDY0kU8Mx7gSw%2FdBa4%2BMfQfyDeuB1AB%2B0%3D&reserved=0>, or unsubscribe<https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FAH2U7VRA4L5FV2G6LPAPBATRZTLJPANCNFSM4N4SPSDQ&data=02%7C01%7Crwaller%40microsoft.com%7Cb6bfe23e3549453366f608d81ebb4143%7C72f988bf86f141af91ab2d7cd011db47%7C0%7C0%7C637293136895406818&sdata=OzYO2Tc%2B97wh3KNpgUUve8sLWDcps11uNZ5h%2FzeGDeg%3D&reserved=0>.
|
Please try version 2.1.2 that I have just pushed up. I found an threading issue causing some random exceptions. Let me know if this fixes your issues. |
Upgraded to 2.1.2 and ran a few cycles of tests over the weekend and we are still seeing some Timeout errors, but it doesn’t seem it’s as many as before (sorry, I don’t have all the old data now ). I also notice a lot of latency in the cache calls. These were probably there before but I had just been focusing on the ones that ran longer than 5 minutes and gave an error.
Here's a couple graphs of the latencies we saw on one of the runs. This run was a constant 1000 user load for 7 hours, which resulted in about 30K cache Reads and 30K Writes. Vertical axis is Count, Horizontal is Call Duration in MS, bucketed into 1 second intervals. This just shows the calls that took < 50 seconds. We only had 14 calls in this run that took > 300 seconds (5 minutes).
Cache Read
Note that there is a large amount of calls at 0 (< 1) seconds. This could be because in our ‘sessions’, for the first call in the session, the cache is always empty. Might suggest that if there is no data to transfer, the call is really fast. In fact, 80% of those 14K calls are < 1 ms, and another 14% < 10 ms, the rest < 100 ms, meaning that the delays we see after could be related to data transfer. This is just a guess. We’re not logging the data size so I can’t definitively correlate that.
[cid:[email protected]]
Cache Write
[cid:[email protected]]
From: Ryan Waller
Sent: Thursday, July 2, 2020 7:11 PM
To: SoCreate/service-fabric-distributed-cache <[email protected]>; SoCreate/service-fabric-distributed-cache <[email protected]>
Cc: Author <[email protected]>
Subject: RE: [SoCreate/service-fabric-distributed-cache] TimeoutException on SetAsync() (#12)
Will do.
From: Jami Lurock <[email protected]<mailto:[email protected]>>
Sent: Thursday, July 2, 2020 7:10 PM
To: SoCreate/service-fabric-distributed-cache <[email protected]<mailto:[email protected]>>
Cc: Ryan Waller <[email protected]<mailto:[email protected]>>; Author <[email protected]<mailto:[email protected]>>
Subject: Re: [SoCreate/service-fabric-distributed-cache] TimeoutException on SetAsync() (#12)
Please try version 2.1.2 that I have just pushed up. I found an threading issue causing some random exceptions. Let me know if this fixes your issues.
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub<https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2FSoCreate%2Fservice-fabric-distributed-cache%2Fissues%2F12%23issuecomment-653296087&data=02%7C01%7Crwaller%40microsoft.com%7Ccc6bd6a5bac34cea22ae08d81ef6305a%7C72f988bf86f141af91ab2d7cd011db47%7C0%7C0%7C637293390007868298&sdata=a4kAnvFbFhRvunyRG4nUiZh%2FfR%2BVgAM9ttsdR593oFg%3D&reserved=0>, or unsubscribe<https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FAH2U7VQR6JFVPNLNKRLFMMTRZU4XPANCNFSM4N4SPSDQ&data=02%7C01%7Crwaller%40microsoft.com%7Ccc6bd6a5bac34cea22ae08d81ef6305a%7C72f988bf86f141af91ab2d7cd011db47%7C0%7C0%7C637293390007868298&sdata=qEm%2B2Nk4BimKGz%2FZpcKJ1rAJBcPJrPS8n5HLiVLi8Nw%3D&reserved=0>.
|
Hi @lurock we are using SetAsync() heavly it's too slow using vesion 2.1.3 , we are cahcing small data like {"numericId":"300000000000021","expirationTime":"2020-08-24T08:11:31.4126188Z"} do u have any idea? |
While running load tests against our service, we're noticing that call to SetAsync() will sometimes fail with a TimeoutException, with what appears to be a 5 minute timeout.
Tried setting the CancellationToken to limit the duration of the call, but it looks like that is not used in this method.
Our load tests did not seem particularly intensive. They only pushed the machine to around 20% CPU.
The text was updated successfully, but these errors were encountered: