Crashing on some user devices when try to access searchPlaceIndexForSuggestions #1425

Closed
1 task
drayan85 opened this issue Sep 30, 2024 · 8 comments
Labels
bug This issue is a bug. p2 This is a standard priority issue

Comments

@drayan85
drayan85 commented Sep 30, 2024

Describe the bug

We debounce user search input by 500 ms. If the user types more text while searching for an address, we cancel the previous job and start a new search-place-index request to get the address suggestions.
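The debounce-and-cancel pattern described above can be sketched roughly as follows. This is a minimal illustration, not the reporter's actual code: `DebouncedSearch`, `onQueryChanged`, and the stand-in `search` lambda are hypothetical names, and the `withContext(Dispatchers.IO)` hop is an assumption added to keep the request off the caller's thread.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper (not part of the AWS SDK): debounces queries and
// cancels whatever search was pending or in flight for the previous query.
class DebouncedSearch<T>(
    private val scope: CoroutineScope,
    private val delayMs: Long = 500,
    private val search: suspend (String) -> T,
    private val onResult: (T) -> Unit,
) {
    // Assumes onQueryChanged is always called from the same thread (e.g. the UI thread).
    private var job: Job? = null

    fun onQueryChanged(query: String) {
        job?.cancel() // drop the previous debounce timer or request
        job = scope.launch {
            delay(delayMs) // wait for the user to stop typing
            // Assumption: run the actual request on the IO dispatcher.
            val result = withContext(Dispatchers.IO) { search(query) }
            onResult(result)
        }
    }
}

fun main() = runBlocking {
    val results = mutableListOf<String>()
    val debounced = DebouncedSearch<String>(
        scope = this,
        delayMs = 500,
        search = { q -> "suggestions for $q" }, // stand-in for the real API call
        onResult = { results += it },
    )
    debounced.onQueryChanged("syd")
    delay(100)
    debounced.onQueryChanged("sydney") // cancels the "syd" search
    delay(1000)
    println(results) // prints [suggestions for sydney]
}
```

Only the last query produces a result, because the first job is cancelled while it is still waiting out the debounce delay.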

Regression Issue

  • Select this option if this issue appears to be a regression.

Expected behavior

The application should not crash: we expect the network operation (part of the AWS library's internal implementation) to run off the main thread when calling the AWS Location APIs.

Current behavior

For some users, the app crashes when they search the place index:

at aws.smithy.kotlin.runtime.http.engine.CoroutineUtilsKt$attachToOuterJob$cleanupHandler$1.invoke(CoroutineUtils.kt:51) at aws.smithy.kotlin.runtime.http.engine.CoroutineUtilsKt$attachToOuterJob$cleanupHandler$1.invoke(CoroutineUtils.kt:49)

          Caused by kotlinx.coroutines.CompletionHandlerException: Exception in completion handler v0@eae879c[job@c4975a5] for y0{Cancelled}@c4975a5
       at kotlinx.coroutines.JobSupport.notifyCompletion(JobSupport.kt:1506)
       at kotlinx.coroutines.JobSupport.completeStateFinalization(JobSupport.kt:322)
       at kotlinx.coroutines.JobSupport.finalizeFinishingState(JobSupport.kt:239)
       at kotlinx.coroutines.JobSupport.tryMakeCompletingSlowPath(JobSupport.kt:917)
       at kotlinx.coroutines.JobSupport.tryMakeCompleting(JobSupport.kt:874)
       at kotlinx.coroutines.JobSupport.cancelMakeCompleting(JobSupport.kt:707)
       at kotlinx.coroutines.JobSupport.cancelImpl$kotlinx_coroutines_core(JobSupport.kt:678)
       at kotlinx.coroutines.JobSupport.cancelInternal(JobSupport.kt:643)
       at kotlinx.coroutines.JobSupport.cancel(JobSupport.kt:628)
       at aws.smithy.kotlin.runtime.http.engine.CoroutineUtilsKt$attachToOuterJob$cleanupHandler$1.invoke(CoroutineUtils.kt:51)
       at aws.smithy.kotlin.runtime.http.engine.CoroutineUtilsKt$attachToOuterJob$cleanupHandler$1.invoke(CoroutineUtils.kt:49)
       at kotlinx.coroutines.InternalCompletionHandler$UserSupplied.invoke(CompletionHandler.common.kt:67)
       at kotlinx.coroutines.InvokeOnCancelling.invoke(JobSupport.kt:1438)
       at kotlinx.coroutines.JobSupport.notifyCancelling(JobSupport.kt:1483)
       at kotlinx.coroutines.JobSupport.tryMakeCancelling(JobSupport.kt:806)
       at kotlinx.coroutines.JobSupport.makeCancelling(JobSupport.kt:766)
       at kotlinx.coroutines.JobSupport.cancelImpl$kotlinx_coroutines_core(JobSupport.kt:682)
       at kotlinx.coroutines.JobSupport.parentCancelled(JobSupport.kt:648)
       at kotlinx.coroutines.ChildHandleNode.invoke(JobSupport.kt:1446)
       at kotlinx.coroutines.JobSupport.notifyCancelling(JobSupport.kt:1483)
       at kotlinx.coroutines.JobSupport.tryMakeCancelling(JobSupport.kt:806)
       at kotlinx.coroutines.JobSupport.makeCancelling(JobSupport.kt:766)
       at kotlinx.coroutines.JobSupport.cancelImpl$kotlinx_coroutines_core(JobSupport.kt:682)
       at kotlinx.coroutines.JobSupport.parentCancelled(JobSupport.kt:648)
       at kotlinx.coroutines.ChildHandleNode.invoke(JobSupport.kt:1446)
       at kotlinx.coroutines.JobSupport.notifyCancelling(JobSupport.kt:1483)
       at kotlinx.coroutines.JobSupport.tryMakeCancelling(JobSupport.kt:806)
       at kotlinx.coroutines.JobSupport.makeCancelling(JobSupport.kt:766)
       at kotlinx.coroutines.JobSupport.cancelImpl$kotlinx_coroutines_core(JobSupport.kt:682)
       at kotlinx.coroutines.JobSupport.parentCancelled(JobSupport.kt:648)
       at kotlinx.coroutines.ChildHandleNode.invoke(JobSupport.kt:1446)
       at kotlinx.coroutines.JobSupport.notifyCancelling(JobSupport.kt:1483)
       at kotlinx.coroutines.JobSupport.tryMakeCancelling(JobSupport.kt:806)
       at kotlinx.coroutines.JobSupport.makeCancelling(JobSupport.kt:766)
       at kotlinx.coroutines.JobSupport.cancelImpl$kotlinx_coroutines_core(JobSupport.kt:682)
       at kotlinx.coroutines.JobSupport.parentCancelled(JobSupport.kt:648)
       at kotlinx.coroutines.ChildHandleNode.invoke(JobSupport.kt:1446)
       at kotlinx.coroutines.JobSupport.notifyCancelling(JobSupport.kt:1483)
       at kotlinx.coroutines.JobSupport.tryMakeCancelling(JobSupport.kt:806)
       at kotlinx.coroutines.JobSupport.makeCancelling(JobSupport.kt:766)
       at kotlinx.coroutines.JobSupport.cancelImpl$kotlinx_coroutines_core(JobSupport.kt:682)
       at kotlinx.coroutines.JobSupport.cancelInternal(JobSupport.kt:643)
       at kotlinx.coroutines.JobSupport.cancel(JobSupport.kt:628)
       at kotlinx.coroutines.Job$DefaultImpls.cancel$default(Job.kt:195)

Steps to Reproduce

We have not been able to reproduce this on our end, but we can see that many of our end users are hitting it and that it crashes the app.

val locationCredentialsProvider: LocationCredentialsProvider =
    AuthHelper(context).authenticateWithCognitoIdentityPool("xxxxxxxxxx")
val locationClient: LocationClient = locationCredentialsProvider.getLocationClient()
val request = SearchPlaceIndexForSuggestionsRequest {
    text = "sydney"
    indexName = "xxxxx"
}
val response = locationClient.searchPlaceIndexForSuggestions(request)

        Caused by android.os.NetworkOnMainThreadException:
       at android.os.StrictMode$AndroidBlockGuardPolicy.onNetwork(StrictMode.java:1605)
       at com.android.org.conscrypt.Platform.blockGuardOnNetwork(Platform.java:426)
       at com.android.org.conscrypt.ConscryptEngineSocket$SSLOutputStream.writeInternal(ConscryptEngineSocket.java:657)
       at com.android.org.conscrypt.ConscryptEngineSocket$SSLOutputStream.write(ConscryptEngineSocket.java:652)
       at okio.OutputStreamSink.write(JvmOkio.kt:56)
       at okio.AsyncTimeout$sink$1.write(AsyncTimeout.kt:127)
       at okio.RealBufferedSink.flush(RealBufferedSink.kt:268)
       at okhttp3.internal.http2.Http2Writer.rstStream(Http2Writer.kt:144)
       at okhttp3.internal.http2.Http2Connection.writeSynReset$okhttp(Http2Connection.kt:357)
       at okhttp3.internal.http2.Http2Stream.close(Http2Stream.kt:258)
       at okhttp3.internal.http2.Http2Stream.cancelStreamIfNecessary$okhttp(Http2Stream.kt:557)
       at okhttp3.internal.http2.Http2Stream$FramingSource.close(Http2Stream.kt:539)
       at okio.ForwardingSource.close(ForwardingSource.kt:32)
       at okhttp3.internal.connection.Exchange$ResponseBodySource.close(Exchange.kt:324)
       at okio.RealBufferedSource.close(RealBufferedSource.kt:484)
       at aws.smithy.kotlin.runtime.http.engine.okhttp.InstrumentedSource.close(MetricsInterceptor.kt:94)
       at okio.RealBufferedSource.close(RealBufferedSource.kt:484)
       at okhttp3.internal._UtilCommonKt.closeQuietly(-UtilCommon.kt:302)
       at okhttp3.internal._ResponseBodyCommonKt.commonClose(-ResponseBodyCommon.kt:50)
       at okhttp3.ResponseBody.close(ResponseBody.kt:181)
       at aws.smithy.kotlin.runtime.http.engine.okhttp.OkHttpEngine$roundTrip$2$1.invoke(OkHttpEngine.kt:67)
       at aws.smithy.kotlin.runtime.http.engine.okhttp.OkHttpEngine$roundTrip$2$1.invoke(OkHttpEngine.kt:62)
       at kotlinx.coroutines.InternalCompletionHandler$UserSupplied.invoke(CompletionHandler.common.kt:67)
       at kotlinx.coroutines.InvokeOnCompletion.invoke(JobSupport.kt:1392)
       at kotlinx.coroutines.JobSupport.notifyCompletion(JobSupport.kt:1502)
       at kotlinx.coroutines.JobSupport.completeStateFinalization(JobSupport.kt:322)
       at kotlinx.coroutines.JobSupport.finalizeFinishingState(JobSupport.kt:239)
       at kotlinx.coroutines.JobSupport.tryMakeCompletingSlowPath(JobSupport.kt:917)
       at kotlinx.coroutines.JobSupport.tryMakeCompleting(JobSupport.kt:874)
       at kotlinx.coroutines.JobSupport.cancelMakeCompleting(JobSupport.kt:707)
       at kotlinx.coroutines.JobSupport.cancelImpl$kotlinx_coroutines_core(JobSupport.kt:678)
       at kotlinx.coroutines.JobSupport.cancelInternal(JobSupport.kt:643)
       at kotlinx.coroutines.JobSupport.cancel(JobSupport.kt:628)
       at aws.smithy.kotlin.runtime.http.engine.CoroutineUtilsKt$attachToOuterJob$cleanupHandler$1.invoke(CoroutineUtils.kt:51)
       at aws.smithy.kotlin.runtime.http.engine.CoroutineUtilsKt$attachToOuterJob$cleanupHandler$1.invoke(CoroutineUtils.kt:49)
       at kotlinx.coroutines.InternalCompletionHandler$UserSupplied.invoke(CompletionHandler.common.kt:67)
       at kotlinx.coroutines.InvokeOnCancelling.invoke(JobSupport.kt:1438)
       at kotlinx.coroutines.JobSupport.notifyCancelling(JobSupport.kt:1483)
       at kotlinx.coroutines.JobSupport.tryMakeCancelling(JobSupport.kt:806)
       at kotlinx.coroutines.JobSupport.makeCancelling(JobSupport.kt:766)
       at kotlinx.coroutines.JobSupport.cancelImpl$kotlinx_coroutines_core(JobSupport.kt:682)
       at kotlinx.coroutines.JobSupport.parentCancelled(JobSupport.kt:648)
       at kotlinx.coroutines.ChildHandleNode.invoke(JobSupport.kt:1446)
       at kotlinx.coroutines.JobSupport.notifyCancelling(JobSupport.kt:1483)
       at kotlinx.coroutines.JobSupport.tryMakeCancelling(JobSupport.kt:806)
       at kotlinx.coroutines.JobSupport.makeCancelling(JobSupport.kt:766)
       at kotlinx.coroutines.JobSupport.cancelImpl$kotlinx_coroutines_core(JobSupport.kt:682)
       at kotlinx.coroutines.JobSupport.parentCancelled(JobSupport.kt:648)
       at kotlinx.coroutines.ChildHandleNode.invoke(JobSupport.kt:1446)
       at kotlinx.coroutines.JobSupport.notifyCancelling(JobSupport.kt:1483)
       at kotlinx.coroutines.JobSupport.tryMakeCancelling(JobSupport.kt:806)
       at kotlinx.coroutines.JobSupport.makeCancelling(JobSupport.kt:766)
       at kotlinx.coroutines.JobSupport.cancelImpl$kotlinx_coroutines_core(JobSupport.kt:682)
       at kotlinx.coroutines.JobSupport.parentCancelled(JobSupport.kt:648)
       at kotlinx.coroutines.ChildHandleNode.invoke(JobSupport.kt:1446)
       at kotlinx.coroutines.JobSupport.notifyCancelling(JobSupport.kt:1483)
       at kotlinx.coroutines.JobSupport.tryMakeCancelling(JobSupport.kt:806)
       at kotlinx.coroutines.JobSupport.makeCancelling(JobSupport.kt:766)
       at kotlinx.coroutines.JobSupport.cancelImpl$kotlinx_coroutines_core(JobSupport.kt:682)
       at kotlinx.coroutines.JobSupport.parentCancelled(JobSupport.kt:648)
       at kotlinx.coroutines.ChildHandleNode.invoke(JobSupport.kt:1446)
       at kotlinx.coroutines.JobSupport.notifyCancelling(JobSupport.kt:1483)
       at kotlinx.coroutines.JobSupport.tryMakeCancelling(JobSupport.kt:806)
       at kotlinx.coroutines.JobSupport.makeCancelling(JobSupport.kt:766)
       at kotlinx.coroutines.JobSupport.cancelImpl$kotlinx_coroutines_core(JobSupport.kt:682)
       at kotlinx.coroutines.JobSupport.cancelInternal(JobSupport.kt:643)
       at kotlinx.coroutines.JobSupport.cancel(JobSupport.kt:628)
       at kotlinx.coroutines.Job$DefaultImpls.cancel$default(Job.kt:195)

Possible Solution

We have not been able to reproduce this on our end, even with the same build that the affected users have.

Context

This is crashing many user devices in production and needs to be fixed as soon as possible.

AWS SDK for Kotlin version

software.amazon.location:auth:0.2.5, aws.sdk.kotlin:location:1.2.38 (can't update to the latest version due to #1411)

Platform (JVM/JS/Native)

JVM (Kotlin) JDK 17

Operating system and version

Android 13

@drayan85 drayan85 added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Sep 30, 2024
@ianbotsf ianbotsf added p2 This is a standard priority issue and removed needs-triage This issue or PR still needs to be triaged. labels Sep 30, 2024
@lauzadis
Member

lauzadis commented Oct 2, 2024

Hello, you've shared two different exceptions (kotlinx.coroutines.CompletionHandlerException, android.os.NetworkOnMainThreadException), can you clarify which one you're looking to address?

For android.os.NetworkOnMainThreadException we've seen a similar issue before which was related to running aws-sdk-kotlin operations on the main thread. Can you please share more information about how you're using the SDK? Is it running on the main thread of the application?

@drayan85
Author

drayan85 commented Oct 3, 2024

Both exceptions are in the same crash stack trace.

In the LocationRepository implementation:

suspend fun getAddressSuggestion(query: String): List<SearchForSuggestionsResult> {
    val request = SearchPlaceIndexForSuggestionsRequest {
        text = query
    }
    val locationClient = AuthHelper(context).authenticateWithCognitoIdentityPool("poolId").getLocationClient()
    return locationClient.searchPlaceIndexForSuggestions(request).results
}

In our ViewModel

fun onAdressTextChange(query: String) {
    viewModelScope.launch {
        val suggestions = locationRepository.getAddressSuggestion(query)
    }
}

This means we are not moving the AWS Location API call off the Main thread. However, we have not been able to reproduce this crash on our end, and it only happens for some of our users.

Since this is executing on the Main thread, why is it not crashing on our devices?

@lauzadis
Member

lauzadis commented Oct 3, 2024

I can't say why it's crashing on user devices and not on your test devices. Using the SDK on the main UI thread can result in unpredictable behavior. I'd recommend wrapping your getAddressSuggestion function in a withContext(Dispatchers.IO) { ... } or researching some other ways to get the network operations off the main thread.
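As a rough sketch of the suggested wrapping, the repository function can switch to `Dispatchers.IO` before calling into the SDK. To keep the example runnable without the AWS dependencies, `fetchSuggestions` below is a hypothetical stand-in for `locationClient.searchPlaceIndexForSuggestions(request)` and simply reports which thread it ran on; the return type is simplified to `List<String>` for the same reason.

```kotlin
import kotlinx.coroutines.*

// Hypothetical stand-in for the SDK call: returns the query together with
// the name of the thread that executed it.
suspend fun fetchSuggestions(query: String): List<String> =
    listOf("$query (ran on ${Thread.currentThread().name})")

// Main-safe wrapper: callers may invoke this from any dispatcher, and the
// body always runs on the IO dispatcher.
suspend fun getAddressSuggestion(query: String): List<String> =
    withContext(Dispatchers.IO) {
        fetchSuggestions(query)
    }

fun main() = runBlocking {
    println("caller thread: ${Thread.currentThread().name}")
    println(getAddressSuggestion("sydney"))
}
```

With this shape the ViewModel can keep calling `getAddressSuggestion` from `viewModelScope.launch` unchanged, because the dispatcher switch lives inside the repository function.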

@drayan85
Author

drayan85 commented Oct 3, 2024

We are shipping a hotfix that sets the dispatcher.

I was wondering, since locationClient.searchPlaceIndexForSuggestions(request) is a suspend function, shouldn't setting the dispatcher be handled internally by the AWS library implementation?

@lauzadis
Member

lauzadis commented Oct 3, 2024

suspend just means the function's execution can be paused and resumed. You still need to specify which coroutine context to run the function in. By default it runs in the context from which it was called (the UI thread in this case).
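A small sketch can illustrate this point. It uses only `kotlinx.coroutines` (no SDK calls), and `currentThreadName` is a hypothetical helper: a suspend function stays on the caller's dispatcher unless something such as `withContext` explicitly switches it.

```kotlin
import kotlinx.coroutines.*

// A suspend function with a real suspension point. It does not choose a
// dispatcher itself, so it resumes on whatever dispatcher the caller uses.
suspend fun currentThreadName(): String {
    yield()
    return Thread.currentThread().name
}

fun main() = runBlocking {
    val caller = Thread.currentThread().name
    // Inherits the caller's context: same thread as runBlocking.
    val inherited = currentThreadName()
    // Explicit switch: runs on an IO dispatcher worker thread instead.
    val switched = withContext(Dispatchers.IO) { currentThreadName() }
    println("caller=$caller inherited=$inherited switched=$switched")
}
```

Here `runBlocking` plays the role of the UI thread: `inherited` matches the caller's thread name, while `switched` names a background worker.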

@drayan85
Author

drayan85 commented Oct 4, 2024

You are correct in terms of the behavior of the suspend function.

What I am referring to, as an SDK developer, is that those implementing the network calls within the SDK may need to explicitly manage the coroutine context. This would help prevent crashes for the developers or applications consuming the library. Ensuring the appropriate context is set when making asynchronous network calls can mitigate issues related to threading and context switching, which are especially important in environments where concurrency needs to be managed effectively.

@lauzadis
Member

Users are responsible for choosing the appropriate coroutine dispatcher to launch SDK requests on. I'm closing this issue for now; please feel free to open a new issue if you have SDK-specific issues to report. Thanks.


⚠️COMMENT VISIBILITY WARNING⚠️

Comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.
