Device Client Reconnects After Extended 'Disconnected_Retrying' State Without Entering 'Retry_Expired' as Documented #3468

Open
Padmanaabhah opened this issue Jul 30, 2024 · 1 comment
Labels
area-documentation Issues related to fixing or improving documentation for the client library.

Comments

@Padmanaabhah

Description:

Issue Summary:
With the latest device client version, we have observed that the device enters the 'Disconnected_Retrying' status with the reason 'Communication_Error' and still reconnects once the network is restored, even when the outage lasted 2 days. According to the documentation, it should instead enter 'Retry_Expired' after exhausting the retry attempts and the timeout period.

Detailed Description:

  • Environment:

    • Lab: Device transitions back to "Connected" even after days of being in the 'Disconnected_Retrying' state with 'Communication_Error'.
  • Configuration:

    • The 'Operationtimeout_milliseconds' parameter is not defined.

Expected Behavior:
The device should transition from 'Disconnected_Retrying' to 'Retry_Expired' once the 20-minute retry period is exhausted, as per the documentation.

Steps to Reproduce:

  1. Deploy the latest device client version in a lab environment.
  2. Induce a communication error to trigger the 'Disconnected_Retrying' state.
  3. Monitor the device status for state transitions.

Actual Behavior:

  • The device reconnects even after being in the 'Disconnected_Retrying' state for an extended period (days), instead of transitioning to 'Retry_Expired' as expected.

We need help understanding the following:

  • Why does the device client reconnect after days of disconnection when the 'Operationtimeout_milliseconds' parameter is not set, given that the documentation states it should enter 'Retry_Expired' after 20 minutes?

Any guidance or insights into this behavior would be greatly appreciated.

@Padmanaabhah Padmanaabhah added the bug Something isn't working. label Jul 30, 2024
@timtay-microsoft
Member

timtay-microsoft commented Aug 1, 2024

It does appear the documentation around this behavior is incorrect/outdated.

The expectation here should be that, by default, the client will attempt to reconnect forever (with attempts slowing down over time). The operation timeout has no bearing on this and is only relevant to individual attempts to send a message, get the twin, etc.
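To illustrate what "reconnect forever, with attempts slowing down over time" can look like, here is a minimal, language-neutral sketch of a capped exponential backoff. It is not the SDK's implementation: the function name, the 1-second base, and the 60-second cap are assumptions chosen for the example, and the SDK's actual delays are not stated in this thread.

```python
def backoff_delay(attempt: int, base_s: float = 1.0, cap_s: float = 60.0) -> float:
    """Delay in seconds before retry number `attempt` (0-based).

    Hypothetical values: the delay doubles on each attempt and is capped,
    so retries slow down but never stop on their own.
    """
    exponent = min(attempt, 32)  # clamp so very large attempt counts cannot overflow
    return min(cap_s, base_s * (2 ** exponent))


if __name__ == "__main__":
    # There is no "give up" branch: every attempt number maps to a finite delay.
    for attempt in range(8):
        print(attempt, backoff_delay(attempt))
```

Note that nothing in this schedule depends on an operation timeout. That matches the point above: the timeout bounds an individual send or twin request, not the reconnect loop.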

If you want the client to eventually give up on reconnecting, then you will need to provide a custom retry policy that dictates at what point the SDK should stop retrying.
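As a rough sketch of the idea, a custom policy that gives up after a fixed window could look like the following. Everything here is hypothetical (the class names, the `decide` method, the 20-minute default taken from the issue description, the 5-second delay); the SDK's real retry policy interface should be checked before writing an actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetryDecision:
    retry: bool      # True if another reconnect attempt should be made
    delay_s: float   # how long to wait before that attempt


class GiveUpAfterPolicy:
    """Hypothetical policy: keep retrying until `max_elapsed_s` has passed."""

    def __init__(self, max_elapsed_s: float = 20 * 60, delay_s: float = 5.0):
        self.max_elapsed_s = max_elapsed_s
        self.delay_s = delay_s

    def decide(self, elapsed_s: float) -> RetryDecision:
        # Once the window is exhausted, tell the caller to stop retrying.
        if elapsed_s >= self.max_elapsed_s:
            return RetryDecision(retry=False, delay_s=0.0)
        # Otherwise retry, without sleeping past the end of the window.
        remaining = self.max_elapsed_s - elapsed_s
        return RetryDecision(retry=True, delay_s=min(self.delay_s, remaining))
```

With a policy of this shape, the "stop retrying" decision is what would let the client leave the retrying state instead of looping indefinitely.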

The only caveat here is that this SDK will implicitly try to open the connection for you when you send a message, get the twin, etc. So even with a retry policy that gives up, you may still see the behavior you outlined above if you send a message while the device client is closed.
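A toy model of that caveat, assuming a hypothetical `ToyClient` class that is not part of the SDK: calling `send` on a closed client opens it again, so an application that keeps sending will look like it "reconnected" regardless of what the retry policy decided.

```python
class ToyClient:
    """Hypothetical stand-in for a device client; not the SDK's API."""

    def __init__(self) -> None:
        self.is_open = False
        self.sent: list[str] = []

    def open(self) -> None:
        self.is_open = True

    def close(self) -> None:
        self.is_open = False

    def send(self, message: str) -> None:
        # Implicit open: a send on a closed client reopens the connection.
        if not self.is_open:
            self.open()
        self.sent.append(message)
```

If the application should stay disconnected after retries are exhausted, it would need to stop issuing operations as well.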

@timtay-microsoft timtay-microsoft added area-documentation Issues related to fixing or improving documentation for the client library. and removed bug Something isn't working. labels Aug 1, 2024