Orchestrator stuck in running when saving large custom status values #2918
Comments
Hi @cliedeman. Are you using Application Insights? If so, can you try enabling the Durable Task Framework logging (warnings and errors as shown in the sample should be fine) and then querying the …
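For reference, one way to surface Durable Task Framework warnings and errors in Application Insights is a host.json logging section along the following lines (the exact categories and levels shown here are an assumption based on the Durable Functions diagnostics guidance, not taken from the sample mentioned above):

```json
{
  "logging": {
    "logLevel": {
      "DurableTask.AzureStorage": "Warning",
      "DurableTask.Core": "Warning"
    }
  }
}
```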
@cgillum I do. When I run another batch in the coming days, I will try to get some extra logging output.
@cgillum I found this error in the logs
I confirmed it. I suspect that it is my customStatus (which reports on the job progress) that is exceeding the limit.

Ciaran
@cliedeman thanks for this info! I checked the code, and I think you're right that this could be caused by a large custom status value. We have checks to ensure that it doesn't exceed 16 KB, but it looks like there aren't any checks to ensure that the custom status value combined with other semi-large values (like inputs or outputs) doesn't exceed the 64 KB limit imposed by Azure Storage. I'm labeling this as a bug that needs to be fixed. In the meantime, I recommend reducing the size of your custom status values to avoid this issue in the future. For the current stuck instance, you can terminate it to get it out of the "Running" status.
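For the stuck instance, a minimal sketch of terminating it so it leaves the Running state, assuming the Microsoft.DurableTask isolated-worker SDK and an HTTP-triggered admin function (the function name and route below are made up for illustration):

```csharp
using System.Net;
using System.Threading.Tasks;
using Microsoft.Azure.Functions.Worker;
using Microsoft.Azure.Functions.Worker.Http;
using Microsoft.DurableTask.Client;

public static class TerminateStuckInstance
{
    [Function(nameof(TerminateStuckInstance))]
    public static async Task<HttpResponseData> Run(
        [HttpTrigger(AuthorizationLevel.Function, "post", Route = "admin/terminate/{instanceId}")] HttpRequestData req,
        [DurableClient] DurableTaskClient client,
        string instanceId)
    {
        // Forces the orchestration out of the Running status; its history is preserved.
        await client.TerminateInstanceAsync(instanceId);
        return req.CreateResponse(HttpStatusCode.OK);
    }
}
```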
I was wrong: while we do have checks for the custom status size for the .NET in-proc SDK, we don't have any such checks in the .NET Isolated SDK, which otherwise would have caught this kind of issue. We may need to introduce a breaking change to ensure that the serialized custom status payload size matches the in-proc limit: 16 KB.
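Until such a check exists in the SDK, a minimal app-side guard along these lines can mirror the in-proc 16 KB limit before the status is persisted (the extension-method name and the use of System.Text.Json are assumptions for illustration, not part of the SDK):

```csharp
using System;
using System.Text;
using System.Text.Json;
using Microsoft.DurableTask;

public static class CustomStatusExtensions
{
    // Mirrors the 16 KB custom status limit enforced by the in-proc SDK.
    private const int MaxCustomStatusBytes = 16 * 1024;

    public static void SetCustomStatusChecked(this TaskOrchestrationContext context, object status)
    {
        string json = JsonSerializer.Serialize(status);
        int size = Encoding.UTF8.GetByteCount(json);
        if (size > MaxCustomStatusBytes)
        {
            // Fail fast here rather than letting the combined payload exceed the
            // 64 KB Azure Storage limit mentioned above and strand the instance in Running.
            throw new ArgumentException(
                $"Serialized custom status is {size} bytes, which exceeds the 16 KB limit.");
        }

        context.SetCustomStatus(status);
    }
}
```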
Description
I have several instances of the same orchestrator that more often than not get stuck in Running.
The orchestrator calls 5 sub-orchestrators and takes about 2 hours in total. The inputs are not large, so nothing suspicious there.
If I check the history table, I can see that an OrchestratorComplete event is fired with a null instanceId, indicating it should be in the Completed state, but it is not.
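For illustration, the orchestration shape described above might look roughly like the following sketch (all function and type names here are hypothetical, using the .NET isolated Durable Functions SDK):

```csharp
using System.Threading.Tasks;
using Microsoft.Azure.Functions.Worker;
using Microsoft.DurableTask;

public static class BatchOrchestrations
{
    [Function(nameof(RunBatches))]
    public static async Task RunBatches(
        [OrchestrationTrigger] TaskOrchestrationContext context)
    {
        const int totalBatches = 5;
        for (int i = 0; i < totalBatches; i++)
        {
            // Each sub-orchestration handles one long-running batch.
            await context.CallSubOrchestratorAsync("ProcessBatch", i);

            // Progress is reported through the custom status; if this payload grows
            // large, it contributes to the storage-size problem discussed above.
            context.SetCustomStatus(new { CompletedBatches = i + 1, TotalBatches = totalBatches });
        }
    }
}
```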
Expected behavior
Orchestrator leaves the Running state and becomes Completed
Actual behavior
Orchestrator remains in the Running state
Relevant source code snippets
// insert code snippet here
Known workarounds
App Details
.NET 8
Isolated Worker
Screenshots
If deployed to Azure