Queue trigger function stopped without (visible) reason and long delay before next try #10461

Open
acorbos opened this issue Sep 10, 2024 · 4 comments

acorbos commented Sep 10, 2024

Hello,

We have a queue-triggered function on the Consumption plan that sometimes stops while running.

We want to process one message per instance. Here are the queue settings in host.json:
"maxPollingInterval": "00:00:02",
"visibilityTimeout": "00:00:05",
"batchSize": 1,
"maxDequeueCount": 2,
"newBatchThreshold": 0
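
As a quick check of the "one message per instance" goal: the Azure Functions host.json documentation describes the per-function concurrency limit on a single instance as batchSize plus newBatchThreshold. The sketch below only applies that sum to the values quoted above; the helper name `max_concurrent_messages` is made up for illustration and is not part of any Azure SDK.

```python
import json

# Queue settings copied from the post, wrapped in braces so they parse as JSON.
queue_settings = json.loads("""
{
  "maxPollingInterval": "00:00:02",
  "visibilityTimeout": "00:00:05",
  "batchSize": 1,
  "maxDequeueCount": 2,
  "newBatchThreshold": 0
}
""")


def max_concurrent_messages(settings: dict) -> int:
    """Hypothetical helper: upper bound on messages one function
    processes in parallel on a single instance."""
    return settings["batchSize"] + settings["newBatchThreshold"]


print(max_concurrent_messages(queue_settings))  # prints 1
```

With these values the limit is 1, which matches the stated intent. It says nothing about how many instances the Consumption plan scales out to.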
A message was written to the queue at 3:54:56 PM.

We found the following logs:

at 9/9/2024, 3:58:30.6648752 PM [HostMonitor] Host CPU threshold exceeded (95 >= 80)

at 9/9/2024, 4:05:09.2535375 PM Trigger Details: MessageId: 0dfa5a35-f122-4db3-9300-6007ca64194e, DequeueCount: 2, InsertedOn: 2024-09-09T13:54:56.000+00:00

and then the function runs successfully (lots of traces from our function after 4:05:09 PM).

But there is no trace of the first run (I would expect a log entry with DequeueCount: 1 and the same InsertedOn), nor any reason why it failed.

Is it possible that the host is just killed after that "CPU threshold exceeded" warning?

Why is the second try starting more than 10 minutes after queue insertion? My visibilityTimeout is 5 seconds.
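
To put a number on that delay, here is a small sketch that subtracts the InsertedOn value from the timestamp of the DequeueCount: 2 log line. It assumes the log timestamps are local time at UTC+2, which is inferred from the message being written at 3:54:56 PM while InsertedOn reads 13:54:56 UTC; the variable names are illustrative only.

```python
from datetime import datetime, timedelta, timezone

# InsertedOn from the trigger details (already in UTC).
inserted_on = datetime.fromisoformat("2024-09-09T13:54:56.000+00:00")

# Assumption: log timestamps are local time at UTC+2 (see lead-in).
log_tz = timezone(timedelta(hours=2))

# 4:05:09.2535375 PM, truncated to microseconds (datetime's resolution).
second_try = datetime(2024, 9, 9, 16, 5, 9, 253537, tzinfo=log_tz)

# visibilityTimeout from host.json: "00:00:05".
visibility_timeout = timedelta(seconds=5)

delay = second_try - inserted_on
print(delay)                       # prints 0:10:13.253537
print(delay > visibility_timeout)  # prints True
```

Under that assumption the second attempt started roughly 10 minutes and 13 seconds after insertion, far beyond the configured 5 seconds.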

bhagyshricompany commented Sep 10, 2024

Thanks for reporting this. Please share all repro steps.

acorbos commented Sep 10, 2024

Not sure what you mean by reporting steps.
If you need to identify the function: we are in region WestEurope, InvocationID = 7554d4f0-fadb-4783-abe4-340cf2cfec03.
The timestamp is in the first post.

acorbos commented Sep 23, 2024

We had another failure today that cannot be explained by the existing logs:
Region = Westeurope
AzureFunctions_InvocationId = edd51dfd-db0a-4b0f-8504-e5627f541212

The function started at 10:19:57, and we have logs until 10:20:57, when it mysteriously stopped without finishing.
No threshold warning this time.
No second try, just
"Message has reached MaxDequeueCount of 2. Moving message to queue 'ams-poison'." at 10:40

There is also nothing in the invocations list, not even the first invocation.
Can anybody explain this, please?

@bhagyshricompany

@kshyju, please comment and validate.
